SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2014-01-01
An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent TasksThe first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferentia
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2009-01-01
An All-in-One Resource for Using SAS and R to Carry out Common TasksProvides a path between languages that is easier than reading complete documentationSAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat
Statistical analysis of medical data using SAS
Der, Geoff
2005-01-01
An Introduction to SASDescribing and Summarizing DataBasic InferenceScatterplots Correlation: Simple Regression and SmoothingAnalysis of Variance and CovarianceMultiple RegressionLogistic RegressionThe Generalized Linear ModelGeneralized Additive ModelsNonlinear Regression ModelsThe Analysis of Longitudinal Data IThe Analysis of Longitudinal Data II: Models for Normal Response VariablesThe Analysis of Longitudinal Data III: Non-Normal ResponseSurvival AnalysisAnalysis Multivariate Date: Principal Components and Cluster AnalysisReferences
Applied medical statistics using SAS
Der, Geoff
2012-01-01
""Each chapter in the book is well laid out, contains examples with SAS code, and ends with a concise summary. The chapters in the book contain the right level of information to use SAS to apply different statistical methods. … a good overview of how to apply in SAS 9.3 the many possible statistical analysis methods.""-Caroline Kennedy, Takeda Development Centre Europe Ltd., Statistical Methods for Medical Research, 2015""… a well-organized and thorough exploration of broad coverage in medical statistics. The book is an excellent reference of statistical methods
Peng, Chao-Ying Joanne
2008-01-01
"Peng provides an excellent overview of data analysis using the powerful statistical software package SAS. This book is quite appropriate as a self-placed tutorial for researchers, as well as a textbook or supplemental workbook for data analysis courses such as statistics or research methods. Peng provides detailed coverage of SAS capabilities using step-by-step procedures and includes numerous comprehensive graphics and figures, as well as SAS printouts. Readers do not need a background in computer science or programming. Includes numerous examples in education, health sciences, and business.
Conducting Meta-Analysis Using SAS
Arthur, Winfried; Huffcutt, Allen I; Arthur, Winfred
2001-01-01
Conducting Meta-Analysis Using SAS reviews the meta-analysis statistical procedure and shows the reader how to conduct one using SAS. It presents and illustrates the use of the PROC MEANS procedure in SAS to perform the data computations called for by the two most commonly used meta-analytic procedures, the Hunter & Schmidt and Glassian approaches. This book serves as both an operational guide and user's manual by describing and explaining the meta-analysis procedures and then presenting the appropriate SAS program code for computing the pertinent statistics. The practical, step-by-step instru
Statistical hypothesis testing with SAS and R
Taeger, Dirk
2014-01-01
A comprehensive guide to statistical hypothesis testing with examples in SAS and R When analyzing datasets the following questions often arise:Is there a short hand procedure for a statistical test available in SAS or R?If so, how do I use it?If not, how do I program the test myself? This book answers these questions and provides an overview of the most commonstatistical test problems in a comprehensive way, making it easy to find and performan appropriate statistical test. A general summary of statistical test theory is presented, along with a basicdescription for each test, including the
Design and analysis of experiments with SAS
Lawson, John
2010-01-01
IntroductionStatistics and Data Collection Beginnings of Statistically Planned Experiments Definitions and Preliminaries Purposes of Experimental Design Types of Experimental Designs Planning Experiments Performing the Experiments Use of SAS SoftwareCompletely Randomized Designs with One Factor Introduction Replication and Randomization A Historical Example Linear Model for Completely Randomized Design (CRD) Verifying Assumptions of the Linear Model Analysis Strategies When Assumptions Are Violated Determining the Number of Replicates Comparison of Treatments after the F-TestFactorial Designs
Extending and Enhancing SAS (Static Analysis Suite)
Ho, David
2016-01-01
The Static Analysis Suite (SAS) is an open-source software package used to perform static analysis on C and C++ code, helping to ensure safety, readability and maintainability. In this Summer Student project, SAS was enhanced to improve ease of use and user customisation. A straightforward method of integrating static analysis into a project at compilation time was provided using the automated build tool CMake. The process of adding checkers to the suite was streamlined and simplied by developing an automatic code generator. To make SAS more suitable for continuous integration, a reporting mechanism summarising results was added. This suitability has been demonstrated by inclusion of SAS in the Future Circular Collider Software nightly build system. Scalability of the improved package was demonstrated by using the tool to analyse the ROOT code base.
Essentials of Excel, Excel VBA, SAS and Minitab for statistical and financial analyses
Lee, Cheng-Few; Chang, Jow-Ran; Tai, Tzu
2016-01-01
This introductory textbook for business statistics teaches statistical analysis and research methods via business case studies and financial data using Excel, MINITAB, and SAS. Every chapter in this textbook engages the reader with data of individual stock, stock indices, options, and futures. One studies and uses statistics to learn how to study, analyze, and understand a data set of particular interest. Some of the more popular statistical programs that have been developed to use statistical and computational methods to analyze data sets are SAS, SPSS, and MINITAB. Of those, we look at MINITAB and SAS in this textbook. One of the main reasons to use MINITAB is that it is the easiest to use among the popular statistical programs. We look at SAS because it is the leading statistical package used in industry. We also utilize the much less costly and ubiquitous Microsoft Excel to do statistical analysis, as the benefits of Excel have become widely recognized in the academic world and its analytical capabilities...
Measuring Statistics Anxiety: Cross-Country Validity of the Statistical Anxiety Scale (SAS)
Chiesi, Francesca; Primi, Caterina; Carmona, Jose
2011-01-01
The aim of the research was to test the psychometric properties of the Italian version of the Vigil-Colet et al.'s Statistical Anxiety Scale (SAS), taking into account evidences based on (a) internal structure (factorial structure and cross-country invariance) and (b) relationships to other variables (the statistics anxiety's nomological network).…
CTTITEM: SAS macro and SPSS syntax for classical item analysis.
Lei, Pui-Wa; Wu, Qiong
2007-08-01
This article describes the functions of a SAS macro and an SPSS syntax that produce common statistics for conventional item analysis including Cronbach's alpha, item difficulty index (p-value or item mean), and item discrimination indices (D-index, point biserial and biserial correlations for dichotomous items and item-total correlation for polytomous items). These programs represent an improvement over the existing SAS and SPSS item analysis routines in terms of completeness and user-friendliness. To promote routine evaluations of item qualities in instrument development of any scale, the programs are available at no charge for interested users. The program codes along with a brief user's manual that contains instructions and examples are downloadable from suen.ed.psu.edu/-pwlei/plei.htm.
A handbook of statistical graphics using SAS ODS
Der, Geoff
2014-01-01
An Introduction to Graphics: Good Graphics, Bad Graphics, Catastrophic Graphics and Statistical GraphicsThe Challenger DisasterGraphical DisplaysA Little History and Some Early Graphical DisplaysGraphical DeceptionAn Introduction to ODS GraphicsGenerating ODS GraphsODS DestinationsStatistical Graphics ProceduresODS Graphs from Statistical ProceduresControlling ODS GraphicsControlling Labelling in GraphsODS Graphics EditorGraphs for Displaying the Characteristics of Univariate Data: Horse Racing, Mortality Rates, Forearm Lengths, Survival Times and Geyser EruptionsIntroductionPie Chart, Bar Cha
McDaniel, Stephen
2010-01-01
The fun and easy way to learn to use this leading business intelligence tool Written by an author team who is directly involved with SAS, this easy-to-follow guide is fully updated for the latest release of SAS and covers just what you need to put this popular software to work in your business. SAS allows any business or enterprise to improve data delivery, analysis, reporting, movement across a company, data mining, forecasting, statistical analysis, and more. SAS For Dummies, 2nd Edition gives you the necessary background on what SAS can do for you and explains how to use the Enterprise Gui
Nueva evidencia sobre la Statistical Anxiety Scale (SAS
Directory of Open Access Journals (Sweden)
Amparo Oliver
2014-01-01
Full Text Available Las asignaturas relacionadas con la estadística suelen tener problemas de rendimiento académico. La ansiedad se relaciona de forma negativa con el rendimiento y en particular, la ansiedad estadística puede ser un constructo clave en la mejora de la enseñanza de esta materia y afines. La Statistical Anxiety Scale (Vigil-Colet, Lorenzo-Seva y Condon, 2008 se creó con la pretensión de ser útil para predecir el rendimiento académico en estadística. Se fundamenta en tres dimensiones de ansiedad referidas a tres aspectos específicos: respecto al examen, cuando se pide ayuda en la comprensión de estadística y en el proceso de interpretación de resultados. Esta estructura de tres factores fue hallada en un primer momento por los autores de la escala y en una primera validación corroborada en estudiantes italianos y españoles. El presente estudio pretende añadir nueva evidencia sobre la fiabilidad y validez de la escala, empleando en el estudio de fiabilidad técnicas estadísticas robustas, y ampliando el estudio de la validez respecto a su principal criterio, el rendimiento académico, ya que no puede ser considerado sinónimo de autoeficacia.
A SAS Interface for Bayesian Analysis with WinBUGS
Zhang, Zhiyong; McArdle, John J.; Wang, Lijuan; Hamagami, Fumiaki
2008-01-01
Bayesian methods are becoming very popular despite some practical difficulties in implementation. To assist in the practical application of Bayesian methods, we show how to implement Bayesian analysis with WinBUGS as part of a standard set of SAS routines. This implementation procedure is first illustrated by fitting a multiple regression model…
A new SAS program for behavioral analysis of Electrical Penetration Graph (EPG) data
A new program is introduced that uses SAS software to duplicate output of descriptive statistics from the Sarria Excel workbook for EPG waveform analysis. Not only are publishable means and standard errors or deviations output, the user also is guided through four relatively simple sub-programs for ...
SAS essentials mastering SAS for data analytics
Elliott, Alan C
2015-01-01
A step-by-step introduction to using SAS® statistical software as a foundational approach to data analysis and interpretation Presenting a straightforward introduction from the ground up, SAS® Essentials: Mastering SAS for Data Analytics, Second Edition illustrates SAS using hands-on learning techniques and numerous real-world examples. Keeping different experience levels in mind, the highly-qualified author team has developed the book over 20 years of teaching introductory SAS courses. Divided into two sections, the first part of the book provides an introduction to data manipulation, st
Categorical Data Analysis With Sas and Spss Applications
Lawal, H Bayo
2003-01-01
This book covers the fundamental aspects of categorical data analysis with an emphasis on how to implement the models used in the book using SAS and SPSS. This is accomplished through the frequent use of examples, with relevant codes and instructions, that are closely related to the problems in the text. Concepts are explained in detail so that students can reproduce similar results on their own. Beginning with chapter two, exercises at the end of each chapter further strengthen students' understanding of the concepts by requiring them to apply some of the ideas expressed in the text in a more
%ERA: A SAS Macro for Extended Redundancy Analysis
Directory of Open Access Journals (Sweden)
Pietro Giorgio Lovaglio
2016-10-01
Full Text Available A new approach to structural equation modeling based on so-called extended redundancy analysis has been recently proposed in the literature, enhanced with the added characteristic of generalizing redundancy analysis and reduced-rank regression models for more than two blocks. In this approach, the relationships between the observed exogenous variables and the observed endogenous variables are moderated by the presence of unobservable composites that were estimated as linear combinations of exogenous variables, permitting a great flexibility to specify and fit a variety of structural relationships. In this paper, we propose the SAS macro %ERA to specify and fit structural relationships in the extended redundancy analysis (ERA framework. Two examples (simulation and real data are provided in order to reproduce results appearing in the original article where ERA was proposed.
McDaniel, Stephen
2010-01-01
The fun and easy way to learn to use this leading business intelligence tool Written by an author team who is directly involved with SAS, this easy-to-follow guide is fully updated for the latest release of SAS and covers just what you need to put this popular software to work in your business. SAS allows any business or enterprise to improve data delivery, analysis, reporting, movement across a company, data mining, forecasting, statistical analysis, and more. SAS For Dummies, 2nd Edition gives you the necessary background on what SAS can do for you and explains how to use the Enterprise Guide. SAS provides statistical and data analysis tools to help you deal with all kinds of data: operational, financial, performance, and more Places special emphasis on Enterprise Guide and other analytical tools, covering all commonly used features Covers all commonly used features and shows you the practical applications you can put to work in your business Explores how to get various types of data into the software and...
Institute of Scientific and Technical Information of China (English)
段锋; 魏振满; 陈大为; 朱珍真; 屠舒; 毕京峰; 李文淑; 孙斌
2013-01-01
Objective To realize automatically statements of measurement data in phase I clinical trials by SAS software. Methods Based on the SAS software, we used its functions, such as spliting table, statistical analysis, choosing results, transposing table, mergering tables horizontally or lengthways, to built statements of measurement data step by step. Results The statistical analysis results were imported to EXELE form automaticlly as report requirement. Conclusion The process is accurate and reliable, and saves time and effort. It is suitable for the non-professional statistical researchers to realize automatically statements of measurement data.%目的 借助SAS软件实现Ⅰ期临床试验计量资料统计分析的自动报表.方法 基于SAS软件,应用其拆分表格、统计分析、结果选择、表格转置、横向合并表格、纵向合并表格等程序逐步导出报表.结果 借助EXELE表格成功实现药物Ⅰ期临床试验计量资料统计分析结果的自动报表.结论 该过程省时省力,准确可靠,为非统计专业医学研究人员提供了一种简便、可靠的自动报表方法.
PROC LCA: A SAS Procedure for Latent Class Analysis
Lanza, Stephanie T.; Collins, Linda M.; Lemmon, David R.; Schafer, Joseph L.
2007-01-01
Latent class analysis (LCA) is a statistical method used to identify a set of discrete, mutually exclusive latent classes of individuals based on their responses to a set of observed categorical variables. In multiple-group LCA, both the measurement part and structural part of the model can vary across groups, and measurement invariance across…
PROC LCA: A SAS Procedure for Latent Class Analysis
Lanza, Stephanie T.; Collins, Linda M.; Lemmon, David R.; Schafer, Joseph L.
2007-01-01
Latent class analysis (LCA) is a statistical method used to identify a set of discrete, mutually exclusive latent classes of individuals based on their responses to a set of observed categorical variables. In multiple-group LCA, both the measurement part and structural part of the model can vary across groups, and measurement invariance across…
Design of experiment and data analysis by JMP (SAS institute) in analytical method validation.
Ye, C; Liu, J; Ren, F; Okafo, N
2000-08-15
Validation of an analytical method through a series of experiments demonstrates that the method is suitable for its intended purpose. Due to multi-parameters to be examined and a large number of experiments involved in validation, it is important to design the experiments scientifically so that appropriate validation parameters can be examined simultaneously to provide a sound, overall knowledge of the capabilities of the analytical method. A statistical method through design of experiment (DOE) was applied to the validation of a HPLC analytical method for the quantitation of a small molecule in drug product in terms of intermediate precision and robustness study. The data were analyzed in JMP (SAS institute) software using analyses of variance method. Confidence intervals for outcomes and control limits for individual parameters were determined. It was demonstrated that the experimental design and statistical analysis used in this study provided an efficient and systematic approach to evaluating intermediate precision and robustness for a HPLC analytical method for small molecule quantitation.
The SAS4A/SASSYS-1 Safety Analysis Code System, Version 5
Energy Technology Data Exchange (ETDEWEB)
Fanning, T. H. [Argonne National Lab. (ANL), Argonne, IL (United States); Brunett, A. J. [Argonne National Lab. (ANL), Argonne, IL (United States); Sumner, T. [Argonne National Lab. (ANL), Argonne, IL (United States)
2017-01-01
The SAS4A/SASSYS-1 computer code is developed by Argonne National Laboratory for thermal, hydraulic, and neutronic analysis of power and flow transients in liquidmetal- cooled nuclear reactors (LMRs). SAS4A was developed to analyze severe core disruption accidents with coolant boiling and fuel melting and relocation, initiated by a very low probability coincidence of an accident precursor and failure of one or more safety systems. SASSYS-1, originally developed to address loss-of-decay-heat-removal accidents, has evolved into a tool for margin assessment in design basis accident (DBA) analysis and for consequence assessment in beyond-design-basis accident (BDBA) analysis. SAS4A contains detailed, mechanistic models of transient thermal, hydraulic, neutronic, and mechanical phenomena to describe the response of the reactor core, its coolant, fuel elements, and structural members to accident conditions. The core channel models in SAS4A provide the capability to analyze the initial phase of core disruptive accidents, through coolant heat-up and boiling, fuel element failure, and fuel melting and relocation. Originally developed to analyze oxide fuel clad with stainless steel, the models in SAS4A have been extended and specialized to metallic fuel with advanced alloy cladding. SASSYS-1 provides the capability to perform a detailed thermal/hydraulic simulation of the primary and secondary sodium coolant circuits and the balance-ofplant steam/water circuit. These sodium and steam circuit models include component models for heat exchangers, pumps, valves, turbines, and condensers, and thermal/hydraulic models of pipes and plena. SASSYS-1 also contains a plant protection and control system modeling capability, which provides digital representations of reactor, pump, and valve controllers and their response to input signal changes.
O'Connor, B P
2000-08-01
Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
Coupling the System Analysis Module with SAS4A/SASSYS-1
Energy Technology Data Exchange (ETDEWEB)
Fanning, T. H. [Argonne National Lab. (ANL), Argonne, IL (United States); Hu, R. [Argonne National Lab. (ANL), Argonne, IL (United States)
2016-09-30
SAS4A/SASSYS-1 is a simulation tool used to perform deterministic analysis of anticipated events as well as design basis and beyond design basis accidents for advanced reactors, with an emphasis on sodium fast reactors. SAS4A/SASSYS-1 has been under development and in active use for nearly forty-five years, and is currently maintained by the U.S. Department of Energy under the Office of Advanced Reactor Technology. Although SAS4A/SASSYS-1 contains a very capable primary and intermediate system modeling component, PRIMAR-4, it also has some shortcomings: outdated data management and code structure makes extension of the PRIMAR-4 module somewhat difficult. The user input format for PRIMAR-4 also limits the number of volumes and segments that can be used to describe a given system. The System Analysis Module (SAM) is a fairly new code development effort being carried out under the U.S. DOE Nuclear Energy Advanced Modeling and Simulation (NEAMS) program. SAM is being developed with advanced physical models, numerical methods, and software engineering practices; however, it is currently somewhat limited in the system components and phenomena that can be represented. For example, component models for electromagnetic pumps and multi-layer stratified volumes have not yet been developed. Nor is there support for a balance of plant model. Similarly, system-level phenomena such as control-rod driveline expansion and vessel elongation are not represented. This report documents fiscal year 2016 work that was carried out to couple the transient safety analysis capabilities of SAS4A/SASSYS-1 with the system modeling capabilities of SAM under the joint support of the ART and NEAMS programs. The coupling effort was successful and is demonstrated by evaluating an unprotected loss of flow transient for the Advanced Burner Test Reactor (ABTR) design. There are differences between the stand-alone SAS4A/SASSYS-1 simulations and the coupled SAS/SAM simulations, but these are mainly
Analysis of Variance: What Is Your Statistical Software Actually Doing?
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Valeri, Linda; Vanderweele, Tyler J
2013-06-01
Mediation analysis is a useful and widely employed approach to studies in the field of psychology and in the social and biomedical sciences. The contributions of this article are several-fold. First we seek to bring the developments in mediation analysis for nonlinear models within the counterfactual framework to the psychology audience in an accessible format and compare the sorts of inferences about mediation that are possible in the presence of exposure-mediator interaction when using a counterfactual versus the standard statistical approach. Second, the work by VanderWeele and Vansteelandt (2009, 2010) is extended here to allow for dichotomous mediators and count outcomes. Third, we provide SAS and SPSS macros to implement all of these mediation analysis techniques automatically, and we compare the types of inferences about mediation that are allowed by a variety of software macros.
O'Connor, Brian P
2004-02-01
Levels-of-analysis issues arise whenever individual-level data are collected from more than one person from the same dyad, family, classroom, work group, or other interaction unit. Interdependence in data from individuals in the same interaction units also violates the independence-of-observations assumption that underlies commonly used statistical tests. This article describes the data analysis challenges that are presented by these issues and presents SPSS and SAS programs for conducting appropriate analyses. The programs conduct the within-and-between-analyses described by Dansereau, Alutto, and Yammarino (1984) and the dyad-level analyses described by Gonzalez and Griffin (1999) and Griffin and Gonzalez (1995). Contrasts with general multilevel modeling procedures are then discussed.
Mathematical and statistical analysis
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
On Teaching Problems about Medical Statistics SAS Experimental Courses%医学统计学SAS实验课程相关教学问题探讨
Institute of Scientific and Technical Information of China (English)
汤在祥
2012-01-01
SAS软件实验课是医学统计学的重要课程，上好SAS软件实验课对于学生掌握和应用统计学方法具有重要实践意义。针对医学统计学SAS软件实验课特点，以及当前教学过程中存在的问题，对SAS软件实验课了提出了几点探讨，为提高医学院校SAS软件实验课的教学效果提供参考。%SAS software experimental course is an important course about medical statistics, which is of great significance for students to master and apply the statistical method. According to characteristics of medical statistics SAS software, as well as problems in the educational process, the paper discusses the experimental course and hopes to improve the teach-ing effect of the SAS software.
Design and analysis of experiments classical and regression approaches with SAS
Onyiah, Leonard C
2008-01-01
Introductory Statistical Inference and Regression Analysis Elementary Statistical Inference Regression Analysis Experiments, the Completely Randomized Design (CRD)-Classical and Regression Approaches Experiments Experiments to Compare Treatments Some Basic Ideas Requirements of a Good Experiment One-Way Experimental Layout or the CRD: Design and Analysis Analysis of Experimental Data (Fixed Effects Model) Expected Values for the Sums of Squares The Analysis of Variance (ANOVA) Table Follow-Up Analysis to Check fo
Deconstructing Statistical Analysis
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article or making a new product or service legitimate needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis, and research the fewer the number of learned readers can understand it. This adds a…
Sato, Akira; Itoh, Tasuku; Imamichi, Shoji; Kikuhara, Sota; Fujimori, Hiroaki; Hirai, Takahisa; Saito, Soichiro; Sakurai, Yoshinori; Tanaka, Hiroki; Nakamura, Hiroyuki; Suzuki, Minoru; Murakami, Yasufumi; Baiseitov, Diaz; Berikkhanova, Kulzhan; Zhumadilov, Zhaxybay; Imahori, Yoshio; Itami, Jun; Ono, Koji; Masunaga, Shinichiro; Masutani, Mitsuko
2015-12-01
To understand the mechanism of cell death induced by boron neutron capture reaction (BNCR), we performed proteome analyses of human squamous tumor SAS cells after BNCR. Cells were irradiated with thermal neutron beam at KUR after incubation under boronophenylalanine (BPA)(+) and BPA(-) conditions. BNCR mainly induced typical apoptosis in SAS cells 24h post-irradiation. Proteomic analysis in SAS cells suggested that proteins functioning in endoplasmic reticulum, DNA repair, and RNA processing showed dynamic changes at early phase after BNCR and could be involved in the regulation of cellular response to BNCR. We found that the BNCR induces fragments of endoplasmic reticulum-localized lymphoid-restricted protein (LRMP). The fragmentation of LRMP was also observed in the rat tumor graft model 20 hours after BNCT treatment carried out at the National Nuclear Center of the Republic of Kazakhstan. These data suggest that dynamic changes of LRMP could be involved during cellular response to BNCR.
NParCov3: A SAS/IML Macro for Nonparametric Randomization-Based Analysis of Covariance
Directory of Open Access Journals (Sweden)
Richard C. Zink
2012-07-01
Full Text Available Analysis of covariance serves two important purposes in a randomized clinical trial. First, there is a reduction of variance for the treatment effect which provides more powerful statistical tests and more precise confidence intervals. Second, it provides estimates of the treatment effect which are adjusted for random imbalances of covariates between the treatment groups. The nonparametric analysis of covariance method of Koch, Tangen, Jung, and Amara (1998 defines a very general methodology using weighted least-squares to generate covariate-adjusted treatment effects with minimal assumptions. This methodology is general in its applicability to a variety of outcomes, whether continuous, binary, ordinal, incidence density or time-to-event. Further, its use has been illustrated in many clinical trial settings, such as multi-center, dose-response and non-inferiority trials.NParCov3 is a SAS/IML macro written to conduct the nonparametric randomization-based covariance analyses of Koch et al. (1998. The software can analyze a variety of outcomes and can account for stratification. Data from multiple clinical trials will be used for illustration.
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Associative Analysis in Statistics
Directory of Open Access Journals (Sweden)
Mihaela Muntean
2015-03-01
Full Text Available In the last years, the interest in technologies such as in-memory analytics and associative search has increased. This paper explores how you can use in-memory analytics and an associative model in statistics. The word “associative” puts the emphasis on understanding how datasets relate to one another. The paper presents the main characteristics of “associative” data model. Also, the paper presents how to design an associative model for labor market indicators analysis. The source is the EU Labor Force Survey. Also, this paper presents how to make associative analysis.
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
Variable. This procedure was developed in order to be able to export objects as ESRI shape data with the 90-percentile of the Hue of each object's pixels as an item in the shape attribute table. This procedure uses a sub-level single pixel chessboard segmentation, loops for each of the objects......This RS code is to do Object-by-Object analysis of each Object's sub-objects, e.g. statistical analysis of an object's individual image data pixels. Statistics, such as percentiles (so-called "quartiles") are derived by the process, but the return of that can only be a Scene Variable, not an Object...... of a specific class in turn, and uses as pair of PPO stages to derive the statistics and then assign them to the objects' Object Variables. It may be that this could all be done in some other, simply way, but several other ways that were tried did not succeed. The procedure ouptut has been tested against...
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Material analysis on engineering statistics
Energy Technology Data Exchange (ETDEWEB)
Lee, Seung Hun
2008-03-15
This book is about material analysis on engineering statistics using mini tab, which includes technical statistics and seven tools of QC, probability distribution, presumption and checking, regression analysis, tim series analysis, control chart, process capacity analysis, measurement system analysis, sampling check, experiment planning, response surface analysis, compound experiment, Taguchi method, and non parametric statistics. It is good for university and company to use because it deals with theory first and analysis using mini tab on 6 sigma BB and MBB.
DEFF Research Database (Denmark)
Milhøj, Anders
2016-01-01
I sommeren 2015 blev Analytical Produts opgraderet version 14.1, men stadig til Base SAS, version 9.4 sendt på markedet. Denne opdatering til nu Analytical Products 14.1 indeholder opdateringer af de analytiske programpakker indenfor statistik, økonometri, operationsanalyse etc. Disse opdateringer...... er nu løsrevet fra samtidige opdateringer af det samlede SAS-program. Desuden er den gratis SAS-applikation SAS-U opdateret; især er det bemærkelsesværdigt at der via SAS-U nu også stilles procedurer til rådighed for avancerede ikke-modelbaserede tidsrækkeanalyser....
DEFF Research Database (Denmark)
Andersen, Aage T.; Feilberg, Michael; Milhøj, Anders
IML i SAS anvendes til matrixberegninger som en integreret del af SAS pakken. Den anvendes typisk til matrixberegninger i tilknytning til andre SAS applikationer, fx kan matricer let dannes ud fra datasæt i SAS. Desuden indeholder IML en række videregående matematiske og statistiske subroutiner......, der ikke findes andre steder i SAS.I bogen gennemgås matrixregning svarende til et indledende matematikkursus ved en højere læreanstalt. Den teoretiske gennemgang understøttes og demonstreres konsekvent med IML programmering, så fokus er på praktiske anvendelser af matricer....
Energy Technology Data Exchange (ETDEWEB)
Jankovsky, Zachary Kyle [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Denman, Matthew R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-05-01
It is difficult to assess the consequences of a transient in a sodium-cooled fast reactor (SFR) using traditional probabilistic risk assessment (PRA) methods, as numerous safety-related sys- tems have passive characteristics. Often there is significant dependence on the value of con- tinuous stochastic parameters rather than binary success/failure determinations. One form of dynamic PRA uses a system simulator to represent the progression of a transient, tracking events through time in a discrete dynamic event tree (DDET). In order to function in a DDET environment, a simulator must have characteristics that make it amenable to changing physical parameters midway through the analysis. The SAS4A SFR system analysis code did not have these characteristics as received. This report describes the code modifications made to allow dynamic operation as well as the linking to a Sandia DDET driver code. A test case is briefly described to demonstrate the utility of the changes.
The Spectral Analysis of X-Ray Binaries from the XMM-Newton Space Craft Data using SAS Software
Baki, P.; Mito, C. O.
2009-10-01
A spectral data analysis on a luminous object of sky-coordinates 12h52m24.28s-29d115'02.3'12.6arcsec using Science Analysis Software (SAS) is presented. The analysis, based on data acquired by the Reflection Grating Spectrometer (RGS) camera aboard the XMM-Newton Space satellite, shows that the primary constituents of the X-ray source are Fe (Iron) and O (oxygen). This suggests that the source may be a magnetized plasma in a binary system and as this magnetic field accelerates the cooling of a star, one may speculate that this may be a compact star in its last stages of a thermonuclear fusion process. Nous présentons une analyse du spectre d'une source a rayons X située -- en coordonnées sidérales - à 12h52m24.28s - 29d115'02.312.6 arcsec. Science Analysis Software (SAS) est le programme informatique utilisé pour l'analyse des données. Cette analyse est basée sur les données provenant du spectromètre à haute résolution (RGS) à bord du satellite spatiale XMM-Newton. Nous montrons que ladite source est principalement constituée de Fer (Fe) et d'oxygene (O). Ce résultat suggère que la source pourrait être un plasma magnétisé au sein d'un système binaire. Et du fait que ce champ magnétique accélère le refroidissement de l'étoile, nous supposons que cette étoile pourrait ètre un objet compact en phase terminale d'un processus de fusion thermonucléaire.
Vicente-Muñoz, Sara; Romero, Paco; Magraner-Pardo, Lorena; Martinez-Jimenez, Celia P; Tordera, Vicente; Pamblanco, Mercè
2014-01-01
Histone acetylation affects several aspects of gene regulation, from chromatin remodelling to gene expression, by modulating the interplay between chromatin and key transcriptional regulators. The exact molecular mechanism underlying acetylation patterns and crosstalk with other epigenetic modifications requires further investigation. In budding yeast, these epigenetic markers are produced partly by histone acetyltransferase enzymes, which act as multi-protein complexes. The Sas3-dependent NuA3 complex has received less attention than other histone acetyltransferases (HAT), such as Gcn5-dependent complexes. Here, we report our analysis of Sas3p-interacting proteins using tandem affinity purification (TAP), coupled with mass spectrometry. This analysis revealed Pdp3p, a recently described component of NuA3, to be one of the most abundant Sas3p-interacting proteins. The PDP3 gene, was TAP-tagged and protein complex purification confirmed that Pdp3p co-purified with the NuA3 protein complex, histones, and several transcription-related and chromatin remodelling proteins. Our results also revealed that the protein complexes associated with Sas3p presented HAT activity even in the absence of Gcn5p and vice versa. We also provide evidence that Sas3p cannot substitute Gcn5p in acetylation of lysine 9 in histone H3 in vivo. Genome-wide occupancy of Sas3p using ChIP-on-chip tiled microarrays showed that Sas3p was located preferentially within the 5'-half of the coding regions of target genes, indicating its probable involvement in the transcriptional elongation process. Hence, this work further characterises the function and regulation of the NuA3 complex by identifying novel post-translational modifications in Pdp3p, additional Pdp3p-co-purifying chromatin regulatory proteins involved in chromatin-modifying complex dynamics and gene regulation, and a subset of genes whose transcriptional elongation is controlled by this complex.
Statistical Analysis and validation
Hoefsloot, H.C.J.; Horvatovich, P.; Bischoff, R.
2013-01-01
In this chapter guidelines are given for the selection of a few biomarker candidates from a large number of compounds with a relative low number of samples. The main concepts concerning the statistical validation of the search for biomarkers are discussed. These complicated methods and concepts are
Energy Technology Data Exchange (ETDEWEB)
Crandley, J F; Greenwalt, R J; Magnoli, D E; Randazzo, A S
2002-02-05
This study consists of work done in support of the U.S. delegation to the NATO SAS-023 Antipersonnel Landmine Study Group, supplemented by additional work done for the U.S. Office of the Secretary of Defense Antipersonnel Landmine Alternative Concept Exploration Program (Track III). It explores the battlefield utility of current antipersonnel landmines (APL) in both pure and mixed APL/antitank minefields and evaluates the value of military suggested non-materiel alternatives. The historical record is full of examples where the presence (or absence) of antipersonnel landmines made a critical difference in battle. The current generation of military thinkers and writers lack any significant combat experience employing either mixed or antipersonnel minefields, which leaves a critical gap in available expert advice for policy and decision-makers. Because of this lack of experienced-based professional military knowledge, Lawrence Livermore National Laboratory analyzed the employment of antipersonnel landmines in tactical mixed minefields and in protective antipersonnel minefields. The scientific method was employed where hypotheses were generated from the tactics and doctrine of the antipersonnel landmine era and tested in a simulation laboratory. A high-resolution, U.S. Joint Forces Command combat simulation model (the Joint Conflict and Tactical Simulation--JCATS) was used as the laboratory instrument. A realistic European scenario was obtained from a multi-national USAREUR exercise and was approved by the SAS-023 panel members. Additional scenarios were provided by U.S. CINC conferences and were based on Southwest Asia and Northeast Asia. Weapons data was obtained from the U.S. family of Joint Munitions Effectiveness Manuals. The U.S. Army Materiel Systems Analysis Agency conducted a limited verification and validation assessment of JCATS for purposes of this study.
Institute of Scientific and Technical Information of China (English)
陈强; 田杰; 黄海宁; 张春华
2013-01-01
Synthetic aperture sonar (SAS) images can effectively describe the topography,geomorphology and substrate of seabed;however,one single SAS image usually corresponds to a larger area ;so it is necessary to segment the SAS image into different regions according to certain property,which benefits further analyzing the image,and detecting and identifying the target.Study found that SAS images of different substrates have different statistical and texture features;in this paper,the statistical properties,such as the mean,standard deviation and kurtosis of the grey level histogram,as well as the texture features,such as the energy,correlation,contrast and entropy of the grey level co-occurrence matrix are selected and used to describe different regions of the SAS image.These selected features are used as the support vector machine (SVM) training characteristics and the classifier is obtained for the SAS image segmentation.The experiment results show that the proposed SVM algorithm is a good segmentation method for the region segmentation of SAS image.%合成孔径声呐图像可以有效反映海底的地形、地貌和底质等情况,但是单幅SAS图像通常对应一片较大的区域,需要按照某种性质将不同性质的区域分割开来,以有利于下一步的图像分析以及目标检测和识别.研究发现,不同底质区域的SAS图像具有不同的统计和纹理特征,选取灰度直方图的均值、标准差、峰度等统计特性和灰度共生矩阵的能量、相关性、对比度、熵值等纹理特性用以描述SAS图像的不同区域.将选取的特征作为SVM的训练特征,进而得到SVM分类器,用于SAS图像分割.实验结果表明,SVM算法可以很好地对SAS图像进行区域分割.
Yasarapu, S.
This chapter focuses on the different types of solid state drives. The chapter details the differences between consumer and enterprise solid state drives and also details the differences between SAS and SATA solid state drive and what lies ahead for SATA and SAS protocols for SSDs.
Energy Technology Data Exchange (ETDEWEB)
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques.
Validity of Simpson-Angus Scale (SAS in a naturalistic schizophrenia population
Directory of Open Access Journals (Sweden)
Tuisku Katinka
2005-03-01
Full Text Available Abstract Background Simpson-Angus Scale (SAS is an established instrument for neuroleptic-induced parkinsonism (NIP, but its statistical properties have been studied insufficiently. Some shortcomings concerning its content have been suggested as well. According to a recent report, the widely used SAS mean score cut-off value 0.3 of for NIP detection may be too low. Our aim was to evaluate SAS against DSM-IV diagnostic criteria for NIP and objective motor assessment (actometry. Methods Ninety-nine chronic institutionalised schizophrenia patients were evaluated during the same interview by standardised actometric recording and SAS. The diagnosis of NIP was based on DSM-IV criteria. Internal consistency measured by Cronbach's α, convergence to actometry and the capacity for NIP case detection were assessed. Results Cronbach's α for the scale was 0.79. SAS discriminated between DSM-IV NIP and non-NIP patients. The actometric findings did not correlate with SAS. ROC-analysis yielded a good case detection power for SAS mean score. The optimal threshold value of SAS mean score was between 0.65 and 0.95, i.e. clearly higher than previously suggested threshold value. Conclusion We conclude that SAS seems a reliable and valid instrument. The previously commonly used cut-off mean score of 0.3 has been too low resulting in low specificity, and we suggest a new cut-off value of 0.65, whereby specificity could be doubled without loosing sensitivity.
Institute of Scientific and Technical Information of China (English)
裘炯良; 郑剑宁; 林军; 马立新
2012-01-01
目的 探索基于SAS的统计地图绘制方法,及其在出入境检验检疫系统卫生检疫外来媒介生物监测工作中的应用.方法 应用统计地图法并采用国际权威的统计分析标准软件SAS分析平台进行编程,实现外来媒介生物截获信息的分类统计、数据分析,以及在世界地图上的实时、动态、全自动标注.结果 计算机自动绘制世界地图,根据20个国家首都的经纬度数据在图中用星号(★)标注出相应的国家位置,并用蓝、黄、橙、红4种颜色分别代表截获的媒介生物数量等级.来自巴拿马、巴西、泰国等国家的外来媒介生物量均以蓝星标注,表明截获量相对较少；来自韩国、南非等国家的外来媒介生物量均以红星标注,提示来自这些国家的外来医学媒介生物截获量相对较多.结论 统计地图可以直观、简便、动态地呈现外来媒介生物截获量在不同国家的分布情况,有助于检验检疫部门及时、准确、合理地分配力量,重点加强媒介截获量较大的来源国卫生检疫监管工作.%Objective To explore the drawing method of a statistical map based on statistics analysis system (SAS) and its application in the surveillance of exotic medical vectors and risk analysis in entry - exit inspection and quarantine system. Methods The statistical map method based on SAS program was applied for the classification statistics and data analysis of exotic medical vectors, with real-time, dynamic and automatic marking on the world map actualized. Results A world map was automatically drawn by computer and the related data and messages were labelled real-timely, dynamically and automatically on the world map in different colours. The locations of the twenty countries covered in this study were marked as asterisk in the map based on the longitude and latitude degree values of the capitals in those countries. Blue, yellow, orange and red colours were used to represent the
DEFF Research Database (Denmark)
Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi
2015-01-01
This is the analysis plan for the multicentre randomised control study looking at the effect of training and exercises in chronic neck pain patients that is being conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected....
Janina Mrozek, Dorota; van der Veen, Carina; Hofmann, Magdalena E. G.; Chen, Huilin; Kivi, Rigel; Heikkinen, Pauli; Röckmann, Thomas
2016-11-01
We present the set-up and a scientific application of the Stratospheric Air Sub-sampler (SAS), a device to collect and to store the vertical profile of air collected with an AirCore (Karion et al., 2010) in numerous sub-samples for later analysis in the laboratory. The SAS described here is a 20 m long 1/4 inch stainless steel tubing that is separated by eleven valves to divide the tubing into 10 identical segments, but it can be easily adapted to collect smaller or larger samples. In the collection phase the SAS is directly connected to the outlet of an optical analyzer that measures the mole fractions of CO2, CH4 and CO from an AirCore sampler. The stratospheric part (or if desired any part of the AirCore air) is then directed through the SAS. When the SAS is filled with the selected air, the valves are closed and the vertical profile is maintained in the different segments of the SAS. The segments can later be analysed to retrieve vertical profiles of other trace gas signatures that require slower instrumentation. As an application, we describe the coupling of the SAS to an analytical system to determine the 17O excess of CO2, which is a tracer for photochemical processing of stratospheric air. For this purpose the analytical system described by Mrozek et al. (2015) was adapted for analysis of air directly from the SAS. The performance of the coupled system is demonstrated for a set of air samples from an AirCore flight in November 2014 near Sodankylä, Finland. The standard error for a 25 mL air sample at stratospheric CO2 mole fraction is 0.56 ‰ (1σ) for δ17O and 0.03 ‰ (1σ) for both δ18O and δ13C. Measured Δ17O(CO2) values show a clear correlation with N2O in agreement with already published data.
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Statistical Analysis with Missing Data
Little, Roderick J A
2002-01-01
Praise for the First Edition of Statistical Analysis with Missing Data ""An important contribution to the applied statistics literature.... I give the book high marks for unifying and making accessible much of the past and current work in this important area."" Ã¢ÂÂWilliam E. Strawderman, Rutgers University ""This book...provide[s] interesting real-life examples, stimulating end-of-chapter exercises, and up-to-date references. It should be on every applied statisticianÃ¢ÂÂs bookshelf."" Ã¢ÂÂThe Statistician ""The book should be studied in the statistical methods d
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
DEFF Research Database (Denmark)
Milhøj, Anders
2014-01-01
I juli 2013 blev version 9.4 af SAS sendt på markedet. Denne release indeholdt ikke mange nye faciliteter indenfor statistik, økonometri, operationsanalyse etc, men alligevel er der vigtige aspekter at forholde sig til. Ved årsskiftet 2013/2014 er der siden kommet en opdatering af de analytiske...... programpakker for statistik, økonometri, operationsanalyse etc uden en samtidig opdatering af det samlede SAS-program. Denne opdatering kaldes Analytical Products 13.1 og er en væsentlig opdatering fra version 12, der i første version kom i august 2012....
DEFF Research Database (Denmark)
Milhøj, Anders
2015-01-01
I juli 2013 blev version 9.4 af SAS sendt på markedet. Denne release indeholdt ikke i sig selv mange nye faciliteter indenfor statistik, økonometri, operations-analyse etc, men alligevel er der vigtige aspekter at forholde sig til. Ved årsskiftet 2013/2014 kom opdateringen Analytical Products 13.......1 af de analytiske programpakker for statistik, økonometri, operationsanalyse etc uden en samtidig opdatering af det samlede SAS-program. Denne opdatering er siden fulgt op med en yderligere opdatering 13.2 i august 2014....
Common pitfalls in statistical analysis: Clinical versus statistical significance
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In clinical research, study results, which are statistically significant are often interpreted as being clinically important. While statistical significance indicates the reliability of the study results, clinical significance reflects its impact on clinical practice. The third article in this series exploring pitfalls in statistical analysis clarifies the importance of differentiating between statistical significance and clinical significance. PMID:26229754
Statistical methods for bioimpedance analysis
Directory of Open Access Journals (Sweden)
Christian Tronstad
2014-04-01
Full Text Available This paper gives a basic overview of relevant statistical methods for the analysis of bioimpedance measurements, with an aim to answer questions such as: How do I begin with planning an experiment? How many measurements do I need to take? How do I deal with large amounts of frequency sweep data? Which statistical test should I use, and how do I validate my results? Beginning with the hypothesis and the research design, the methodological framework for making inferences based on measurements and statistical analysis is explained. This is followed by a brief discussion on correlated measurements and data reduction before an overview is given of statistical methods for comparison of groups, factor analysis, association, regression and prediction, explained in the context of bioimpedance research. The last chapter is dedicated to the validation of a new method by different measures of performance. A flowchart is presented for selection of statistical method, and a table is given for an overview of the most important terms of performance when evaluating new measurement technology.
Meta-Analysis of Ordinal Data and Its Solution by SAS%有序数据的Meta分析方法及SAS实现
Institute of Scientific and Technical Information of China (English)
张天嵩; 熊茜
2012-01-01
目的 介绍有序数据的Meta分析方法及其在SAS软件的实现.方法 以实例说明,采用两步法:第一步,基于累积比数模型,采用SAS中的GENMOD过程计算产生每个研究的效应量及其标准误；第二步,采用固定效应模型和随机效应模型以SAS中的MIXED过程进行经典Meta分析.结果 固定效应和随机效应模型Meta分析结果显示,汇总比值比及95％可信区间均为2.853 9(2.106 4,3.864 8).结论 对于有序数据,可以应用SAS中的GENMOD过程和MIXED过程进行Meta分析.%Objective To introduce methodology for undertaking a meta-analysis on ordinal data and its realization in SAS. Methods It was illustrated on an example data by using two-step method. In the first step, it was calculated treatment effect and its standard error for an individual study by using PROC GENMOD in SAS based on cumulative odds models; in the second step, a classical meta-analysis was proposed for fixed and random effect model by using PROC MIXED in SAS. Results Results from the traditional meta-analysis of summary odds ratio and its 95% CI were same 2.853 9(2.106 4,3.864 8) for fixed and random effect model. Conclusions The procedure SAS PROC GENMOD and MIXED were utilized to undertake a meta-analysis on ordinal data.
Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus cal...
Statistical Analysis for Performance Comparison
Directory of Open Access Journals (Sweden)
Priyanka Dutta
2013-07-01
Full Text Available Performance responsiveness and scalability is a make-or-break quality for software. Nearly everyone runsinto performance problems at one time or another. This paper discusses about performance issues facedduring Pre Examination Process Automation System (PEPAS implemented in java technology. Thechallenges faced during the life cycle of the project and the mitigation actions performed. It compares 3java technologies and shows how improvements are made through statistical analysis in response time ofthe application. The paper concludes with result analysis.
Institute of Scientific and Technical Information of China (English)
冯玉强; 马兴辉
2005-01-01
针对SAS统计分析系统局限于单机应用的现状,提出了SAS与Web服务器相结合的方式来弥补SAS系统在Web环境下应用的不足,介绍了利用SAS/IntrNet实现SAS的Web应用的基本原理,给出了SAS/IntrNet程序处理的一般流程和相关配置文件的详细配置方法,解决了SAS/IntrNet系统实现中的几个关键问题.同时给出这一应用的具体实例--网上谈判数据分析系统的具体实现方法.
Institute of Scientific and Technical Information of China (English)
鄢金柱; 李胜; 翁鸿; 曾宪涛
2014-01-01
SAS system is considered as one of international standard software systems in the field of data processing and statistics, however, it required the user has higher capability due to its complex and difficult programming. For well applying SAS for conducting a Meta-analysis, Stephen Senna and his colleagues developed a suite of macros for the Meta-analysis of binary data. These macros will realize the Meta-analysis of binary data conveniently and can generate the forest plot, funnel plot, Galbraith plot, Q-Q plot and other graphs. The authors introduced in the paper how to use these SAS macros for conducting the Meta-analysis of binary data.%SAS被誉为国际上数据处理和统计分析领域的标准软件系统，但其程序编写复杂且难度较大，对使用者的要求较高。为了更好的使用SAS软件实现Meta分析，Stephen Senna等人编写了一套专用于二分类资料的Meta分析的宏命令。这套宏语言可以方便的实现Meta分析，绘制森林图、漏斗图、加尔布雷斯图、Q-Q正态分布图等图形。本文结合实例对应用SAS宏语言实现二分类数据Meta分析的方法进行详细介绍。
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Tools for Basic Statistical Analysis
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
AnovArray: a set of SAS macros for the analysis of variance of gene expression data
Directory of Open Access Journals (Sweden)
Renard Jean-Paul
2005-06-01
Full Text Available Abstract Background Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the anova model is the possibility to evaluate multiple sources of variation in an experiment. Results AnovArray is a package implementing ANOVA for gene expression data using SAS® statistical software. The originality of the package is 1 to quantify the different sources of variation on all genes together, 2 to provide a quality control of the model, 3 to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. Conclusion AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS® statistical software.
Statistical Analysis of Protein Ensembles
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Statistical Analysis of Protein Ensembles
Directory of Open Access Journals (Sweden)
Gabriell eMáté
2014-04-01
Full Text Available As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Parametric statistical change point analysis
Chen, Jie
2000-01-01
This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several The exposition is clear and systematic, with a great deal of introductory material included Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models Extensive examples throughout the text emphasize key concepts and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches A comprehensive bibliography and two indices complete the study
基于SAS的Web使用日志用户聚类分析%User Cluster Analysis of Web Usages Logs Based on SAS
Institute of Scientific and Technical Information of China (English)
欧阳烽
2013-01-01
The user cluster Analysis of Web Usages Logs based on SAS is the data of Web Usages Logs for data conversion, get-ting the user transaction table which is pre-formed after the handling of the corresponding data, then making the user cluster anal-ysis through the SAS data mining tools. Digital resources are reasonably procured and managed according to the demands of dif-ferent users on digital resources. Personalized services can be provided for different users.%基于SAS的Web使用日志用户聚类分析，即通过SAS数据挖掘工具将由Web使用日志数据经过数据转换和数据预处理后形成的用户事务表数据运用不同的方法进行聚类分析，以达到根据不同类别用户的需求对数字资源进行合理的采购和管理，为用户提供个性化服务的目的。
SPSS and SAS programming for the testing of mediation models.
Dudley, William N; Benuzillo, Jose G; Carrico, Mineh S
2004-01-01
Mediation modeling can explain the nature of the relation among three or more variables. In addition, it can be used to show how a variable mediates the relation between levels of intervention and outcome. The Sobel test, developed in 1990, provides a statistical method for determining the influence of a mediator on an intervention or outcome. Although interactive Web-based and stand-alone methods exist for computing the Sobel test, SPSS and SAS programs that automatically run the required regression analyses and computations increase the accessibility of mediation modeling to nursing researchers. To illustrate the utility of the Sobel test and to make this programming available to the Nursing Research audience in both SAS and SPSS. The history, logic, and technical aspects of mediation testing are introduced. The syntax files sobel.sps and sobel.sas, created to automate the computation of the regression analysis and test statistic, are available from the corresponding author. The reported programming allows the user to complete mediation testing with the user's own data in a single-step fashion. A technical manual included with the programming provides instruction on program use and interpretation of the output. Mediation modeling is a useful tool for describing the relation between three or more variables. Programming and manuals for using this model are made available.
Statistical Analysis of Tsunami Variability
Zolezzi, Francesca; Del Giudice, Tania; Traverso, Chiara; Valfrè, Giulio; Poggi, Pamela; Parker, Eric J.
2010-05-01
similar to that seen in ground motion attenuation correlations used for seismic hazard assessment. The second issue was intra-event variability. This refers to the differences in tsunami wave run-up along a section of coast during a single event. Intra-event variability investigated directly considering field observations. The tsunami events used in the statistical evaluation were selected on the basis of the completeness and reliability of the available data. Tsunami considered for the analysis included the recent and well surveyed tsunami of Boxing Day 2004 (Great Indian Ocean Tsunami), Java 2006, Okushiri 1993, Kocaeli 1999, Messina 1908 and a case study of several historic events in Hawaii. Basic statistical analysis was performed on the field observations from these tsunamis. For events with very wide survey regions, the run-up heights have been grouped in order to maintain a homogeneous distance from the source. Where more than one survey was available for a given event, the original datasets were maintained separately to avoid combination of non-homogeneous data. The observed run-up measurements were used to evaluate the minimum, maximum, average, standard deviation and coefficient of variation for each data set. The minimum coefficient of variation was 0.12 measured for the 2004 Boxing Day tsunami at Nias Island (7 data) while the maximum is 0.98 for the Okushiri 1993 event (93 data). The average coefficient of variation is of the order of 0.45.
SASWeave: Literate Programming Using SAS
Directory of Open Access Journals (Sweden)
Russell V. Lenth
2007-05-01
Full Text Available SASweave is a collection of scripts that allow one to embed SAS code into a LATEX document, and automatically incorporate the results as well. SASweave is patterned after Sweave, which does the same thing for code written in R. In fact, a document may contain both SAS and R code. Besides the convenience of being able to easily incorporate SAS examples in a document, SASweave facilitates the concept of “literate programming”: having code, documentation, and results packaged together. Among other things, this helps to ensure that the SAS output in the document is in concordance with the code.
Mahmoudi, Shima; Keshavarz, Hossein
2017-01-06
Although vaccines would be the ideal tool for control, prevention, elimination, and eradication of many infectious diseases, developing of parasites vaccines such as malaria vaccine is very complex. The most advanced malaria vaccine candidate is RTS,S, a pre-erythrocytic vaccine for which pivotal phase III trial design is underway. Few recent malaria vaccine review articles have attempted to outline of all clinical trials that have occurred globally and no meta-analysis was performed on efficacy of Phase 3 Trial of RTS, S/AS01 Malaria vaccine up to now in infants. Therefore, a systematic review and meta-analysis was carried out to review new and existing data on efficacy of Phase 3 Trial of RTS, S/AS01 Malaria Vaccine in infants. The electronic databases searched were Pubmed (1965-present) and Web of Science (1970-present) (Search date: May, 2016). After full-text review of the papers evaluating clinical/severe malaria in several well-designed phase III field efficacy trials, 5 were determined to meet the eligibility criteria for inclusion in the systematic review. Four out of the 5 publications dealing with efficacy of Phase 3 Trial of RTS, S/AS01 malaria vaccine were included in the qualitative analysis. Pooled estimate of vaccine efficacy in clinical and severe malaria in children aged 5-17 mo was 29% (95% CL: 19%-46%) and 39% (95% CI 20%-74%), while this estimate vaccine in clinical and severe malaria in children aged 6-12 mo was 19% (95% CI 14%-24%) and 21 (95% CI 19%-37%), respectively. On the other hand, higher VE was seen in both per- protocol and intention-to-treat population in children aged 5-17 than the children aged 6-12 mo. The results of this meta-analysis suggest that this candidate malaria vaccine has relatively little efficacy, and the vaccine apparently will not meet the goal of malaria eradication by itself.
Statistical analysis of management data
Gatignon, Hubert
2013-01-01
This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.
SASWeave: Literate Programming Using SAS
DEFF Research Database (Denmark)
Lenth, Russell V; Højsgaard, Søren
2007-01-01
the convenience of being able to easily incorporate SAS examples in a document, SASweave facilitates the concept of "literate programming": having code, documentation, and results packaged together. Among other things, this helps to ensure that the SAS output in the document is in concordance with the code...
SASWeave: Literate Programming Using SAS
DEFF Research Database (Denmark)
Lenth, Russell V; Højsgaard, Søren
2007-01-01
SASweave is a collection of scripts that allow one to embed SAS code into a LATEX document, and automatically incorporate the results as well. SASweave is patterned after Sweave, which does the same thing for code written in R. In fact, a document may contain both SAS and R code. Besides the conv...
Modeling binary correlated responses using SAS, SPSS and R
Wilson, Jeffrey R
2015-01-01
Statistical tools to analyze correlated binary data are spread out in the existing literature. This book makes these tools accessible to practitioners in a single volume. Chapters cover recently developed statistical tools and statistical packages that are tailored to analyzing correlated binary data. The authors showcase both traditional and new methods for application to health-related research. Data and computer programs will be publicly available in order for readers to replicate model development, but learning a new statistical language is not necessary with this book. The inclusion of code for R, SAS, and SPSS allows for easy implementation by readers. For readers interested in learning more about the languages, though, there are short tutorials in the appendix. Accompanying data sets are available for download through the book s website. Data analysis presented in each chapter will provide step-by-step instructions so these new methods can be readily applied to projects. Researchers and graduate stu...
SAS-2 galactic gamma ray results, 1
Thompson, D. J.; Fichtel, C. E.; Hartman, R. C.; Kniffen, D. A.; Bignami, G. F.; Lamb, R. C.; Oegelman, H.; Oezel, M. E.; Tuemer, T.
1976-01-01
Continuing analysis of the data from the SAS-2 high energy gamma-ray experiment has produced an improved picture of the sky at photon energies above 35 MeV. On a large scale, the diffuse emission from the galactic plane is the dominant feature observed by SAS-2. This galactic plane emission is most intense between galactic longitude 310 and 45 deg, corresponding to a region within 7kpc of the galactic center. Within the high-intensity region, SAS-2 observes peaks around galactic longitudes 315 deg, 330 deg, 345 deg, 0 deg, and 35 deg. These peaks appear to be correlated with such galactic features and components as molecular hydrogen, atomic hydrogen, magnetic fields, cosmic ray concentrations, and photon fields.
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model of stock markets based on the contact process of the statistical physics systems is presented in this paper, where the contact model is a continuous time Markov process, one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of SSE Composite Index and the data of SZSE Component Index are analyzed, and the corresponding simulation is made by the computer computation. Further, we investigate the statistical properties, fat-tail phenomena, the power-law distributions, and the long memory of returns for these indices. The techniques of skewness-kurtosis test, Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characters of the stock price returns.
Vapor Pressure Data Analysis and Statistics
2016-12-01
VAPOR PRESSURE DATA ANALYSIS AND STATISTICS ECBC-TR-1422 Ann Brozena RESEARCH AND TECHNOLOGY DIRECTORATE...DATE XX-12-2016 2. REPORT TYPE Final 3. DATES COVERED (From - To) Nov 2015 – Apr 2016 4. TITLE Vapor Pressure Data Analysis and Statistics 5a...1 VAPOR PRESSURE DATA ANALYSIS AND STATISTICS 1. INTRODUCTION Knowledge of the vapor pressure of materials as a function of temperature is
A Statistical Analysis of Cryptocurrencies
Directory of Open Access Journals (Sweden)
Stephen Chan
2017-05-01
Full Text Available We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization, of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal, however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
Asymptotic modal analysis and statistical energy analysis
Dowell, Earl H.
1988-07-01
Statistical Energy Analysis (SEA) is defined by considering the asymptotic limit of Classical Modal Analysis, an approach called Asymptotic Modal Analysis (AMA). The general approach is described for both structural and acoustical systems. The theoretical foundation is presented for structural systems, and experimental verification is presented for a structural plate responding to a random force. Work accomplished subsequent to the grant initiation focusses on the acoustic response of an interior cavity (i.e., an aircraft or spacecraft fuselage) with a portion of the wall vibrating in a large number of structural modes. First results were presented at the ASME Winter Annual Meeting in December, 1987, and accepted for publication in the Journal of Vibration, Acoustics, Stress and Reliability in Design. It is shown that asymptotically as the number of acoustic modes excited becomes large, the pressure level in the cavity becomes uniform except at the cavity boundaries. However, the mean square pressure at the cavity corner, edge and wall is, respectively, 8, 4, and 2 times the value in the cavity interior. Also it is shown that when the portion of the wall which is vibrating is near a cavity corner or edge, the response is significantly higher.
DEFF Research Database (Denmark)
Milhøj, Anders
2009-01-01
Den vigtigste i SAS version 9.2 indenfor grafik er mulighederne for at producere dokumentationsgrafik i mange statistiske procedurer. Det drejer sig især om grafikker til modelkontrol, fx. residualdiagrammer, influensplot, normalfraktildiagrammer for residualerne etc.......Den vigtigste i SAS version 9.2 indenfor grafik er mulighederne for at producere dokumentationsgrafik i mange statistiske procedurer. Det drejer sig især om grafikker til modelkontrol, fx. residualdiagrammer, influensplot, normalfraktildiagrammer for residualerne etc....
Common pitfalls in statistical analysis: Logistic regression.
Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh
2017-01-01
Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
Towards a Judgement-Based Statistical Analysis
Gorard, Stephen
2006-01-01
There is a misconception among social scientists that statistical analysis is somehow a technical, essentially objective, process of decision-making, whereas other forms of data analysis are judgement-based, subjective and far from technical. This paper focuses on the former part of the misconception, showing, rather, that statistical analysis…
Multistructure Statistical Model Applied To Factor Analysis
Bentler, Peter M.
1976-01-01
A general statistical model for the multivariate analysis of mean and covariance structures is described. Matrix calculus is used to develop the statistical aspects of one new special case in detail. This special case separates the confounding of principal components and factor analysis. (DEP)
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
Analysis of Correlates in SAS.SDS and MMPI of Stutters%口吃患者MMPI与SAS、SDS的相关分析
Institute of Scientific and Technical Information of China (English)
刘盈; 丁宝坤; 肖可畏; 石文惠; 李晓东; 王晓武; 孙晓敏
2001-01-01
Objective:To investigate the personality feature of stuttering. Methods: SAS, SDS, MMPI were administered to 108 patients with stuttering, 110 normal controls, 103 patients with OCD, 137 patients with depression. Results: The scores of F, K, D, Mf, Pa, Pt, Sc, Si in stuttering group were significantly differen from OCD and depression groups. The scores of SAS were positively correlated with the score of F, D, Ht, Pd, Pt, Sc, Si. Conclusion: The personality feature of stuttering were depressive and compulsive personality.
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
Institute of Scientific and Technical Information of China (English)
杨舒静; 刘沛
2014-01-01
目的 介绍统计分析软件(statistical analysis system,SAS)网络化模块(SAS/IntrNet)的基本功能并应用于医学实际.方法 应用SAS/IntrNet远程调用SAS程序进行在线分析,从客户端获取参数通过网络传递给服务器端的SAS程序进行分析,并反馈结果给客户端.结果 利用SAS/IntrNet程序处理流程和相关配置文件,成功地实现了在线膳食暴露评估分析实例,使得用户可以在不安装SAS软件的情况下方便地通过网页进行在线SAS统计分析.结论 利用SAS/IntrNet模块实现SAS网络化应用,可使不甚了解SAS编程的使用者通过友好的访问界面即可实现SAS的在线统计分析,大大提高了SAS系统信息共享的能力.
Explorations in Statistics: The Analysis of Change
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Analysis of Preference Data Using Intermediate Test Statistic Abstract
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... Intermediate statistic is a link between Friedman test statistic and the multinomial statistic. The statistic is ... The null hypothesis Ho .... [7] Taplin, R.H., The Statistical Analysis of Preference Data, Applied Statistics, No. 4, pp.
Hypothesis testing and statistical analysis of microbiome
Directory of Open Access Journals (Sweden)
Yinglin Xia
2017-09-01
Full Text Available After the initiation of Human Microbiome Project in 2008, various biostatistic and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the current available statistic tools and highlight recent progress of newly developed statistical methods and models. Given the current challenges and limitations in biostatistic approaches and tools, we discuss the future direction in developing statistical methods and models for the microbiome studies.
Statistical Analysis of English Noun Suffixes
Institute of Scientific and Technical Information of China (English)
王惠灵
2014-01-01
This study discusses the origin of English noun suffixes,including Old English,Latin,French and Greek.A statistical analysis of some typical noun suffixes is followed in two corpora,which focuses on their frequency and distribution.
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
Statistical Analysis Of Reconnaissance Geochemical Data From Orle District, ... The univariate methods used include frequency distribution and cumulative ... The possible mineral potential of the area include base metals (Pb, Zn, Cu, Mo, etc.) ...
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded `Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be in used in the same document. (2) Documents can be prepared using OpenOffice or ......This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be in used in the same document. (2) Documents can be prepared using Open......Office or \\LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....
Statistical Models and Methods for Network Meta-Analysis.
Madden, L V; Piepho, H-P; Paul, P A
2016-08-01
Meta-analysis, the methodology for analyzing the results from multiple independent studies, has grown tremendously in popularity over the last four decades. Although most meta-analyses involve a single effect size (summary result, such as a treatment difference) from each study, there are often multiple treatments of interest across the network of studies in the analysis. Multi-treatment (or network) meta-analysis can be used for simultaneously analyzing the results from all the treatments. However, the methodology is considerably more complicated than for the analysis of a single effect size, and there have not been adequate explanations of the approach for agricultural investigations. We review the methods and models for conducting a network meta-analysis based on frequentist statistical principles, and demonstrate the procedures using a published multi-treatment plant pathology data set. A major advantage of network meta-analysis is that correlations of estimated treatment effects are automatically taken into account when an appropriate model is used. Moreover, treatment comparisons may be possible in a network meta-analysis that are not possible in a single study because all treatments of interest may not be included in any given study. We review several models that consider the study effect as either fixed or random, and show how to interpret model-fitting output. We further show how to model the effect of moderator variables (study-level characteristics) on treatment effects, and present one approach to test for the consistency of treatment effects across the network. Online supplemental files give explanations on fitting the network meta-analytical models using SAS.
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
%HPGLIMMIX: A High-Performance SAS Macro for GLMM Estimation
Directory of Open Access Journals (Sweden)
Liang Xie
2014-06-01
Full Text Available Generalized linear mixed models (GLMMs comprise a class of widely used statistical tools for data analysis with fixed and random effects when the response variable has a conditional distribution in the exponential family. GLMM analysis also has a close relationship with actuarial credibility theory. While readily available programs such as the GLIMMIX procedure in SAS and the lme4 package in R are powerful tools for using this class of models, these progarms are not able to handle models with thousands of levels of fixed and random effects. By using sparse-matrix and other high performance techniques, procedures such as HPMIXED in SAS can easily fit models with thousands of factor levels, but only for normally distributed response variables. In this paper, we present the %HPGLIMMIX SAS macro that fits GLMMs with large number of sparsely populated design matrices using the doubly-iterative linearization (pseudo-likelihood method, in which the sparse-matrix-based HPMIXED is used for the inner iterations with the pseudo-variable constructed from the inverse-link function and the chosen model. Although the macro does not have the full functionality of the GLIMMIX procedure, time and memory savings can be large with the new macro. In applications in which design matrices contain many zeros and there are hundreds or thousands of factor levels, models can be fitted without exhausting computer memory, and 90% or better reduction in running time can be observed. Examples with a Poisson, binomial, and gamma conditional distribution are presented to demonstrate the usage and efficiency of this macro.
SAS formats are a very powerful tool. They allow you to display the data in a more readable manner without modifying it. Formats can also be used to group data into categories for use in various procedures like PROC FREQ, PROC TTEST, and PROC MEANS (as a class variable). As ...
Schmidt decomposition and multivariate statistical analysis
Bogdanov, Yu. I.; Bogdanova, N. A.; Fastovets, D. V.; Luckichev, V. F.
2016-12-01
The new method of multivariate data analysis based on the complements of classical probability distribution to quantum state and Schmidt decomposition is presented. We considered Schmidt formalism application to problems of statistical correlation analysis. Correlation of photons in the beam splitter output channels, when input photons statistics is given by compound Poisson distribution is examined. The developed formalism allows us to analyze multidimensional systems and we have obtained analytical formulas for Schmidt decomposition of multivariate Gaussian states. It is shown that mathematical tools of quantum mechanics can significantly improve the classical statistical analysis. The presented formalism is the natural approach for the analysis of both classical and quantum multivariate systems and can be applied in various tasks associated with research of dependences.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasheets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
A SAS/IML algorithm for an exact permutation test
Directory of Open Access Journals (Sweden)
Neuhäuser, Markus
2009-03-01
Full Text Available An algorithm written in SAS/IML is presented that can perform an exact permutation test for a two-sample comparison. All possible permutations are considered. The Baumgartner-Weiß-Schindler statistic is exemplarily used as the test statistic for the permutation test.
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
SAS统计分析软件模块揽要%Profiles on Modules of SAS
Institute of Scientific and Technical Information of China (English)
黄伟; 黄瑜
2008-01-01
SAS(Statistics Analysis System)是20世纪60年代初美国SAS公司研发的统计分析软件.SAS与SPSS及BMDP并称为国际上最有知名度的三大统计软件,如今,在我国社会科学乃至自然科学领域已得到了广泛应用.
Instant Replay: Investigating statistical Analysis in Sports
Sidhu, Gagan
2011-01-01
Technology has had an unquestionable impact on the way people watch sports. As technology has evolved, so too has the knowledge of a casual sports fan. A direct result of this evolution is the amount of statistical analysis in sport. The goal of statistical analysis in sports is a simple one: to eliminate subjective analysis. Over the past four decades, statistics have slowly pervaded the viewing experience of sports. In this paper, we analyze previous work that proposed metrics and models that seek to evaluate various aspects of sports. The unifying goal of these works is an accurate representation of either the player or sport. We also look at work that investigates certain situations and their impact on the outcome a game. We conclude this paper with the discussion of potential future work in certain areas of sport..
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull....... The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...... fits to the data available, especially if tail fits are used whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked,to improve the readabili...
About Statistical Analysis of Qualitative Survey Data
Directory of Open Access Journals (Sweden)
Stefan Loehnert
2010-01-01
Full Text Available Gathered data is frequently not in a numerical form allowing immediate appliance of the quantitative mathematical-statistical methods. In this paper are some basic aspects examining how quantitative-based statistical methodology can be utilized in the analysis of qualitative data sets. The transformation of qualitative data into numeric values is considered as the entrance point to quantitative analysis. Concurrently related publications and impacts of scale transformations are discussed. Subsequently, it is shown how correlation coefficients are usable in conjunction with data aggregation constrains to construct relationship modelling matrices. For illustration, a case study is referenced at which ordinal type ordered qualitative survey answers are allocated to process defining procedures as aggregation levels. Finally options about measuring the adherence of the gathered empirical data to such kind of derived aggregation models are introduced and a statistically based reliability check approach to evaluate the reliability of the chosen model specification is outlined.
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
Statistical Analysis of Random Simulations : Bootstrap Tutorial
Deflandre, D.; Kleijnen, J.P.C.
2002-01-01
The bootstrap is a simple but versatile technique for the statistical analysis of random simulations.This tutorial explains the basics of that technique, and applies it to the well-known M/M/1 queuing simulation.In that numerical example, different responses are studied.For some responses, bootstrap
Bayesian Statistics for Biological Data: Pedigree Analysis
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
The use of Bayes' formula is applied to the biological problem of pedigree analysis to show that the Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First year college students of biology can be introduced to the Bayesian statistics.
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
SAS4 and SAS5 are locus-specific regulators of silencing in Saccharomyces cerevisiae.
Xu, E. Y; S. Kim; Rivier, D H
1999-01-01
Sir2p, Sir3p, Sir4p, and the core histones form a repressive chromatin structure that silences transcription in the regions near telomeres and at the HML and HMR cryptic mating-type loci in Saccharomyces cerevisiae. Null alleles of SAS4 and SAS5 suppress silencing defects at HMR; therefore, SAS4 and SAS5 are negative regulators of silencing at HMR. This study revealed that SAS4 and SAS5 contribute to silencing at HML and the telomeres, indicating that SAS4 and SAS5 are positive regulators of ...
Energy Technology Data Exchange (ETDEWEB)
Massara, S.; Schmitt, D.; Bretault, A.; Lemasson, D.; Darmet, G.; Verwaerde, D. [EDF R and D, 1, Avenue du General de Gaulle, 92141 Clamart (France); Struwe, D.; Pfrang, W.; Ponomarev, A. [Karlsruher Institut fuer Technologie KIT, Institut fuer Neutronenphysik und Reaktortechnik INR, Hermann-von-Helmholtz-Platz 1, Gebaude 521, 76344 Eggenstein-Leopoldshafen (Germany)
2012-07-01
In the framework of a substantial improvement on FBR core safety connected to the development of a new Gen IV reactor type, heterogeneous core with innovative features are being carefully analyzed in France since 2009. At EDF R and D, the main goal is to understand whether a strong reduction of the Na-void worth - possibly attempting a negative value - allows a significant improvement of the core behavior during an unprotected loss of flow accident. Also, the physical behavior of such a core is of interest, before and beyond the (possible) onset of Na boiling. Hence, a cutting-edge heterogeneous design, featuring an annular shape, a Na-plena with a B{sub 4}C plate and a stepwise modulation of fissile core heights, was developed at EDF by means of the SDDS methodology, with a total Na-void worth of -1 $. The behavior of such a core during the primary phase of a severe accident, initiated by an unprotected loss of flow, is analyzed by means of the SAS-SFR code. This study is carried-out at KIT and EDF, in the framework of a scientific collaboration on innovative FBR severe accident analyses. The results show that the reduction of the Na-void worth is very effective, but is not sufficient alone to avoid Na-boiling and, hence, to prevent the core from entering into the primary phase of a severe accident. Nevertheless, the grace time up to boiling onset is greatly enhanced in comparison to a more traditional homogeneous core design, and only an extremely low fraction of the fuel (<0.1%) enters into melting at the end of this phase. A sensitivity analysis shows that, due to the inherent neutronic characteristics of such a core, the gagging scheme plays a major role on the core behavior: indeed, an improved 4-zones gagging scheme, associated with an enhanced control rod drive line expansion feed-back effect, finally prevents the core from entering into sodium boiling. This major conclusion highlights both the progress already accomplished and the need for more detailed
Statistical Analysis of Thermal Analysis Margin
Garrison, Matthew B.
2011-01-01
NASA Goddard Space Flight Center requires that each project demonstrate a minimum of 5 C margin between temperature predictions and hot and cold flight operational limits. The bounding temperature predictions include worst-case environment and thermal optical properties. The purpose of this work is to: assess how current missions are performing against their pre-launch bounding temperature predictions and suggest any possible changes to the thermal analysis margin rules
Multilevel models applications using SAS
Wang, Jichuan; Fisher, James F
2011-01-01
This book covers a broad range of topics about multilevel modeling. The goal is to help readers to understand the basic concepts, theoretical frameworks, and application methods of multilevel modeling. It is at a level also accessible to non-mathematicians, focusing on the methods and applications of various multilevel models and using the widely used statistical software SAS®. Examples are drawn from analysis of real-world research data.
Multilevel Models Applications Using SAS
Wang, Jichuan; Fisher, James
2011-01-01
This book covers a broad range of topics about multilevel modeling. The goal is to help readersto understand the basic concepts, theoretical frameworks, and application methods of multilevel modeling. Itis at a level also accessible to non-mathematicians, focusing on the methods and applications of various multilevel models and using the widely used statistical software SAS®.Examples are drawn from analysis of real-world research data.
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
Statistical Tools for Forensic Analysis of Toolmarks
Energy Technology Data Exchange (ETDEWEB)
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a ''degree of association'' between matches of similarly produced toolmarks. The basis for statistical method development relies on ''discriminating criteria'' that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
Statistical Analysis of Iberian Peninsula Megaliths Orientations
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, lack a sound understanding. A way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is done with simple statistical variables and a mere comparison between the different areas. In order to minimise the subjectivity in the process a further more complicated analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
Multivariate analysis: A statistical approach for computations
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a type of multivariate statistical approach commonly used in, automotive diagnosis, education evaluating clusters in finance etc and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval method and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult due to the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomaly behaviors in the network include the various attacks on the network like DDOs attacks and network scanning.
Statistical quality control through overall vibration analysis
Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions such as chi squared, Shapiro-Wilks, symmetry, Kurtosis, Cochran, Hartlett, and Hartley and Krushal-Wallis. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence
Energy Technology Data Exchange (ETDEWEB)
Kruessmann, R., E-mail: regina.kruessmann@kit.edu [Karlsruhe Institute of Technology, Institute of Neutron Physics and Reactor Technology INR, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen (Germany); Ponomarev, A.; Pfrang, W.; Struwe, D. [Karlsruhe Institute of Technology, Institute of Neutron Physics and Reactor Technology INR, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen (Germany); Champigny, J.; Carluec, B. [AREVA, 10, rue J. Récamier, 69456 Lyon Cedex 06 (France); Schmitt, D.; Verwaerde, D. [EDF R& D, 1 avenue du général de Gaulle, 92140 Clamart (France)
2015-04-15
Highlights: • Comparison of different core designs for a sodium-cooled fast reactor. • Safety assessment with the code system SAS-SFR. • Unprotected Loss of Flow (ULOF) scenario. • Sodium boiling and core melting cannot be avoided. • A net negative Na void effect provides more grace time prior to local SA destruction. - Abstract: In the framework of cooperation agreements between KIT-INR and AREVA SAS NP as well as between KIT-INR and EDF R&D in the years 2008–2013, the evaluation of severe transient behavior in sodium-cooled fast reactors (SFRs) was investigated. In Part I of this contribution, the efficiency of newly conceived prevention and mitigation measures was investigated for unprotected loss-of-flow (ULOF), unprotected loss-of-heat-sink (ULOHS) and the unprotected transient-overpower (UTOP) transients. In this second part, consequence analyses were performed for the initiation phase of different unprotected loss-of-flow (ULOF) scenarios imposed on a variety of different core design options of SFRs. The code system SAS-SFR was used for this purpose. Results of analyses for cases postulating unavailability of prevention measures as shut-down systems, passive and/or active additional devices show that entering into an energetic power excursion as a consequence of the initiation phase of a ULOF cannot be avoided for those core designs with a cumulative void reactivity feedback larger than zero. However, even for core designs aiming at values of the void reactivity less than zero it is difficult to find system design characteristics which prevent the transient entering into partial core destruction. Further studies of the transient core and system behavior would require codes dedicated to specific aspects of transition phase analyses and of in-vessel material relocation analyses.
Statistical mechanics analysis of sparse data.
Habeck, Michael
2011-03-01
Inferential structure determination uses Bayesian theory to combine experimental data with prior structural knowledge into a posterior probability distribution over protein conformational space. The posterior distribution encodes everything one can say objectively about the native structure in the light of the available data and additional prior assumptions and can be searched for structural representatives. Here an analogy is drawn between the posterior distribution and the canonical ensemble of statistical physics. A statistical mechanics analysis assesses the complexity of a structure calculation globally in terms of ensemble properties. Analogs of the free energy and density of states are introduced; partition functions evaluate the consistency of prior assumptions with data. Critical behavior is observed with dwindling restraint density, which impairs structure determination with too sparse data. However, prior distributions with improved realism ameliorate the situation by lowering the critical number of observations. An in-depth analysis of various experimentally accessible structural parameters and force field terms will facilitate a statistical approach to protein structure determination with sparse data that avoids bias as much as possible.
Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
Statistical analysis of sleep spindle occurrences.
Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr
2013-01-01
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
Statistical error analysis of reactivity measurement
Energy Technology Data Exchange (ETDEWEB)
Thammaluckan, Sithisak; Hah, Chang Joo [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)
2013-10-15
After statistical analysis, it was confirmed that each group were sampled from same population. It is observed in Table 7 that the mean error decreases as core size increases. Application of bias factor obtained from this research reduces mean error further. The point kinetic model had been used to measure control rod worth without 3D spatial information of neutron flux or power distribution, which causes inaccurate result. Dynamic Control rod Reactivity Measurement (DCRM) was employed to take into account of 3D spatial information of flux in the point kinetics model. The measured bank worth probably contains some uncertainty such as methodology uncertainty and measurement uncertainty. Those uncertainties may varies with size of core and magnitude of reactivity. The goal of this research is to investigate the effect of core size and magnitude of control rod worth on the error of reactivity measurement using statistics.
Clinical SAS programming in India: A study of industry needs versus wants
Directory of Open Access Journals (Sweden)
Nithiyanandhan Ananthakrishnan
2014-01-01
Full Text Available Background: The clinical SAS (www.sas.com programming industry, in India, has seen a rapid growth in the last decade and the trend seems set to continue, for the next couple of years, due to cost advantage and the availability of skilled labor. On one side the industry needs are focused on less execution time, high margins, segmented tasks and the delivery of high quality output with minimal oversight. On the other side, due to the increased demand for skilled resources, the wants of the programmers have taken a different shift toward diversifying exposure, unsustainable wage inflation due to multiple opportunities and generally high expectations around career progression. If the industry needs are not going to match with programmers want, or vice versa, then there is the possibility that the current year on year growth may start to slow or even go into decline. Aim: This paper is intended to identify the gap between wants and need and puts forwards some suggestions, for both sides, in ways to change the equation to benefit all. Settings and Design: Questionnaire on similar themes created to survey managers and programmers working in clinical SAS programming industry and was surveyed online to collect their perspectives. Their views are compared for each theme and presented as results. Materials and Methods: Two surveys were created in www.surveymonkey.com. Management: https://www.surveymonkey.com/s/SAS_India_managment_needvswant_survey. Programmer: https://www.surveymonkey.com/s/SAS_India_programmer_needvswant_survey. Statistical Analysis Used: Bar chart and pie chart used on data collect to show segmentation of data. Results and Conclusions: In conclusion, it seeks to highlight the future industry direction and the skillset that existing programmers need to have, in order to sustain the momentum and remain competitive, to contribute to the future pipeline and the development of the profession in India.
On quantum statistics in data analysis
Pavlovic, Dusko
2008-01-01
Originally, quantum probability theory was developed to analyze statistical phenomena in quantum systems, where classical probability theory does not apply, because the lattice of measurable sets is not necessarily distributive. On the other hand, it is well known that the lattices of concepts, that arise in data analysis, are in general also non-distributive, albeit for completely different reasons. In his recent book, van Rijsbergen argues that many of the logical tools developed for quantum systems are also suitable for applications in information retrieval. I explore the mathematical support for this idea on an abstract vector space model, covering several forms of data analysis (information retrieval, data mining, collaborative filtering, formal concept analysis...), and roughly based on an idea from categorical quantum mechanics. It turns out that quantum (i.e., noncommutative) probability distributions arise already in this rudimentary mathematical framework. Moreover, a Bell-type inequality is formula...
SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.
Weaver, Bruce; Wuensch, Karl L
2013-09-01
Several procedures that use summary data to test hypotheses about Pearson correlations and ordinary least squares regression coefficients have been described in various books and articles. To our knowledge, however, no single resource describes all of the most common tests. Furthermore, many of these tests have not yet been implemented in popular statistical software packages such as SPSS and SAS. In this article, we describe all of the most common tests and provide SPSS and SAS programs to perform them. When they are applicable, our code also computes 100 × (1 - α)% confidence intervals corresponding to the tests. For testing hypotheses about independent regression coefficients, we demonstrate one method that uses summary data and another that uses raw data (i.e., Potthoff analysis). When the raw data are available, the latter method is preferred, because use of summary data entails some loss of precision due to rounding.
Multiscale statistical analysis of coronal solar activity
Gamborino, Diana; Martinell, Julio J
2016-01-01
Multi-filter images from the solar corona are used to obtain temperature maps which are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions we show that the multiscale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport is also be extracted from the analysis.
Rogala, Kacper B; Dynes, Nicola J; Hatzopoulos, Georgios N; Yan, Jun; Pong, Sheng Kai; Robinson, Carol V; Deane, Charlotte M; Gönczy, Pierre; Vakonakis, Ioannis
2015-05-29
Centrioles are microtubule-based organelles crucial for cell division, sensing and motility. In Caenorhabditis elegans, the onset of centriole formation requires notably the proteins SAS-5 and SAS-6, which have functional equivalents across eukaryotic evolution. Whereas the molecular architecture of SAS-6 and its role in initiating centriole formation are well understood, the mechanisms by which SAS-5 and its relatives function is unclear. Here, we combine biophysical and structural analysis to uncover the architecture of SAS-5 and examine its functional implications in vivo. Our work reveals that two distinct self-associating domains are necessary to form higher-order oligomers of SAS-5: a trimeric coiled coil and a novel globular dimeric Implico domain. Disruption of either domain leads to centriole duplication failure in worm embryos, indicating that large SAS-5 assemblies are necessary for function in vivo.
Statistical Hot Channel Analysis for the NBSR
Energy Technology Data Exchange (ETDEWEB)
Cuadra A.; Baek J.
2014-05-27
A statistical analysis of thermal limits has been carried out for the research reactor (NBSR) at the National Institute of Standards and Technology (NIST). The objective of this analysis was to update the uncertainties of the hot channel factors with respect to previous analysis for both high-enriched uranium (HEU) and low-enriched uranium (LEU) fuels. Although uncertainties in key parameters which enter into the analysis are not yet known for the LEU core, the current analysis uses reasonable approximations instead of conservative estimates based on HEU values. Cumulative distribution functions (CDFs) were obtained for critical heat flux ratio (CHFR), and onset of flow instability ratio (OFIR). As was done previously, the Sudo-Kaminaga correlation was used for CHF and the Saha-Zuber correlation was used for OFI. Results were obtained for probability levels of 90%, 95%, and 99.9%. As an example of the analysis, the results for both the existing reactor with HEU fuel and the LEU core show that CHFR would have to be above 1.39 to assure with 95% probability that there is no CHF. For the OFIR, the results show that the ratio should be above 1.40 to assure with a 95% probability that OFI is not reached.
Statistical energy analysis of similarly coupled systems
Institute of Scientific and Technical Information of China (English)
ZHANG Jian
2002-01-01
Based on the principle of Statistical Energy Analysis (SEA) for non-conservatively coupled dynamical systems under non-correlative or correlative excitations, energy relationship between two similar SEA systems is established in the paper. The energy relationship is verified theoretically and experimentally from two similar SEA systems i.e., the structure of a coupled panel-beam and that of a coupled panel-sideframe, in the cases of conservative coupling and non-conservative coupling respectively. As an application of the method, relationship between noise power radiated from two similar cutting systems is studied. Results show that there are good agreements between the theory and the experiments, and the method is valuable to analysis of dynamical problems associated with a complicated system from that with a simple one.
STATISTICAL ANALYSIS OF PUBLIC ADMINISTRATION PAY
Directory of Open Access Journals (Sweden)
Elena I. Dobrolyubova
2014-01-01
Full Text Available This article reviews the progress achieved inimproving the pay system in public administration and outlines the key issues to be resolved.The cross-country comparisons presented inthe article suggest high differentiation in pay levels depending on position held. In fact,this differentiation in Russia exceeds one in OECD almost twofold The analysis of theinternal pay structure demonstrates that thelow share of the base pay leads to perversenature of ‘stimulation elements’ of the paysystem which in fact appear to be used mostlyfor compensation purposes. The analysis of regional statistical data demonstrates thatdespite high differentiation among regionsin terms of their revenue potential, averagepublic ofﬁcial pay is strongly correlated withthe average regional pay.
Wavelet and statistical analysis for melanoma classification
Nimunkar, Amit; Dhawan, Atam P.; Relue, Patricia A.; Patwardhan, Sachin V.
2002-05-01
The present work focuses on spatial/frequency analysis of epiluminesence images of dysplastic nevus and melanoma. A three-level wavelet decomposition was performed on skin-lesion images to obtain coefficients in the wavelet domain. A total of 34 features were obtained by computing ratios of the mean, variance, energy and entropy of the wavelet coefficients along with the mean and standard deviation of image intensity. An unpaired t-test for a normal distribution based features and the Wilcoxon rank-sum test for non-normal distribution based features were performed for selecting statistically correlated features. For our data set, the statistical analysis of features reduced the feature set from 34 to 5 features. For classification, the discriminant functions were computed in the feature space using the Mahanalobis distance. ROC curves were generated and evaluated for false positive fraction from 0.1 to 0.4. Most of the discrimination functions provided a true positive rate for melanoma of 93% with a false positive rate up to 21%.
DEFF Research Database (Denmark)
Milhøj, Anders
2012-01-01
I foråret 2010 blev den ny SAS version SAS 9.22 lanceret som en opdatering af SAS 9.2, med væsentlige nye muligheder for netop statistik. Her i sommeren 2011 blev version 9.3 så lanceret med endnu flere nyskabelser i både generel performance og specielt indenfor statistik, økonometri og...
Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
Full Text Available The growing relevance of tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1.200 Italian tourists interviewed in 2007 (October - December. Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists preferences underline the existence of a latent demand for foreign holidays which points out a risk of market share reduction for Italian tourism system in the global market. We also find a snow ball effect which helps the most popular destinations, mainly in the northern Italian regions.
Multivariate statistical analysis of wildfires in Portugal
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980 - 2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al, 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many multiple advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011 This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
Statistical analysis of sleep spindle occurrences.
Directory of Open Access Journals (Sweden)
Dagmara Panas
Full Text Available Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
Statistical Analysis of Bus Networks in India
Chatterjee, Atanu; Ramadurai, Gitakrishnan
2015-01-01
Through the past decade the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies which has motivated researchers to model almost every physical system as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains in obscurity. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer some of the basic questions on its evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in \\textit{L}-space, and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topology with the common underlying feature of small-world property. We observe tha...
On two methods of statistical image analysis
Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, KL
1999-01-01
The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition, smooth
Intermediate statistics a modern approach
Stevens, James P
2007-01-01
Written for those who use statistical techniques, this text focuses on a conceptual understanding of the material. It uses definitional formulas on small data sets to provide conceptual insight into what is being measured. It emphasizes the assumptions underlying each analysis, and shows how to test the critical assumptions using SPSS or SAS.
SAS : kosheljok ili proshtshai, Estonian Air!
2008-01-01
Lennukompanii SAS saatis Eesti valitsusele kirja, milles teatab, et on nõus raskustesse sattunud Estonian Airile lisainvesteeringuid tegema ainult siis, kui valitsus müüb SAS-ile oma osaluse. Ettevõtjate, parlamendiliige Marko Mihkelsoni ja ministrite arvamusi
A SAS IML Macro for Loglinear Smoothing
Moses, Tim; von Davier, Alina
2011-01-01
Polynomial loglinear models for one-, two-, and higher-way contingency tables have important applications to measurement and assessment. They are essentially regarded as a smoothing technique, which is commonly referred to as loglinear smoothing. A SAS IML (SAS Institute, 2002a) macro was created to implement loglinear smoothing according to…
Development and validation of a smartphone addiction scale (SAS.
Directory of Open Access Journals (Sweden)
Min Kwon
Full Text Available OBJECTIVE: The aim of this study was to develop a self-diagnostic scale that could distinguish smartphone addicts based on the Korean self-diagnostic program for Internet addiction (K-scale and the smartphone's own features. In addition, the reliability and validity of the smartphone addiction scale (SAS was demonstrated. METHODS: A total of 197 participants were selected from Nov. 2011 to Jan. 2012 to accomplish a set of questionnaires, including SAS, K-scale, modified Kimberly Young Internet addiction test (Y-scale, visual analogue scale (VAS, and substance dependence and abuse diagnosis of DSM-IV. There were 64 males and 133 females, with ages ranging from 18 to 53 years (M = 26.06; SD = 5.96. Factor analysis, internal-consistency test, t-test, ANOVA, and correlation analysis were conducted to verify the reliability and validity of SAS. RESULTS: Based on the factor analysis results, the subscale "disturbance of reality testing" was removed, and six factors were left. The internal consistency and concurrent validity of SAS were verified (Cronbach's alpha = 0.967. SAS and its subscales were significantly correlated with K-scale and Y-scale. The VAS of each factor also showed a significant correlation with each subscale. In addition, differences were found in the job (p<0.05, education (p<0.05, and self-reported smartphone addiction scores (p<0.001 in SAS. CONCLUSIONS: This study developed the first scale of the smartphone addiction aspect of the diagnostic manual. This scale was proven to be relatively reliable and valid.
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
Full Text Available This study refers to the cooling tower of Megalopolis (construction 1975 and protection from corrosive environment. The maintenance of the cooling tower took place in 2008. The cooling tower was badly damaged from corrosion of reinforcement. The parabolic cooling towers (factory of electrical power are a typical example of construction, which has a special aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external side of the cooling tower, a different system of paint application is required. The present study refers to the damages caused by corrosion process. The corrosive environments, the application of this painting, the quality control process, the measures and statistics analysis, and the results were discussed in this study. In the process of quality control the following measurements were taken into consideration: (1 examination of the adhesion with the cross-cut test, (2 examination of the film thickness, and (3 controlling of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations of measurements, analysis of failures in relation to the quality of repair, and rehabilitation of the cooling tower. Also this study made a first attempt to apply the specific corrosion inhibitors in such a large structure.
An R package for statistical provenance analysis
Vermeesch, Pieter; Resentini, Alberto; Garzanti, Eduardo
2016-05-01
This paper introduces provenance, a software package within the statistical programming environment R, which aims to facilitate the visualisation and interpretation of large amounts of sedimentary provenance data, including mineralogical, petrographic, chemical and isotopic provenance proxies, or any combination of these. provenance comprises functions to: (a) calculate the sample size required to achieve a given detection limit; (b) plot distributional data such as detrital zircon U-Pb age spectra as Cumulative Age Distributions (CADs) or adaptive Kernel Density Estimates (KDEs); (c) plot compositional data as pie charts or ternary diagrams; (d) correct the effects of hydraulic sorting on sandstone petrography and heavy mineral composition; (e) assess the settling equivalence of detrital minerals and grain-size dependence of sediment composition; (f) quantify the dissimilarity between distributional data using the Kolmogorov-Smirnov and Sircombe-Hazelton distances, or between compositional data using the Aitchison and Bray-Curtis distances; (e) interpret multi-sample datasets by means of (classical and nonmetric) Multidimensional Scaling (MDS) and Principal Component Analysis (PCA); and (f) simplify the interpretation of multi-method datasets by means of Generalised Procrustes Analysis (GPA) and 3-way MDS. All these tools can be accessed through an intuitive query-based user interface, which does not require knowledge of the R programming language. provenance is free software released under the GPL-2 licence and will be further expanded based on user feedback.
The SAS-3 delayed command system
Hoffman, E. J.
1975-01-01
To meet the requirements arising from the increased complexity of the power, attitude control and telemetry systems, a full redundant high-performance control section with delayed command capability was designed for the Small Astronomy Satellite-3 (SAS-3). The relay command system of SAS-3 is characterized by 56 bystate relay commands, with capability for handling up to 64 commands in future versions. The 'short' data command service of SAS-1 and SAS-2 consisting of shifting 24-bit words to two users was expanded to five users and augmented with a 'long load' data command service (up to 4080 bits) used to program the telemetry system and the delayed command subsystem. The inclusion of a delayed command service ensures a program of up to 30 relay or short data commands to be loaded for execution at designated times. The design and system operation of the SAS-3 command section are analyzed, with special attention given to the delayed command subsystem.
The SAS-3 programmable telemetry system
Peterson, M. R.
1975-01-01
Basic concept, system design and operation principles of the telemetry system developed for the Small Astronomy Satellite-3 (SAS-3) are analyzed. The concept of programmable format selected for the SAS-3 represents an optical combination of the fixed format system of SAS-1 and SAS-2, and the adaptive format concept. The programmable telemetry system permits a very wide range of changes in the data sampling order by a ground control station, depending on the experimental requirements, so that the maximal amount of useful data can be returned from orbit. The programmable system also allows the data format to differ from one spacecraft to another without changing hardware. Attention is given to the command requirements and redundancy of the SAS-3 telemetry system.
SAS-2 galactic gamma-ray results. 1: Diffuse emission
Thompson, D. J.; Fichtel, C. E.; Hartman, R. C.; Kniffen, D. A.; Bignami, G. F.; Lamb, R. C.; Oegelman, H.; Oezel, M. E.; Tuemer, T.
1977-01-01
Continuing analysis of the data from the SAS-2 high energy gamma ray experiment has produced an improved picture of the sky at photon energies above 35 MeV. On a large scale, the diffuse emission from the galactic plane is the dominant feature observed by SAS-2. This galactic plane emission is most intense between galactic longitudes 310 deg and 45 deg, corresponding to a region within 7 kpc of the galactic center. Within the high-intensity region, SAS-2 observes peaks around galactic longitudes 315, 330, 345, 0, and 35 deg. These peaks appear to be correlated with galactic features and components such as molecular hydrogen, atomic hydrogen, magnetic fields, cosmic-ray concentrations, and photon fields.
PSHREG: a SAS macro for proportional and nonproportional subdistribution hazards regression.
Kohl, Maria; Plischke, Max; Leffondré, Karen; Heinze, Georg
2015-02-01
We present a new SAS macro %pshreg that can be used to fit a proportional subdistribution hazards model for survival data subject to competing risks. Our macro first modifies the input data set appropriately and then applies SAS's standard Cox regression procedure, PROC PHREG, using weights and counting-process style of specifying survival times to the modified data set. The modified data set can also be used to estimate cumulative incidence curves for the event of interest. The application of PROC PHREG has several advantages, e.g., it directly enables the user to apply the Firth correction, which has been proposed as a solution to the problem of undefined (infinite) maximum likelihood estimates in Cox regression, frequently encountered in small sample analyses. Deviation from proportional subdistribution hazards can be detected by both inspecting Schoenfeld-type residuals and testing correlation of these residuals with time, or by including interactions of covariates with functions of time. We illustrate application of these extended methods for competing risk regression using our macro, which is freely available at: http://cemsiis.meduniwien.ac.at/en/kb/science-research/software/statistical-software/pshreg, by means of analysis of a real chronic kidney disease study. We discuss differences in features and capabilities of %pshreg and the recent (January 2014) SAS PROC PHREG implementation of proportional subdistribution hazards modelling.
Statistical Analysis of Bus Networks in India
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space, and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns in the networks. We also observe that these networks although, robust and resilient to random attacks are particularly degree-sensitive. Unlike real-world networks, such as Internet, WWW and airline, that are virtual, bus networks are physically constrained. Our findings therefore, throw light on the evolution of such geographically and constrained networks that will help us in designing more efficient bus networks in the future. PMID:27992590
A statistical trend analysis of ozonesonde data
Tiao, G. C.; Pedrick, J. H.; Allenby, G. M.; Reinsel, G. C.; Mateer, C. L.
1986-01-01
A detailed statistical analysis of monthly averages of ozonesond readings is performed to assess trends in ozone in the troposphere and the lower to midstratosphere. Regression time series models, which include seasonal and trend factors, are estimated for 13 stations located mainly in the midlatitudes of the Northern Hemisphere. At each station, trend estimates are calculated for 14 'fractional' Umkehr layers covering the altitude range from 0 to 33 km. For the 1970-1982 period, the main findings indicate an overall negative trend in ozonesonde data in the lower stratosphere (15-21 km) of about -0.5 percent per year, and some evidence of a positive trend in the troposphere (0-5 km) of about 0.8 percent per year. An in-depth sensitivity study of the trend estimates is performed with respect to various correction procedures used to normalize ozonesonde readings to Dobson total ozone measurements. The main results indicate that the negative trend findings in the 15- to 21-km altitude region are robust to the normalization procedures considered.
Web-Based Statistical Sampling and Analysis
Quinn, Anne; Larson, Karen
2016-01-01
Consistent with the Common Core State Standards for Mathematics (CCSSI 2010), the authors write that they have asked students to do statistics projects with real data. To obtain real data, their students use the free Web-based app, Census at School, created by the American Statistical Association (ASA) to help promote civic awareness among school…
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and ...
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
. The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs...... has much to offer in analyzing the policy process....
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Hoffmeyer, P.
Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distributions types have been investigated: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull....... The statistical fits have generally been made using all data (100%) and the lower tail (30%) of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. 8 different databases are analysed. The results show that 2......-parameter Weibull (and Normal) distributions give the best fits to the data available, especially if tail fits are used whereas the LogNormal distribution generally gives poor fit and larger coefficients of variation, especially if tail fits are used....
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
Energy Technology Data Exchange (ETDEWEB)
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
Key Performance Indicators for SAS Flights
Arhall, Johanna; Cox, Emmie
2013-01-01
Revenue management is a thoroughly researched field of study and it is widely used in several different industries. The Revenue Management Department at the airline SAS (Scandinavian Airline System) serves to maximise the profit of the company’s flights. At their disposal they have a number of tools, which use KPIs (Key Performance Indicators) as a measurement. The KPIs are used in prognosis to determine future initiatives, and to analyse and verify results. SAS does not know if the KPIs they...
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Statistical Analysis Of Data Sets Legislative Type
Directory of Open Access Journals (Sweden)
Gheorghe Săvoiu
2013-06-01
Full Text Available This paper identifies some characteristic statistical aspects of the annual legislation’s dynamics and structure in the socio-economic system that had defined Romania, over the last two decades. After a brief introduction devoted to the concepts of social and economic system (SES and societal computerized management (SCM in Romania, first section describes the indicators, the specific database and the investigative method and a second section presents some descriptive statistics on the suggestive abnormality of the data series on the legislation of the last 20 years. A final remark underlines the difficult context of Romania’s legislative adjustment to EU requirements.
Statistical analysis of protein kinase specificity determinants
DEFF Research Database (Denmark)
Kreegipuu, Andres; Blom, Nikolaj; Brunak, Søren;
1998-01-01
The site and sequence specificity of protein kinase, as well as the role of the secondary structure and surface accessibility of the phosphorylation sites on substrate proteins, was statistically analyzed. The experimental data were collected from the literature and are available on the World Wide...
Statistical Analysis in Dental Research Papers.
1983-08-08
Clinical Trials of Agents used in the Prevention and Treatment of Periodontal Diseases (5). Standards sumary data must be included statistical methods...situations under clinical conditions. Research: qualitative research, as in joint tomography , where the results and conclusions are not amenable to
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ′P′ value, explain the importance of ′confidence intervals′ and clarify the importance of including both values in a paper
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper PMID:25878958
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
Fundamentals of statistical experimental design and analysis
Easterling, Robert G
2015-01-01
Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
SAS doctors career progression survey 2013.
Oroz, Carlos; Sands, Lorna R; Lee, John
2016-03-01
We conducted a national survey of Staff, Associate Specialists and Specialty (SAS) doctors working in sexual health clinics in the UK in 2013 in order to explore their career progression. The aim of the survey was to assess SAS doctors' experience in passing through the thresholds and to gather information about the adherence by SAS doctors and employers to the terms and conditions of service laid out by the new 2008 contract. Out of 185 responders, whom the authors estimate comprise 34% of the total workforce, 159 were on the new contract. Of those, most SAS doctors were women (84%), the majority (67%) worked less than nine programmed activities per week; only a few had intentions to join the consultant grade (15%), and a considerable minority (26%) were older than 54 years of age and likely to retire in the next ten years. The survey showed that most participating SAS doctors had undergone appraisal in the previous 15 months (90%), most had a job planning discussion (83%) with their employer and most had some allocated time for supporting professional activities (86%). However, a significant minority had no appraisal (10%), no job planning discussion (17%) and had no allocated supporting professional activities (14%), which allows time for career development in the specialty. Most SAS doctors, who had the opportunity, had progressed through the thresholds automatically (88%); some experienced difficulties in passing (8%) and only a few did not pass (4%). SAS doctors must ensure that they work together with their employer in order to improve adherence to the terms and conditions of service of the contract, which allow for career progression and benefit both the individual doctors and ultimately service provision.
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ 2 showed the results in favour of the data collected from the experiment and this has been shown on probability charts. K value for Langmuir isotherm was 0.8582 and m value for Freundlich adsorption isotherm obtained was 0.725, both are Pearson's correlation coefficient values for Langmuir and Freundlich adsorption isotherms were obtained as 0.99 and 0.95 respectively, which show higher degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
SAS- Semantic Annotation Service for Geoscience resources on the web
Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.
2015-12-01
There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosematnic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use of the existing resources based on standard vocabularies that are encoded and published using semantic technologies.
Mayer, W. F.
1975-01-01
The experiment section of the Small Astronomy Satellite-3 (SAS-3) launched in May 1975 is an X-ray observatory intended to determine the location of bright X-ray sources to an accuracy of 15 arc-seconds; to study a selected set of sources over a wide energy range, from 0.1 to 55 keV, while performing very specific measurements of the spectra and time variability of known X-ray sources; and to monitor the sky continuously for X-ray novae, flares, and unexpected phenomena. The improvements in SAS-3 spacecraft include a clock accurate to 1 part in 10 billion, rotatable solar panels, a programmable data format, and improved nutation damper, a delayed command system, improved magnetic trim and azimuth control systems. These improvements enable SAS-3 to perform three-axis stabilized observations of any point on the celestial sphere at any time of the year. The description of the experiment section and the SAS-3 operation is followed by a synopsis of scientific results obtained from the observations of X-ray sources, such as Vela X-1 (supposed to be an accreting neutron star), a transient source of hard X-ray (less than 36 min in duration) detected by SAS-3, the Crab Nebula pulsar, the Perseus cluster of galaxies, and the Vela supernova remnant.
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Statistical analysis of life history calendar data.
Eerola, Mervi; Helske, Satu
2016-04-01
The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare the model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, we estimate instead of transition hazards the cumulative prediction probabilities of life events in the entire trajectory. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis; sequence analysis can effectively find typical and atypical life patterns while event history analysis is needed for causal inquiries.
Statistical analysis of Contact Angle Hysteresis
Janardan, Nachiketa; Panchagnula, Mahesh
2015-11-01
We present the results of a new statistical approach to determining Contact Angle Hysteresis (CAH) by studying the nature of the triple line. A statistical distribution of local contact angles on a random three-dimensional drop is used as the basis for this approach. Drops with randomly shaped triple lines but of fixed volumes were deposited on a substrate and their triple line shapes were extracted by imaging. Using a solution developed by Prabhala et al. (Langmuir, 2010), the complete three dimensional shape of the sessile drop was generated. A distribution of the local contact angles for several such drops but of the same liquid-substrate pairs is generated. This distribution is a result of several microscopic advancing and receding processes along the triple line. This distribution is used to yield an approximation of the CAH associated with the substrate. This is then compared with measurements of CAH by means of a liquid infusion-withdrawal experiment. Static measurements are shown to be sufficient to measure quasistatic contact angle hysteresis of a substrate. The approach also points towards the relationship between microscopic triple line contortions and CAH.
Book review: Statistical Analysis and Modelling of Spatial Point Patterns
DEFF Research Database (Denmark)
Møller, Jesper
2009-01-01
Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912......Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912...
Statistical Modelling of Wind Proles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Statistical methods for categorical data analysis
Powers, Daniel
2008-01-01
This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/
Statistical analysis: the need, the concept, and the usage
Directory of Open Access Journals (Sweden)
Naduvilath Thomas
1998-01-01
Full Text Available In general, better understanding of the need and usage of statistics would benefit the medical community in India. This paper explains why statistical analysis is needed, and what is the conceptual basis for it. Ophthalmic data are used as examples. The concept of sampling variation is explained to further corroborate the need for statistical analysis in medical research. Statistical estimation and testing of hypothesis which form the major components of statistical inference are construed. Commonly reported univariate and multivariate statistical tests are explained in order to equip the ophthalmologist with basic knowledge of statistics for better understanding of research data. It is felt that this understanding would facilitate well designed investigations ultimately leading to higher quality practice of ophthalmology in our country.
SAS-1 is a C2 domain protein critical for centriole integrity in C. elegans.
von Tobel, Lukas; Mikeladze-Dvali, Tamara; Delattre, Marie; Balestra, Fernando R; Blanchoud, Simon; Finger, Susanne; Knott, Graham; Müller-Reichert, Thomas; Gönczy, Pierre
2014-11-01
Centrioles are microtubule-based organelles important for the formation of cilia, flagella and centrosomes. Despite progress in understanding the underlying assembly mechanisms, how centriole integrity is ensured is incompletely understood, including in sperm cells, where such integrity is particularly critical. We identified C. elegans sas-1 in a genetic screen as a locus required for bipolar spindle assembly in the early embryo. Our analysis reveals that sperm-derived sas-1 mutant centrioles lose their integrity shortly after fertilization, and that a related defect occurs when maternal sas-1 function is lacking. We establish that sas-1 encodes a C2 domain containing protein that localizes to centrioles in C. elegans, and which can bind and stabilize microtubules when expressed in human cells. Moreover, we uncover that SAS-1 is related to C2CD3, a protein required for complete centriole formation in human cells and affected in a type of oral-facial-digital (OFD) syndrome.
SAS-1 is a C2 domain protein critical for centriole integrity in C. elegans.
Directory of Open Access Journals (Sweden)
Lukas von Tobel
2014-11-01
Full Text Available Centrioles are microtubule-based organelles important for the formation of cilia, flagella and centrosomes. Despite progress in understanding the underlying assembly mechanisms, how centriole integrity is ensured is incompletely understood, including in sperm cells, where such integrity is particularly critical. We identified C. elegans sas-1 in a genetic screen as a locus required for bipolar spindle assembly in the early embryo. Our analysis reveals that sperm-derived sas-1 mutant centrioles lose their integrity shortly after fertilization, and that a related defect occurs when maternal sas-1 function is lacking. We establish that sas-1 encodes a C2 domain containing protein that localizes to centrioles in C. elegans, and which can bind and stabilize microtubules when expressed in human cells. Moreover, we uncover that SAS-1 is related to C2CD3, a protein required for complete centriole formation in human cells and affected in a type of oral-facial-digital (OFD syndrome.
Effects of 2D and 3D Error Fields on the SAS Divertor Magnetic Topology
Trevisan, G. L.; Lao, L. L.; Strait, E. J.; Guo, H. Y.; Wu, W.; Evans, T. E.
2016-10-01
The successful design of plasma-facing components in fusion experiments is of paramount importance in both the operation of future reactors and in the modification of operating machines. Indeed, the Small Angle Slot (SAS) divertor concept, proposed for application on the DIII-D experiment, combines a small incident angle at the plasma strike point with a progressively opening slot, so as to better control heat flux and erosion in high-performance tokamak plasmas. Uncertainty quantification of the error fields expected around the striking point provides additional useful information in both the design and the modeling phases of the new divertor, in part due to the particular geometric requirement of the striking flux surfaces. The presented work involves both 2D and 3D magnetic error field analysis on the SAS strike point carried out using the EFIT code for 2D equilibrium reconstruction, V3POST for vacuum 3D computations and the OMFIT integrated modeling framework for data analysis. An uncertainty in the magnetic probes' signals is found to propagate non-linearly as an uncertainty in the striking point and angle, which can be quantified through statistical analysis to yield robust estimates. Work supported by contracts DE-FG02-95ER54309 and DE-FC02-04ER54698.
Statistical analysis of concrete quality testing results
Directory of Open Access Journals (Sweden)
Jevtić Dragica
2014-01-01
Full Text Available This paper statistically investigates the testing results of compressive strength and density of control concrete specimens tested in the Laboratory for materials, Faculty of Civil Engineering, University of Belgrade, during 2012. The total number of 4420 concrete specimens were tested, which were sampled on different locations - either on concrete production site (concrete plant, or concrete placement location (construction site. To be exact, these samples were made of concrete which was produced on 15 concrete plants, i.e. placed in at 50 different reinforced concrete structures, built during 2012 by 22 different contractors. It is a known fact that the achieved values of concrete compressive strength are very important, both for quality and durability assessment of concrete inside the structural elements, as well as for calculation of their load-bearing capacity limit. Together with the compressive strength testing results, the data concerning requested (designed concrete class, matching between the designed and the achieved concrete quality, concrete density values and frequency of execution of concrete works during 2012 were analyzed.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
Statistical Smoothing Methods and Image Analysis
1988-12-01
83 - 111. Rosenfeld, A. and Kak, A.C. (1982). Digital Picture Processing. Academic Press,Qrlando. Serra, J. (1982). Image Analysis and Mat hematical ...hypothesis testing. IEEE Trans. Med. Imaging, MI-6, 313-319. Wicksell, S.D. (1925) The corpuscle problem. A mathematical study of a biometric problem
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
2002-01-01
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
The Statistical Analysis of Failure Time Data
Kalbfleisch, John D
2011-01-01
Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns.Introduces the martingale and counting process formulation swil lbe in a new chapter.Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi Markov formulations.Presents new examples and applications of data analysis.
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology
Wolff, Hans-Georg; Preising, Katja
2005-02-01
To ease the interpretation of higher order factor analysis, the direct relationships between variables and higher order factors may be calculated by the Schmid-Leiman solution (SLS; Schmid & Leiman, 1957). This simple transformation of higher order factor analysis orthogonalizes first-order and higher order factors and thereby allows the interpretation of the relative impact of factor levels on variables. The Schmid-Leiman solution may also be used to facilitate theorizing and scale development. The rationale for the procedure is presented, supplemented by syntax codes for SPSS and SAS, since the transformation is not part of most statistical programs. Syntax codes may also be downloaded from www.psychonomic.org/archive/.
Statistic analysis of millions of digital photos
Wueller, Dietmar; Fageth, Reiner
2008-02-01
The analysis of images has always been an important aspect in the quality enhancement of photographs and photographic equipment. Due to the lack of meta data it was mostly limited to images taken by experts under predefined conditions and the analysis was also done by experts or required psychophysical tests. With digital photography and the EXIF1 meta data stored in the images, a lot of information can be gained from a semiautomatic or automatic image analysis if one has access to a large number of images. Although home printing is becoming more and more popular, the European market still has a few photofinishing companies who have access to a large number of images. All printed images are stored for a certain period of time adding up to several million images on servers every day. We have utilized the images to answer numerous questions and think that these answers are useful for increasing image quality by optimizing the image processing algorithms. Test methods can be modified to fit typical user conditions and future developments can be pointed towards ideal directions.
SAS wave experiment on board Magion 4
Directory of Open Access Journals (Sweden)
J. Błęcki
Full Text Available A short description of the SAS (subsatellite analyser of spectra wave experiment on board the Magion-4 subsatellite is given. We present first measurements of the magnetic-field fluctuations in the frequency range 32–2000 Hz obtained in the magnetotail during the disturbed period at the magnetopause and in the polar cusp.
Beijing, the SAS Gateway to China
Institute of Scientific and Technical Information of China (English)
无
2007-01-01
@@ Scandinavian Airlines has launched the direct route from Stockholm to Beijing.China's Foreign Trade had an interview with Mr. Lars Lindgren,CEO of Scandinavian Airlines International (SAS).He pointed out that they expect to see that all across Scandinavia will be opened to China,and before long Beijing will become Scandinavian Airlines' gateway to China. The follows are the interview.
EXTREME PROGRAMMING PROJECT PERFORMANCE MANAGEMENT BY STATISTICAL EARNED VALUE ANALYSIS
Wei Lu; Li Lu
2013-01-01
As an important project type of Agile Software Development, the performance evaluation and prediction for eXtreme Programming project has significant meanings. Targeting on the short release life cycle and concurrent multitask features, a statistical earned value analysis model is proposed. Based on the traditional concept of earned value analysis, the statistical earned value analysis model introduced Elastic Net regression function and Laplacian hierarchical model to construct a Bayesian El...
An analysis of radio pulsar nulling statistics
Biggs, James D.
1992-01-01
Survival analysis methods are used to seek correlations between the fraction of null pulsars and other pulsar characteristics for an ensemble of 72 radio pulsars. The strongest correlation is found between the null fraction and the pulse period, suggesting that nulling is a manifestation of a faltering emission mechanism. Correlations are also found between the fraction of null pulses and other parameters that have a strong dependence on the pulse period. The results presented here suggest that nulling is broad-band and may ultimately be explained in terms of polar cap models of pulsar emission.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools make it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORRSA. It also describes its structure and contents.
Improved statistics for genome-wide interaction analysis.
Ueki, Masao; Cordell, Heather J
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al
A statistical analysis of UK financial networks
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
STATISTICAL BAYESIAN ANALYSIS OF EXPERIMENTAL DATA.
Directory of Open Access Journals (Sweden)
AHLAM LABDAOUI
2012-12-01
Full Text Available The Bayesian researcher should know the basic ideas underlying Bayesian methodology and the computational tools used in modern Bayesian econometrics. Some of the most important methods of posterior simulation are Monte Carlo integration, importance sampling, Gibbs sampling and the Metropolis- Hastings algorithm. The Bayesian should also be able to put the theory and computational tools together in the context of substantive empirical problems. We focus primarily on recent developments in Bayesian computation. Then we focus on particular models. Inevitably, we combine theory and computation in the context of particular models. Although we have tried to be reasonably complete in terms of covering the basic ideas of Bayesian theory and the computational tools most commonly used by the Bayesian, there is no way we can cover all the classes of models used in econometrics. We propose to the user of analysis of variance and linear regression model.
Method for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques-- now updated and revised In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
QRS DETECTION OF ECG - A STATISTICAL ANALYSIS
Directory of Open Access Journals (Sweden)
I.S. Siva Rao
2015-03-01
Full Text Available Electrocardiogram (ECG is a graphical representation generated by heart muscle. ECG plays an important role in diagnosis and monitoring of heart’s condition. The real time analyzer based on filtering, beat recognition, clustering, classification of signal with maximum few seconds delay can be done to recognize the life threatening arrhythmia. ECG signal examines and study of anatomic and physiologic facets of the entire cardiac muscle. The inceptive task for proficient scrutiny is the expulsion of noise. It is attained by the use of wavelet transform analysis. Wavelets yield temporal and spectral information concurrently and offer stretchability with a possibility of wavelet functions of different properties. This paper is concerned with the extraction of QRS complexes of ECG signals using Discrete Wavelet Transform based algorithms aided with MATLAB. By removing the inconsistent wavelet transform coefficient, denoising is done in ECG signal. In continuation, QRS complexes are identified and in which each peak can be utilized to discover the peak of separate waves like P and T with their derivatives. Here we put forth a new combinatory algorithm builded on using Pan-Tompkins' method and multi-wavelet transform.
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method; Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
Explorations in Statistics: The Analysis of Ratios and Normalized Data
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
The Importance of Statistical Modeling in Data Analysis and Inference
Rollins, Derrick, Sr.
2017-01-01
Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…
Development of statistical models for data analysis
Energy Technology Data Exchange (ETDEWEB)
Downham, D.Y.
2000-07-01
Incidents that cause, or could cause, injury to personnel, and that satisfy specific criteria, are reported to the Offshore Safety Division (OSD) of the Health and Safety Executive (HSE). The underlying purpose of this report is to improve ways of quantifying risk, a recommendation in Lord Cullen's report into the Piper Alpha disaster. Records of injuries and hydrocarbon releases from 1 January, 1991, to 31 March 1996, are analysed, because the reporting of incidents was standardised after 1990. Models are identified for risk assessment and some are applied. The appropriate analyses of one or two factors (or variables) are tests of uniformity or of independence. Radar graphs are used to represent some temporal variables. Cusums are applied for the analysis of incident frequencies over time, and could be applied for regular monitoring. Log-linear models for Poisson-distributed data are identified as being suitable for identifying 'non-random' combinations of more than two factors. Some questions cannot be addressed with the available data: for example, more data are needed to assess the risk of injury per employee in a time interval. If the questions are considered sufficiently important, resources could be assigned to obtain the data. Some of the main results from the analyses are as follows: the cusum analyses identified a change-point at the end of July 1993, when the reported number of injuries reduced by 40%. Injuries were more likely to occur between 8am and 12am or between 2pm and 5pm than at other times: between 2pm and 3pm the number of injuries was almost twice the average and was more than three fold the smallest. No seasonal effects in the numbers of injuries were identified. Three-day injuries occurred more frequently on the 5th, 6th and 7th days into a tour of duty than on other days. Three-day injuries occurred less frequently on the 13th and 14th days of a tour of duty. An injury classified as 'lifting or craning' was
Practical application and statistical analysis of titrimetric monitoring ...
African Journals Online (AJOL)
Practical application and statistical analysis of titrimetric monitoring of water and ... The resulting raw data were further processed with an Excel-based program. ... As such the type of component and the concentration can be determined.
Statistical Analysis of the Exchange Rate of Bitcoin: e0133678
National Research Council Canada - National Science Library
Jeffrey Chu; Saralees Nadarajah; Stephen Chan
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar...
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Meta analysis a guide to calibrating and combining statistical evidence
Kulinskaya, Elena; Staudte, Robert G
2008-01-01
Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence acts as a source of basic methods for scientists wanting to combine evidence from different experiments. The authors aim to promote a deeper understanding of the notion of statistical evidence.The book is comprised of two parts - The Handbook, and The Theory. The Handbook is a guide for combining and interpreting experimental evidence to solve standard statistical problems. This section allows someone with a rudimentary knowledge in general statistics to apply the methods. The Theory provides the motivation, theory and results of simulation experiments to justify the methodology.This is a coherent introduction to the statistical concepts required to understand the authors' thesis that evidence in a test statistic can often be calibrated when transformed to the right scale.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Statistical multiresolution analysis in amplitude-frequency domain
Institute of Scientific and Technical Information of China (English)
SUN Hong; GUAN Bao; Henri Maitre
2004-01-01
A concept of statistical multiresolution analysis in amplitude-frequency domain is proposed, which is to employ the wavelet transform on the statistical character of a signal in amplitude domain. In terms of the theorem of generalized ergodicity, an algorithm to estimate the transform coefficients based on the amplitude statistical multiresolution analysis (AMA) is presented. The principle of applying the AMA to Synthetic Aperture Radar (SAR) image processing is described, and the good experimental results imply that the AMA is an efficient tool for processing of speckled signals modeled by the multiplicative noise.
[Generating person-years and calculating SMR using SAS: a simple program for exact calculations].
Marchand, J-L
2010-10-01
The computation of standardized incidence/mortality ratios or Poisson regression requires the calculation of person-years generated in a cohort. Softwares can do that, but SAS users still need to program this step themselves. Various algorithms were published previously, but they do not perform exact calculations: the present paper describes a simple program, which creates exact person-years, and computes SMRs. This program provides a referenced tool to perform this analysis in a cohort, with SAS or another language (the algorithm used can be easily adapted). Copyright © 2010 Elsevier Masson SAS. All rights reserved.
The SAS-3 power and thermal systems
Sullivan, R. M.; Hogrefe, A. F.; Brenza, P. T.
1975-01-01
Solar array configurations of the SAS-3 are described: a configuration with two sets of coplanar panels in the horizontal and two others in the vertical position, and two other configurations with either four horizontal or four vertical sets of panels. The nickel-cadmium battery of the power subsystem is described in detail, with emphasis on voltage limits and charge-discharge characteristics. The characteristic of 'solar-only' operation in the case of damage to the battery is discussed. The thermal subsystem of SAS-3 is considered, with discussions of thermal design criteria and the thermal environment. Temperature is controlled by using internal thermal louvers that regulate the rate at which the heat load from electronic equipment is transmitted to the outer surface for dumping to space.
Basic statistical tools in research and data analysis
Ali, Zulfiqar; Bhaskar, S Bala
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Basic statistical tools in research and data analysis.
Ali, Zulfiqar; Bhaskar, S Bala
2016-09-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Basic statistical tools in research and data analysis
Directory of Open Access Journals (Sweden)
Zulfiqar Ali
2016-01-01
Full Text Available Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
Fitting polytomous Rasch models in SAS
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
The item parameters of a polytomous Rasch model can be estimated using marginal and conditional approaches. This paper describes how this can be done in SAS (V8.2) for three item parameter estimation procedures: marginal maximum likelihood estimation, conditional maximum likelihood estimation......, and pairwise conditional estimation. The use of the procedures for extensions of the Rasch model is also discussed. The accuracy of the methods are evaluated using a simulation study....
Nonparametric statistical inference
Gibbons, Jean Dickinson
2014-01-01
Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.
D'Souza, Malcolm J; Kashmar, Richard J; Hurst, Kent; Fiedler, Frank; Gross, Catherine E; Deol, Jasbir K; Wilson, Alora
Wesley College is a private, primarily undergraduate minority-serving institution located in the historic district of Dover, Delaware (DE). The College recently revised its baccalaureate biological chemistry program requirements to include a one-semester Physical Chemistry for the Life Sciences course and project-based experiential learning courses using instrumentation, data-collection, data-storage, statistical-modeling analysis, visualization, and computational techniques. In this revised curriculum, students begin with a traditional set of biology, chemistry, physics, and mathematics major core-requirements, a geographic information systems (GIS) course, a choice of an instrumental analysis course or a statistical analysis systems (SAS) programming course, and then, students can add major-electives that further add depth and value to their future post-graduate specialty areas. Open-sourced georeferenced census, health and health disparity data were coupled with GIS and SAS tools, in a public health surveillance system project, based on US county zip-codes, to develop use-cases for chronic adult obesity where income, poverty status, health insurance coverage, education, and age were categorical variables. Across the 48 contiguous states, obesity rates are found to be directly proportional to high poverty and inversely proportional to median income and educational achievement. For the State of Delaware, age and educational attainment were found to be limiting obesity risk-factors in its adult population. Furthermore, the 2004-2010 obesity trends showed that for two of the less densely populated Delaware counties; Sussex and Kent, the rates of adult obesity were found to be progressing at much higher proportions when compared to the national average.
Comparative assessment of SAS and DES turbulence modeling for massively separated flows
Zheng, Weilin; Yan, Chao; Liu, Hongkang; Luo, Dahai
2016-02-01
Numerical studies of the flow past a circular cylinder at Reynolds number 1.4× 105 and NACA0021 airfoil at the angle of attack 60° have been carried out by scale-adaptive simulation (SAS) and detached eddy simulation (DES), in comparison with the existing experimental data. The new version of the model developed by Egorov and Menter is assessed, and advantages and disadvantages of the SAS simulation are analyzed in detail to provide guidance for industrial application in the future. Moreover, the mechanism of the scale-adaptive characteristics in separated regions is discussed, which is obscure in previous analyses. It is concluded that: the mean flow properties satisfactorily agree with the experimental results for the SAS simulation, although the prediction of the second order turbulent statistics in the near wake region is just reasonable. The SAS model can produce a larger magnitude of the turbulent kinetic energy in the recirculation bubble, and, consequently, a smaller recirculation region and a more rapid recovery of the mean velocity outside the recirculation region than the DES approach with the same grid resolution. The vortex shedding is slightly less irregular with the SAS model than with the DES approach, probably due to the higher dissipation of the SAS simulation under the condition of the coarse mesh.
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obta
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic t
Statistical Analysis of Processes of Bankruptcy is in Ukraine
Berest Marina Nikolaevna
2012-01-01
The statistical analysis of processes of bankruptcy in Ukraine is conducted. Quantitative and high-quality indexes, characterizing efficiency of functioning of institute of bankruptcy of enterprises, are analyzed; the analysis of processes, related to bankruptcy of enterprises, being in state administration is conducted.
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
A Divergence Statistics Extension to VTK for Performance Analysis.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
A Divergence Statistics Extension to VTK for Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint
A novel statistic for genome-wide interaction analysis.
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
2010-09-23
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDRanalysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.
Chen, Vivian Yi-Ju; Yang, Tse-Chuan
2012-08-01
An increasing interest in exploring spatial non-stationarity has generated several specialized analytic software programs; however, few of these programs can be integrated natively into a well-developed statistical environment such as SAS. We not only developed a set of SAS macro programs to fill this gap, but also expanded the geographically weighted generalized linear modeling (GWGLM) by integrating the strengths of SAS into the GWGLM framework. Three features distinguish our work. First, the macro programs of this study provide more kernel weighting functions than the existing programs. Second, with our codes the users are able to better specify the bandwidth selection process compared to the capabilities of existing programs. Third, the development of the macro programs is fully embedded in the SAS environment, providing great potential for future exploration of complicated spatially varying coefficient models in other disciplines. We provided three empirical examples to illustrate the use of the SAS macro programs and demonstrated the advantages explained above.
Complexity of software trustworthiness and its dynamical statistical analysis methods
Institute of Scientific and Technical Information of China (English)
ZHENG ZhiMing; MA ShiLong; LI Wei; JIANG Xin; WEI Wei; MA LiLi; TANG ShaoTing
2009-01-01
Developing trusted softwares has become an important trend and a natural choice in the development of software technology and applications.At present,the method of measurement and assessment of software trustworthiness cannot guarantee safe and reliable operations of software systems completely and effectively.Based on the dynamical system study,this paper interprets the characteristics of behaviors of software systems and the basic scientific problems of software trustworthiness complexity,analyzes the characteristics of complexity of software trustworthiness,and proposes to study the software trustworthiness measurement in terms of the complexity of software trustworthiness.Using the dynamical statistical analysis methods,the paper advances an invariant-measure based assessment method of software trustworthiness by statistical indices,and hereby provides a dynamical criterion for the untrustworthiness of software systems.By an example,the feasibility of the proposed dynamical statistical analysis method in software trustworthiness measurement is demonstrated using numerical simulations and theoretical analysis.
Yang, Lei; Sun, Zhen; Zu, Yuangang; Zhao, Chunjian; Sun, Xiaowei; Zhang, Zhonghua; Zhang, Lin
2012-05-01
The objective of the study was to prepare ursolic acid (UA) nanoparticles using the supercritical anti-solvent (SAS) process and evaluate its physicochemical properties and oral bioavailability. The effects of four process variables, pressure, temperature, drug concentration and drug solution flow rate, on drug particle formation during SAS process, were investigated. Particles with mean particle size ranging from 139.2±19.7 to 1039.8±65.2nm were obtained by varying the process parameters. The UA was characterised by scanning electron microscopy, X-ray diffraction, Fourier-transform infrared spectroscopy, thermal gravimetric analysis, specific surface area, dissolution test and bioavailability test. It was concluded that physicochemical properties and bioavailability of crystalline UA could be improved by physical modification, such as particle size reduction and generation of amorphous state using SAS process. Further, SAS process was a powerful methodology for improving the physicochemical properties and bioavailability of UA.
Towards proper sampling and statistical analysis of defects
Directory of Open Access Journals (Sweden)
Cetin Ali
2014-06-01
Full Text Available Advancements in applied statistics with great relevance to defect sampling and analysis are presented. Three main issues are considered; (i proper handling of multiple defect types, (ii relating sample data originating from polished inspection surfaces (2D to finite material volumes (3D, and (iii application of advanced extreme value theory in statistical analysis of block maximum data. Original and rigorous, but practical mathematical solutions are presented. Finally, these methods are applied to make prediction regarding defect sizes in a steel alloy containing multiple defect types.
2012-01-01
Colombian Safe Guide S.A.S. es una compañía de servicios de asesoramiento y acompañamiento a los visitantes extranjeros que vienen a la ciudad de Bogotá. Funciona a través de una página web, que facilitan a sus clientes la obtención de todos los servicios necesarios durante su estadía, en un solo lugar. El principal objetivo es proporcionar nuevas alternativas de servicios de asesoramiento, capitalizando la compañía bajo el cumplimiento del objetivo y cuota del mercado propuesta por la compañ...
The SAS-3 attitude control system
Mobley, F. F.; Konigsberg, R.; Fountain, G. H.
1975-01-01
SAS-3 uses a reaction wheel to provide torque to control the spin rate. If the wheel speed becomes too great or too small, it must be restored to its nominal rate by momentum dumping which is done by magnetic torquing against the earth's magnetic field by the satellite's magnetic coils. A small rate-integrating gyro is used to sense the spin rate so that closed loop control of the spin rate can be achieved. These various systems are described in detail including the reaction wheel system, the gyro system, along with control modes (spin rate control and the star lock mode).
Adaptive strategy for the statistical analysis of connectomes.
Directory of Open Access Journals (Sweden)
Djalel Eddine Meskaldji
Full Text Available We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes. Our approach is at a middle level between a global analysis and single connections analysis by considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or by the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, that characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with the one based on the individual measures in terms of power. We show that this strategy has a great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with a 22q11.2 deletion syndrome, distinguished by their IQ scores.
Statistical Error analysis of Nucleon-Nucleon phenomenological potentials
Perez, R Navarro; Arriola, E Ruiz
2014-01-01
Nucleon-Nucleon potentials are commonplace in nuclear physics and are determined from a finite number of experimental data with limited precision sampling the scattering process. We study the statistical assumptions implicit in the standard least squares fitting procedure and apply, along with more conventional tests, a tail sensitive quantile-quantile test as a simple and confident tool to verify the normality of residuals. We show that the fulfilment of normality tests is linked to a judicious and consistent selection of a nucleon-nucleon database. These considerations prove crucial to a proper statistical error analysis and uncertainty propagation. We illustrate these issues by analyzing about 8000 proton-proton and neutron-proton scattering published data. This enables the construction of potentials meeting all statistical requirements necessary for statistical uncertainty estimates in nuclear structure calculations.
Data analysis using the Gnu R system for statistical computation
Energy Technology Data Exchange (ETDEWEB)
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
SAS-Expander and Its Realization%SAS-Expander及其实现
Institute of Scientific and Technical Information of China (English)
王宇
2009-01-01
SAS(Serial Attached SCSI)即串行SCSI技术,是一种磁盘连接技术.而Expander本质上就是SAS交换机,可以将多个SAS连接到有限数量的主机端口上.这里主要介绍了通过MAX-IM公司的VSC7154实现SAS-Expander磁盘阵列柜.此阵列柜机构部件依据SBB规范实现.
Direct binding of SAS-6 to ZYG-1 recruits SAS-6 to the mother centriole for cartwheel assembly.
Lettman, Molly M; Wong, Yao Liang; Viscardi, Valeria; Niessen, Sherry; Chen, Sheng-Hong; Shiau, Andrew K; Zhou, Huilin; Desai, Arshad; Oegema, Karen
2013-05-13
Assembly of SAS-6 dimers to form the centriolar cartwheel requires the ZYG-1/Plk4 kinase. Here, we show that ZYG-1 recruits SAS-6 to the mother centriole independently of its kinase activity; kinase activity is subsequently required for cartwheel assembly. We identify a direct interaction between ZYG-1 and the SAS-6 coiled coil that explains its kinase activity-independent function in SAS-6 recruitment. Perturbing this interaction, or the interaction between an adjacent segment of the SAS-6 coiled coil and SAS-5, prevented SAS-6 recruitment and cartwheel assembly. SAS-6 mutants with alanine substitutions in a previously described ZYG-1 target site or in 37 other residues, either phosphorylated by ZYG-1 in vitro or conserved in closely related nematodes, all supported cartwheel assembly. We propose that ZYG-1 binding to the SAS-6 coiled coil recruits the SAS-6-SAS-5 complex to the mother centriole, where a ZYG-1 kinase activity-dependent step, whose target is unlikely to be SAS-6, triggers cartwheel assembly.
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Full Text Available Although great progress in genome-wide association studies (GWAS has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked. The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
be conducted using the ¿ sup2/sup distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...
Advanced Statistical and Data Analysis Tools for Astrophysics
Kashyap, V.; Scargle, Jeffrey D. (Technical Monitor)
2001-01-01
The goal of the project is to obtain, derive, and develop statistical and data analysis tools that would be of use in the analyses of high-resolution, high-sensitivity data that are becoming available with new instruments. This is envisioned as a cross-disciplinary effort with a number of collaborators.
A New Statistic for Variable Selection in Questionnaire Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Jun-hua; FANG Wei-wu
2001-01-01
In this paper, a new statistic is proposed for variable selection which is one of the important problems in analysis of questionnaire data. Contrasting to other methods, the approach introduced here can be used not only for two groups of samples but can also be easily generalized to the multi-group case.
Measures of radioactivity: a tool for understanding statistical data analysis
Montalbano, Vera
2012-01-01
A learning path on radioactivity in the last class of high school is presented. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources are monitored for days or weeks. The data are analyzed in order to understand the importance of statistical analysis in modern physics.
Statistical Analysis of Hypercalcaemia Data related to Transferability
DEFF Research Database (Denmark)
Frølich, Anne; Nielsen, Bo Friis
2005-01-01
In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been publised in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...
AstroStat - A VO Tool for Statistical Analysis
Kembhavi, Ajit K; Kale, Tejas; Jagade, Santosh; Vibhute, Ajay; Garg, Prerak; Vaghmare, Kaustubh; Navelkar, Sharmad; Agrawal, Tushar; Nandrekar, Deoyani; Shaikh, Mohasin
2015-01-01
AstroStat is an easy-to-use tool for performing statistical analysis on data. It has been designed to be compatible with Virtual Observatory (VO) standards thus enabling it to become an integral part of the currently available collection of VO tools. A user can load data in a variety of formats into AstroStat and perform various statistical tests using a menu driven interface. Behind the scenes, all analysis is done using the public domain statistical software - R and the output returned is presented in a neatly formatted form to the user. The analyses performable include exploratory tests, visualizations, distribution fitting, correlation & causation, hypothesis testing, multivariate analysis and clustering. The tool is available in two versions with identical interface and features - as a web service that can be run using any standard browser and as an offline application. AstroStat will provide an easy-to-use interface which can allow for both fetching data and performing power statistical analysis on ...
Introduction to Statistics and Data Analysis With Computer Applications I.
Morris, Carl; Rolph, John
This document consists of unrevised lecture notes for the first half of a 20-week in-house graduate course at Rand Corporation. The chapter headings are: (1) Histograms and descriptive statistics; (2) Measures of dispersion, distance and goodness of fit; (3) Using JOSS for data analysis; (4) Binomial distribution and normal approximation; (5)…
Investigation of Weibull statistics in fracture analysis of cast aluminum
Holland, F. A., Jr.; Zaretsky, E. V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodolgy based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
Investigation of Weibull statistics in fracture analysis of cast aluminum
Holland, F. A., Jr.; Zaretsky, E. V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodolgy based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
Multivariate statistical analysis of precipitation chemistry in Northwestern Spain
Energy Technology Data Exchange (ETDEWEB)
Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)
1993-07-01
149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs.
HistFitter software framework for statistical data analysis
Baak, M.; Besjes, G. J.; Côté, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-04-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface.
SPSS and SAS programs for generalizability theory analyses.
Mushquash, Christopher; O'Connor, Brian P
2006-08-01
The identification and reduction of measurement errors is a major challenge in psychological testing. Most investigators rely solely on classical test theory for assessing reliability, whereas most experts have long recommended using generalizability theory instead. One reason for the common neglect of generalizability theory is the absence of analytic facilities for this purpose in popular statistical software packages. This article provides a brief introduction to generalizability theory, describes easy to use SPSS, SAS, and MATLAB programs for conducting the recommended analyses, and provides an illustrative example, using data (N = 329) for the Rosenberg Self-Esteem Scale. Program output includes variance components, relative and absolute errors and generalizability coefficients, coefficients for D studies, and graphs of D study results.
Park, Junsung; Park, Hee Jun; Cho, Wonkyung; Cha, Kwang-Ho; Kang, Young-Shin; Hwang, Sung-Joo
2010-08-30
The aim of this study was to investigate the effects of micronization and amorphorization of cefdinir on solubility and dissolution rate. The amorphous samples were prepared by spray-drying (SD) and supercritical anti-solvent (SAS) process, respectively and their amorphous natures were confirmed by DSC, PXRD and FT-IR. Thermal gravimetric analysis was performed by TGA. SEM was used to investigate the morphology of particles and the processed particle had a spherical shape, while the unprocessed crystalline particle had a needle-like shape. The mean particle size and specific surface area were measured by dynamic light scattering (DLS) and BET, respectively. The DLS result showed that the SAS-processed particle was the smallest, followed by SD and the unprocessed cefdinir. The BET result was the same as DLS result in that the SAS-processed particle had the largest surface area. Therefore, the processed cefdinir, especially the SAS-processed particle, appeared to have enhanced apparent solubility, improved intrinsic dissolution rate and better drug release when compared with SD-processed and unprocessed crystalline cefdinir due not only to its amorphous nature, but also its reduced particle size. Conclusions were that the solubility and dissolution rate of crystalline cefdinir could be improved by physically modifying the particles using SD and SAS-process. Furthermore, SAS-process was a powerful methodology for improving the solubility and dissolution rate of cefdinir.
Common pitfalls in statistical analysis: Linear regression analysis.
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
Common pitfalls in statistical analysis: Linear regression analysis
Directory of Open Access Journals (Sweden)
Rakesh Aggarwal
2017-01-01
Full Text Available In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
Statistical analysis of absorptive laser damage in dielectric thin films
Energy Technology Data Exchange (ETDEWEB)
Budgor, A.B.; Luria-Budgor, K.F.
1978-09-11
The Weibull distribution arises as an example of the theory of extreme events. It is commonly used to fit statistical data arising in the failure analysis of electrical components and in DC breakdown of materials. This distribution is employed to analyze time-to-damage and intensity-to-damage statistics obtained when irradiating thin film coated samples of SiO/sub 2/, ZrO/sub 2/, and Al/sub 2/O/sub 3/ with tightly focused laser beams. The data used is furnished by Milam. The fit to the data is excellent; and least squared correlation coefficients greater than 0.9 are often obtained.
Comparison of Statistical Models for Regional Crop Trial Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Qun-yuan; KONG Fan-ling
2002-01-01
Based on the review and comparison of main statistical analysis models for estimating varietyenvironment cell means in regional crop trials, a new statistical model, LR-PCA composite model was proposed, and the predictive precision of these models were compared by cross validation of an example data. Results showed that the order of model precision was LR-PCA model ＞ AMMI model ＞ PCA model ＞ Treatment Means (TM) model ＞ Linear Regression (LR) model ＞ Additive Main Effects ANOVA model. The precision gain factor of LR-PCA model was 1.55, increasing by 8.4% compared with AMMI.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Banerjee, Anirban
2016-01-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
[Some basic aspects in statistical analysis of visual acuity data].
Ren, Ze-Qin
2007-06-01
All visual acuity charts used currently have their own shortcomings. Therefore, it is difficult for ophthalmologists to evaluate visual acuity data. Many problems present in the use of statistical methods for handling visual acuity data in clinical research. The quantitative relationship between visual acuity and visual angle varied in different visual acuity charts. The type of visual acuity and visual angle are different from each other. Therefore, different statistical methods should be used for different data sources. A correct understanding and analysis of visual acuity data could be obtained only after the elucidation of these aspects.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2017-09-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-01
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
Composites reinforcement by rods: a SAS study
Energy Technology Data Exchange (ETDEWEB)
Urban, V. [ESRF, BP220, 38043 Grenoble Cedex (France); Botti, A.; Pyckhout-Hintzen, W.; Richter, D. [IFF-Forschungszentrum Juelich, 52425 Juelich (Germany); Straube, E. [University of Halle, FB Physik, 06099 Halle (Germany)
2002-07-01
The mechanical properties of composites are governed by size, shape and dispersion degree of so-called reinforcing particles. Polymeric fillers based on thermodynamically driven microphase separation of block copolymers offer the opportunity to study a model system of controlled rod-like filler particles. We chose a triblock copolymer (PBPSPB) and carried out SAS measurements with both X-rays and neutrons, in order to characterize separately the hard phase and the cross-linked PB matrix. The properties of the material depend strongly on the way that stress is carried and transferred between the soft matrix and the hard fibers. The failure of the strain-amplification concept and the change of topological contributions to the free energy and scattering factor have to be addressed. In this respect the composite shows a similarity to a two-network system, i.e. interpenetrating rubber and rod-like filler networks. (orig.)
Composites reinforcement by rods a SAS study
Urban, V; Pyckhout-Hintzen, W; Richter, D; Straube, E
2002-01-01
The mechanical properties of composites are governed by size, shape and dispersion degree of so-called reinforcing particles. Polymeric fillers based on thermodynamically driven microphase separation of block copolymers offer the opportunity to study a model system of controlled rod-like filler particles. We chose a triblock copolymer (PBPSPB) and carried out SAS measurements with both X-rays and neutrons, in order to characterize separately the hard phase and the cross-linked PB matrix. The properties of the material depend strongly on the way that stress is carried and transferred between the soft matrix and the hard fibers. The failure of the strain-amplification concept and the change of topological contributions to the free energy and scattering factor have to be addressed. In this respect the composite shows a similarity to a two-network system, i.e. interpenetrating rubber and rod-like filler networks. (orig.)
Statistical Analysis of SAR Sea Clutter for Classification Purposes
Directory of Open Access Journals (Sweden)
Jaime Martín-de-Nicolás
2014-09-01
Full Text Available Statistical analysis of radar clutter has always been one of the topics, where more effort has been put in the last few decades. These studies were usually focused on finding the statistical models that better fitted the clutter distribution; however, the goal of this work is not the modeling of the clutter, but the study of the suitability of the statistical parameters to carry out a sea state classification. In order to achieve this objective and provide some relevance to this study, an important set of maritime and coastal Synthetic Aperture Radar data is considered. Due to the nature of the acquisition of data by SAR sensors, speckle noise is inherent to these data, and a specific study of how this noise affects the clutter distribution is also performed in this work. In pursuit of a sense of wholeness, a thorough study of the most suitable statistical parameters, as well as the most adequate classifier is carried out, achieving excellent results in terms of classification success rates. These concluding results confirm that a sea state classification is not only viable, but also successful using statistical parameters different from those of the best modeling distribution and applying a speckle filter, which allows a better characterization of the parameters used to distinguish between different sea states.
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-06
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the 'beta-surrogate' method.
Statistical analysis of the precision of the Match method
Directory of Open Access Journals (Sweden)
R. Lehmann
2005-05-01
Full Text Available The Match method quantifies chemical ozone loss in the polar stratosphere. The basic idea consists in calculating the forward trajectory of an air parcel that has been probed by an ozone measurement (e.g., by an ozone sonde or satellite and finding a second ozone measurement close to this trajectory. Such an event is called a ''match''. A rate of chemical ozone destruction can be obtained by a statistical analysis of several tens of such match events. Information on the uncertainty of the calculated rate can be inferred from the scatter of the ozone mixing ratio difference (second measurement minus first measurement associated with individual matches. A standard analysis would assume that the errors of these differences are statistically independent. However, this assumption may be violated because different matches can share a common ozone measurement, so that the errors associated with these match events become statistically dependent. Taking this effect into account, we present an analysis of the uncertainty of the final Match result. It has been applied to Match data from the Arctic winters 1995, 1996, 2000, and 2003. For these ozone-sonde Match studies the effect of the error correlation on the uncertainty estimates is rather small: compared to a standard error analysis, the uncertainty estimates increase by 15% on average. However, the effect is more pronounced for typical satellite Match analyses: for an Antarctic satellite Match study (2003, the uncertainty estimates increase by 60% on average.
Recommended methods for statistical analysis of data containing less-than-detectable measurements
Energy Technology Data Exchange (ETDEWEB)
Atwood, C.L.; Blackwood, L.G.; Harris, G.A.; Loehr, C.A.
1990-09-01
This report is a manual for statistical workers dealing with environmental measurements, when some of the measurements are not given exactly but are only reported as less than detectable. For some statistical settings with such data, many methods have been proposed in the literature, while for others few or none have been proposed. This report gives a recommended method in each of the settings considered. The body of the report gives a brief description of each recommended method. Appendix A gives example programs using the statistical package SAS, for those methods that involve nonstandard methods. Appendix B presents the methods that were compared and the reasons for selecting each recommended method, and explains any fine points that might be of interest. This is an interim version. Future revisions will complete the recommendations. 34 refs., 2 figs., 11 tabs.
Recommended methods for statistical analysis of data containing less-than-detectable measurements
Energy Technology Data Exchange (ETDEWEB)
Atwood, C.L.; Blackwood, L.G.; Harris, G.A.; Loehr, C.A.
1991-09-01
This report is a manual for statistical workers dealing with environmental measurements, when some of the measurements are not given exactly but are only reported as less than detectable. For some statistical settings with such data, many methods have been proposed in the literature, while for others few or none have been proposed. This report gives a recommended method in each of the settings considered. The body of the report gives a brief description of each recommended method. Appendix A gives example programs using the statistical package SAS, for those methods that involve nonstandard methods. Appendix B presents the methods that were compared and the reasons for selecting each recommended method, and explains any fine points that might be of interest. 7 refs., 4 figs.
Towards Advanced Data Analysis by Combining Soft Computing and Statistics
Gil, María; Sousa, João; Verleysen, Michel
2013-01-01
Soft computing, as an engineering science, and statistics, as a classical branch of mathematics, emphasize different aspects of data analysis. Soft computing focuses on obtaining working solutions quickly, accepting approximations and unconventional approaches. Its strength lies in its flexibility to create models that suit the needs arising in applications. In addition, it emphasizes the need for intuitive and interpretable models, which are tolerant to imprecision and uncertainty. Statistics is more rigorous and focuses on establishing objective conclusions based on experimental data by analyzing the possible situations and their (relative) likelihood. It emphasizes the need for mathematical methods and tools to assess solutions and guarantee performance. Combining the two fields enhances the robustness and generalizability of data analysis methods, while preserving the flexibility to solve real-world problems efficiently and intuitively.
Statistical Analysis of Ship Collisions with Bridges in China Waterway
Institute of Scientific and Technical Information of China (English)
DAI Tong-yu; NIE Wu; LIU Ying-jie; WANG Li-ping
2002-01-01
Having carried out investigations on ship collision accidents with bridges in waterway in China, a database of ship collision with bridge (SCB) is developed in this paper. It includes detailed information about more than 200 accidents near ship's waterways in the last four decades, in which ships collided with the bridges. Based on the information a statistical analysis is presented tentatively. The increase in frequency of ship collision with bridges appears, and the accident quantity of the barge system is more than that of single ship. The main reason of all the factors for ship collision with bridge is the human errors, which takes up 70%. The quantity of the accidents happened during flooding period shows over 3～6 times compared with the period from March to June in a year. The probability follows the normal distribution according to statistical analysis. Visibility, span between piers also have an effect on the frequency of the accidents.
Collagen morphology and texture analysis: from statistics to classification
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-07-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage.
GNSS Spoofing Detection Based on Signal Power Measurements: Statistical Analysis
Directory of Open Access Journals (Sweden)
V. Dehghanian
2012-01-01
Full Text Available A threat to GNSS receivers is posed by a spoofing transmitter that emulates authentic signals but with randomized code phase and Doppler values over a small range. Such spoofing signals can result in large navigational solution errors that are passed onto the unsuspecting user with potentially dire consequences. An effective spoofing detection technique is developed in this paper, based on signal power measurements and that can be readily applied to present consumer grade GNSS receivers with minimal firmware changes. An extensive statistical analysis is carried out based on formulating a multihypothesis detection problem. Expressions are developed to devise a set of thresholds required for signal detection and identification. The detection processing methods developed are further manipulated to exploit incidental antenna motion arising from user interaction with a GNSS handheld receiver to further enhance the detection performance of the proposed algorithm. The statistical analysis supports the effectiveness of the proposed spoofing detection technique under various multipath conditions.
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), yield usually high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to receive tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.
2012-04-30
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
Statistical analysis on reliability and serviceability of caterpillar tractor
Institute of Scientific and Technical Information of China (English)
WANG Jinwu; LIU Jiafu; XU Zhongxiang
2007-01-01
For further understanding reliability and serviceability of tractor and to furnish scientific and technical theories, based on the promotion and application of it, the following experiments and statistical analysis on reliability (reliability and MTBF) serviceability (service and MTTR) of Donfanghong-1002 and Dongfanghong-802 were conducted. The result showed that the intervals of average troubles of these two tractors were 182.62 h and 160.2 h, respectively, and the weakest assembly of them was engine part.
Common pitfalls in statistical analysis: Odds versus risk
Ranganathan, Priya; Aggarwal, Rakesh; Pramesh, C. S.
2015-01-01
In biomedical research, we are often interested in quantifying the relationship between an exposure and an outcome. “Odds” and “Risk” are the most common terms which are used as measures of association between variables. In this article, which is the fourth in the series of common pitfalls in statistical analysis, we explain the meaning of risk and odds and the difference between the two. PMID:26623395
Statistical Analysis of the Exchange Rate of Bitcoin
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
Statistical Analysis of the Exchange Rate of Bitcoin.
Directory of Open Access Journals (Sweden)
Jeffrey Chu
Full Text Available Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Hilbert, Manuel; Erat, Michèle C; Hachet, Virginie; Guichard, Paul; Blank, Iris D; Flückiger, Isabelle; Slater, Leanne; Lowe, Edward D; Hatzopoulos, Georgios N; Steinmetz, Michel O; Gönczy, Pierre; Vakonakis, Ioannis
2013-07-09
Centrioles are evolutionary conserved organelles that give rise to cilia and flagella as well as centrosomes. Centrioles display a characteristic ninefold symmetry imposed by the spindle assembly abnormal protein 6 (SAS-6) family. SAS-6 from Chlamydomonas reinhardtii and Danio rerio was shown to form ninefold symmetric, ring-shaped oligomers in vitro that were similar to the cartwheels observed in vivo during early steps of centriole assembly in most species. Here, we report crystallographic and EM analyses showing that, instead, Caenorhabotis elegans SAS-6 self-assembles into a spiral arrangement. Remarkably, we find that this spiral arrangement is also consistent with ninefold symmetry, suggesting that two distinct SAS-6 oligomerization architectures can direct the same output symmetry. Sequence analysis suggests that SAS-6 spirals are restricted to specific nematodes. This oligomeric arrangement may provide a structural basis for the presence of a central tube instead of a cartwheel during centriole assembly in these species.
Statistical methods of SNP data analysis with applications
Bulinski, Alexander; Shashkin, Alexey; Yaskov, Pavel
2011-01-01
Various statistical methods important for genetic analysis are considered and developed. Namely, we concentrate on the multifactor dimensionality reduction, logic regression, random forests and stochastic gradient boosting. These methods and their new modifications, e.g., the MDR method with "independent rule", are used to study the risk of complex diseases such as cardiovascular ones. The roles of certain combinations of single nucleotide polymorphisms and external risk factors are examined. To perform the data analysis concerning the ischemic heart disease and myocardial infarction the supercomputer SKIF "Chebyshev" of the Lomonosov Moscow State University was employed.
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
SAS program for quantitative stratigraphic correlation by principal components
Hohn, M.E.
1985-01-01
A SAS program is presented which constructs a composite section of stratigraphic events through principal components analysis. The variables in the analysis are stratigraphic sections and the observational units are range limits of taxa. The program standardizes data in each section, extracts eigenvectors, estimates missing range limits, and computes the composite section from scores of events on the first principal component. Provided is an option of several types of diagnostic plots; these help one to determine conservative range limits or unrealistic estimates of missing values. Inspection of the graphs and eigenvalues allow one to evaluate goodness of fit between the composite and measured data. The program is extended easily to the creation of a rank-order composite. ?? 1985.
Statistical Analysis of 30 Years Rainfall Data: A Case Study
Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.
2017-07-01
Rainfall is a prime input for various engineering design such as hydraulic structures, bridges and culverts, canals, storm water sewer and road drainage system. The detailed statistical analysis of each region is essential to estimate the relevant input value for design and analysis of engineering structures and also for crop planning. A rain gauge station located closely in Trichy district is selected for statistical analysis where agriculture is the prime occupation. The daily rainfall data for a period of 30 years is used to understand normal rainfall, deficit rainfall, Excess rainfall and Seasonal rainfall of the selected circle headquarters. Further various plotting position formulae available is used to evaluate return period of monthly, seasonally and annual rainfall. This analysis will provide useful information for water resources planner, farmers and urban engineers to assess the availability of water and create the storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall was calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of monsoon results which were used for land preparation and sowing.
HistFitter: a flexible framework for statistical data analysis
Besjes, G J; Côté, D; Koutsman, A; Lorenz, J M; Short, D
2015-01-01
HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de-facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms.HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configura...
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismiciry is studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes in different magnitude levels. It is known so far that the interoccurrence time statistics is well approximated by the Weibull distribution, but the more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. Firstly, we propose the Embedding Equation Theory (EET), where the conditional probability is described by two kinds of correlation coefficients; one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from the numerical data-analysis carrying out with the Preliminary Determination of Epicenter (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Secondly, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results well reproduce all numerical data in our analysis, where several common features or the invariant aspects are clearly observed. Especially in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve, furthermore in the case of non-stationary (moving time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving times. It is emphasized that the multi-fractal relation plays an important role to unify the statistical laws of seismicity: actually the Gutenberg-Richter law and the Weibull distribution are unified in the multi-fractal relation, and some universality conjectures regarding the seismicity are briefly discussed.
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: Black-Right-Pointing-Pointer The paper discusses the validation of creep rupture models derived from statistical analysis. Black-Right-Pointing-Pointer It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. Black-Right-Pointing-Pointer The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. Black-Right-Pointing-Pointer The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
Directory of Open Access Journals (Sweden)
Qiaorui Si
2014-02-01
Full Text Available This paper presents an investigation of pressure fluctuation of a single-suction volute-type centrifugal pump, particularly volute casing, by using numerical and experimental methods. A new type of hybrid Reynolds-averaged Navier-Stokes/Large Eddy Simulation, referred to as the shear stress transport-scale-adaptive simulation (SAS model, is employed to study the unsteady flow. Statistical analysis method is adopted to show the pressure fluctuation intensity distribution in the volute channel. A test rig for pressure pulsation measurement is built to validate the numerical simulation results using eight transient pressure sensors in the middle section of the volute wall. Results show that the SAS model can accurately predict the inner flow field of centrifugal pumps. Radial force acting on the impeller presents a star distribution related to the blade number. Pressure fluctuation intensity is strongest near the tongue and shows irregular distribution in the pump casing. Pressure fluctuation is distributed symmetrically at the cross-section of the volute casing because the volute can eliminate the rotational movement of the liquid discharged from the impeller. Blade passing frequency and its multiples indicate the dominant frequency of the monitoring points within the volute, and the low-frequency pulsation, particularly in the shaft component, increases when it operates at off-design condition, particularly with a small flow rate. The reason is that the vortex wave is enhanced at the off-design condition, which has an effect on the axle and is presented in the shaft component in the frequency domain.
The Effects of Statistical Analysis Software and Calculators on Statistics Achievement
Christmann, Edwin P.
2009-01-01
This study compared the effects of microcomputer-based statistical software and hand-held calculators on the statistics achievement of university males and females. The subjects, 73 graduate students enrolled in univariate statistics classes at a public comprehensive university, were randomly assigned to groups that used either microcomputer-based…
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of in formation in numerous applications. has stimtllated an increased in terest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a de ficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimat ing the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivari ate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Self-Contained Statistical Analysis of Gene Sets
Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher
2016-01-01
Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
Agriculture, population growth, and statistical analysis of the radiocarbon record.
Zahid, H Jabran; Robinson, Erick; Kelly, Robert L
2016-01-26
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.
Statistical wind analysis for near-space applications
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18 30 km) above ground level (AGL) at two locations, Akon, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20 22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
Statistical methods for the detection and analysis of radioactive sources
Klumpp, John
We consider four topics from areas of radioactive statistical analysis in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
A Statistical Analysis of Lunisolar-Earthquake Connections
Rüegg, Christian Michael-André
2012-11-01
Despite over a century of study, the relationship between lunar cycles and earthquakes remains controversial and difficult to quantitatively investigate. Perhaps as a consequence, major earthquakes around the globe are frequently followed by "prediction claim", using lunar cycles, that generate media furore and pressure scientists to provide resolute answers. The 2010-2011 Canterbury earthquakes in New Zealand were no exception; significant media attention was given to lunar derived earthquake predictions by non-scientists, even though the predictions were merely "opinions" and were not based on any statistically robust temporal or causal relationships. This thesis provides a framework for studying lunisolar earthquake temporal relationships by developing replicable statistical methodology based on peer reviewed literature. Notable in the methodology is a high accuracy ephemeris, called ECLPSE, designed specifically by the author for use on earthquake catalogs and a model for performing phase angle analysis.
Statistical analysis of subjective preferences for video enhancement
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
ATP binding to a multisubunit enzyme: statistical thermodynamics analysis
Zhang, Yunxin
2012-01-01
Due to inter-subunit communication, multisubunit enzymes usually hydrolyze ATP in a concerted fashion. However, so far the principle of this process remains poorly understood. In this study, from the viewpoint of statistical thermodynamics, a simple model is presented. In this model, we assume that the binding of ATP will change the potential of the corresponding enzyme subunit, and the degree of this change depends on the state of its adjacent subunits. The probability of enzyme in a given state satisfies the Boltzmann's distribution. Although it looks much simple, this model can fit the recent experimental data of chaperonin TRiC/CCT well. From this model, the dominant state of TRiC/CCT can be obtained. This study provided a new way to understand biophysical processes by statistical thermodynamics analysis.
Statistical analysis of effective singular values in matrix rank determination
Konstantinides, Konstantinos; Yao, Kung
1988-01-01
A major problem in using SVD (singular-value decomposition) as a tool in determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values to the end, conference regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theories of perturbations of singular values and statistical significance test. Threshold bounds for perturbation due to finite-precision and i.i.d. random models are evaluated. In random models, the threshold bounds depend on the dimension of the matrix, the noisy variance, and predefined statistical level of significance. Results applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix are considered. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.
Statistical methods for data analysis in particle physics
Lista, Luca
2015-01-01
This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as reminder from advanced undergraduate studies, yet also in view to clearly distinguish the Frequentist versus Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as to be able to distinguish eventually between competing hypotheses or to rule out some of them altogether. Many worked examples will help newcomers to the field and graduate students to understand the pitfalls in applying theoretical concepts to actual data
[Statistical analysis of DNA sequences nearby splicing sites].
Korzinov, O M; Astakhova, T V; Vlasov, P K; Roĭtberg, M A
2008-01-01
Recognition of coding regions within eukaryotic genomes is one of oldest but yet not solved problems of bioinformatics. New high-accuracy methods of splicing sites recognition are needed to solve this problem. A question of current interest is to identify specific features of nucleotide sequences nearby splicing sites and recognize sites in sequence context. We performed a statistical analysis of human genes fragment database and revealed some characteristics of nucleotide sequences in splicing sites neighborhood. Frequencies of all nucleotides and dinucleotides in splicing sites environment were computed and nucleotides and dinucleotides with extremely high\\low occurrences were identified. Statistical information obtained in this work can be used in further development of the methods of splicing sites annotation and exon-intron structure recognition.
The NIRS Analysis Package: noise reduction and statistical inference.
Fekete, Tomer; Rubin, Denis; Carlson, Joshua M; Mujica-Parodi, Lilianne R
2011-01-01
Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise; e.g., systemic (physiological), motion-related artifacts, and serial autocorrelations, upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.
On Statistical Analysis of Neuroimages with Imperfect Registration
Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas
2016-01-01
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic, they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yields strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by an UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
SAS-2 high-energy gamma-ray observations of the Vela pulsar. II
Thompson, D. J.; Fichtel, C. E.; Kniffen, D. A.; Ogelman, H. B.
1977-01-01
Analysis of additional data from SAS-2 experiment and improvements in the orbit-attitude data and analysis procedures have produced revised values for the flux from the Vela gamma-ray source. The pulsar phase plot shows two peaks, neither of which is in phase with the single radio pulse.
Zhang, Ying; Wang, Yang; Wang, Zhi-Gang; Wang, Xi; Guo, Huo-Sheng; Meng, Dong-Fang; Wong, Po-Keung
2012-01-01
Statistical experimental designs provided by statistical analysis system (SAS) software were applied to optimize the fermentation medium composition for the production of atrazine-degrading Acinetobacter sp. DNS(32) in shake-flask cultures. A "Plackett-Burman Design" was employed to evaluate the effects of different components in the medium. The concentrations of corn flour, soybean flour, and K(2)HPO(4) were found to significantly influence Acinetobacter sp. DNS(32) production. The steepest ascent method was employed to determine the optimal regions of these three significant factors. Then, these three factors were optimized using central composite design of "response surface methodology." The optimized fermentation medium composition was composed as follows (g/L): corn flour 39.49, soybean flour 25.64, CaCO(3) 3, K(2)HPO(4) 3.27, MgSO(4)·7H(2)O 0.2, and NaCl 0.2. The predicted and verifiable values in the medium with optimized concentration of components in shake flasks experiments were 7.079 × 10(8) CFU/mL and 7.194 × 10(8) CFU/mL, respectively. The validated model can precisely predict the growth of atrazine-degraing bacterium, Acinetobacter sp. DNS(32).
Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.
Barrett, Julie A; Siegel, Jay A; Goodpaster, John V
2011-01-01
This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.
Managing Performance Analysis with Dynamic Statistical Projection Pursuit
Energy Technology Data Exchange (ETDEWEB)
Vetter, J.S.; Reed, D.A.
2000-05-22
Computer systems and applications are growing more complex. Consequently, performance analysis has become more difficult due to the complex, transient interrelationships among runtime components. To diagnose these types of performance issues, developers must use detailed instrumentation to capture a large number of performance metrics. Unfortunately, this instrumentation may actually influence the performance analysis, leading the developer to an ambiguous conclusion. In this paper, we introduce a technique for focusing a performance analysis on interesting performance metrics. This technique, called dynamic statistical projection pursuit, identifies interesting performance metrics that the monitoring system should capture across some number of processors. By reducing the number of performance metrics, projection pursuit can limit the impact of instrumentation on the performance of the target system and can reduce the volume of performance data.
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Directory of Open Access Journals (Sweden)
Walaa Ibrahim Gabr
2015-03-01
Full Text Available The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, the handling of fuzzy matrices covered determinants of fuzzy matrices, the eigenvalues of fuzzy matrices, and solving least-squares fuzzy linear equations. The approach demonstrated to be also applicable in a systematic way in handling new fuzzy probabilistic and statistical problems. This included extending the conventional probabilistic and statistical analysis for handling fuzzy random data. Application also covered the consolidity of fuzzy optimization problems. Various numerical examples solved have demonstrated that the new consolidity concept is highly effective in solving in a compact form the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the implementation of the suggested fuzzy mathematics can be easily embedded within normal mathematics through building special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.
GIS-BASED SPATIAL STATISTICAL ANALYSIS OF COLLEGE GRADUATES EMPLOYMENT
Directory of Open Access Journals (Sweden)
R. Tang
2012-07-01
Full Text Available It is urgently necessary to be aware of the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in distribution and employment status of college graduates based on the data from 2004–2008 Wuhan Municipal Human Resources and Social Security Bureau, China. Spatio-temporal distribution of employment unit were analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method via SPSS software was used to predict the employment and to identify spatially associated enterprise and professionals demand in the future. The results show that the enterprises in Wuhan east lake high and new technology development zone increased dramatically from 2004 to 2008, and tended to distributed southeastward. Furthermore, the models built by statistical analysis suggest that the specialty of graduates major in has an important impact on the number of the employment and the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis which helps to simulate the spatial distribution of the employment status is a potential tool for human resource development research.
Gis-Based Spatial Statistical Analysis of College Graduates Employment
Tang, R.
2012-07-01
It is urgently necessary to be aware of the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in distribution and employment status of college graduates based on the data from 2004-2008 Wuhan Municipal Human Resources and Social Security Bureau, China. Spatio-temporal distribution of employment unit were analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method via SPSS software was used to predict the employment and to identify spatially associated enterprise and professionals demand in the future. The results show that the enterprises in Wuhan east lake high and new technology development zone increased dramatically from 2004 to 2008, and tended to distributed southeastward. Furthermore, the models built by statistical analysis suggest that the specialty of graduates major in has an important impact on the number of the employment and the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis which helps to simulate the spatial distribution of the employment status is a potential tool for human resource development research.
The features of Drosophila core promoters revealed by statistical analysis
Directory of Open Access Journals (Sweden)
Trifonov Edward N
2006-06-01
Full Text Available Abstract Background Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between basal machinery and promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. Results Using Drosophila-specific position-weight matrices, we identified promoters containing TATA box, Initiator, Downstream Promoter Element (DPE, and Motif Ten Element (MTE, as well as core elements discovered in Human (TFIIB Recognition Element (BRE and Downstream Core Element (DCE. Promoters utilizing known synergetic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE were identified. We also establish the existence of promoters with potentially novel synergetic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s. Comparison of Human and Drosophila showed consistent percentages of promoters with TATA, Inr, DPE, and synergetic combinations thereof, as well as most of the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in Human was found. Distinct nucleosome positioning in particular promoter classes was revealed. Conclusion We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation was experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be utilized for other species.
Statistical design and analysis of RNA sequencing data.
Auer, Paul L; Doerge, R W
2010-06-01
Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.
Statistical Analysis of Designed Experiments Theory and Applications
Tamhane, Ajit C
2012-01-01
A indispensable guide to understanding and designing modern experiments The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the
Statistical energy analysis of complex structures, phase 2
Trudell, R. W.; Yano, L. I.
1980-01-01
A method for estimating the structural vibration properties of complex systems in high frequency environments was investigated. The structure analyzed was the Materials Experiment Assembly, (MEA), which is a portion of the OST-2A payload for the space transportation system. Statistical energy analysis (SEA) techniques were used to model the structure and predict the structural element response to acoustic excitation. A comparison of the intial response predictions and measured acoustic test data is presented. The conclusions indicate that: the SEA predicted the response of primary structure to acoustic excitation over a wide range of frequencies; and the contribution of mechanically induced random vibration to the total MEA is not significant.
Multi-scale statistical analysis of coronal solar activity
Gamborino, Diana; del-Castillo-Negrete, Diego; Martinell, Julio J.
2016-07-01
Multi-filter images from the solar corona are used to obtain temperature maps that are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multi-scale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport is also to be extracted from the analysis.
Spatial Analysis Along Networks Statistical and Computational Methods
Okabe, Atsuyuki
2012-01-01
In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process
Feature statistic analysis of ultrasound images of liver cancer
Huang, Shuqin; Ding, Mingyue; Zhang, Songgeng
2007-12-01
In this paper, a specific feature analysis of liver ultrasound images including normal liver, liver cancer especially hepatocellular carcinoma (HCC) and other hepatopathy is discussed. According to the classification of hepatocellular carcinoma (HCC), primary carcinoma is divided into four types. 15 features from single gray-level statistic, gray-level co-occurrence matrix (GLCM), and gray-level run-length matrix (GLRLM) are extracted. Experiments for the discrimination of each type of HCC, normal liver, fatty liver, angioma and hepatic abscess have been conducted. Corresponding features to potentially discriminate them are found.
STATISTIC ANALYSIS OF INTERNATIONAL TOURISM ON ROMANIAN SEASIDE
Directory of Open Access Journals (Sweden)
MIRELA SECARĂ
2010-01-01
Full Text Available In order to meet European and international touristic competition standards, modernization, re-establishment and development of Romanian tourism are necessary as well as creation of modern touristic products that are competitive on this market. The use of modern methods of statistic analysis in the field of tourism facilitates the achievement of systems of information that are the instruments for: evaluation of touristic demand and touristic supply, follow-up of touristic services of each touring form, follow-up of transportation services, leisure activities, hotel accommodation, touristic market study, and a complex flexible system of management and accountancy.
Statistics for proteomics: experimental design and 2-DE differential analysis.
Chich, Jean-François; David, Olivier; Villers, Fanny; Schaeffer, Brigitte; Lutomski, Didier; Huet, Sylvie
2007-04-15
Proteomics relies on the separation of complex protein mixtures using bidimensional electrophoresis. This approach is largely used to detect the expression variations of proteins prepared from two or more samples. Recently, attention was drawn on the reliability of the results published in literature. Among the critical points identified were experimental design, differential analysis and the problem of missing data, all problems where statistics can be of help. Using examples and terms understandable by biologists, we describe how a collaboration between biologists and statisticians can improve reliability of results and confidence in conclusions.
A Probabilistic Rain Diagnostic Model Based on Cyclone Statistical Analysis
Iordanidou, V.; A. G. Koutroulis; I. K. Tsanis
2014-01-01
Data from a dense network of 69 daily precipitation gauges over the island of Crete and cyclone climatological analysis over middle-eastern Mediterranean are combined in a statistical approach to develop a rain diagnostic model. Regarding the dataset, 0.5 × 0.5, 33-year (1979–2011) European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) is used. The cyclone tracks and their characteristics are identified with the aid of Melbourne University algorithm (MS scheme). T...
Research and Development on Food Nutrition Statistical Analysis Software System
Directory of Open Access Journals (Sweden)
Du Li
2013-12-01
Full Text Available Designing and developing a set of food nutrition component statistical analysis software can realize the automation of nutrition calculation, improve the nutrition processional professional’s working efficiency and achieve the informatization of the nutrition propaganda and education. In the software development process, the software engineering method and database technology are used to calculate the human daily nutritional intake and the intelligent system is used to evaluate the user’s health condition. The experiment can show that the system can correctly evaluate the human health condition and offer the reasonable suggestion, thus exploring a new road to solve the complex nutrition computational problem with information engineering.
Spanish Validation of the School Anxiety Scale-Teacher Report (SAS-TR).
Orgilés, Mireia; Fernández-Martínez, Iván; Lera-Miguel, Sara; Marzo, Juan Carlos; Medrano, Laura; Espada, José Pedro
2016-11-04
This study aimed to examine the factorial structure and psychometric properties of the School Anxiety Scale-Teacher Report (SAS-TR) in a community sample of 315 Spanish children aged 5 to 12 years. Thirty-seven teachers from eleven schools completed the SAS-TR and the Strengths and Difficulties Questionnaire (SDQ) for each child. Confirmatory factor analysis supported the original two-factor structure, but a better fit model was obtained after removing four items. The scale was found to have high internal consistency (α = 0.91) and satisfactory test-retest reliability (ICC = 0.87) for the Spanish sample. Convergent validity was supported by positive significant correlations between the SAS-TR and the Emotional Symptoms subscale of the SDQ. Lower correlations between the SAS-TR and the SDQ Conduct Problems subscale supported the divergent validity. Overall, the findings suggest that the Spanish version of the SAS-TR is a reliable and valid instrument for teachers to assess anxiety in Spanish children.
Adaptive beamforming for low frequency SAS imagery and bathymetry
Hayes, M.P.; Hunter, A.J.
2012-01-01
Synthetic aperture side-scan sonar (SAS) is a mature technology for high-resolution sea floor imaging [1]. Interferometric synthetic aperture sonars (InSAS) use additional hydrophones in a vertical array for bathymetric mapping [2]. This has created high-resolution bathymetry in deep water environme
Adaptive beamforming for low frequency SAS imagery and bathymetry
Hayes, M.P.; Hunter, A.J.
2012-01-01
Synthetic aperture side-scan sonar (SAS) is a mature technology for high-resolution sea floor imaging [1]. Interferometric synthetic aperture sonars (InSAS) use additional hydrophones in a vertical array for bathymetric mapping [2]. This has created high-resolution bathymetry in deep water
Hotelli Radisson SAS mantra : jah, ma saan! / Kai Vare
Vare, Kai, 1968-
2004-01-01
Radisson SAS hotell Tallinnas on kliendirahulolu-uuringute järgi keti hotellide seas esimeste hulgas. Hotelli direktor Kaido Ojaperv ja müügijuht Ann-Kai Tõrs Radisson SAS-i standarditest, kliendi sajaprotsendilise rahulolu tagamise põhimõtetest, personali valikust, koolitusest. Kommenteerib Sandra Dimitrovich
A probabilistic analysis of wind gusts using extreme value statistics
Energy Technology Data Exchange (ETDEWEB)
Friederichs, Petra; Bentzien, Sabrina; Lenz, Anne; Krampitz, Rebekka [Meteorological Inst., Univ. of Bonn (Germany); Goeber, Martin [Deutscher Wetterdienst, Offenbach (Germany)
2009-12-15
The spatial variability of wind gusts is probably as large as that of precipitation, but the observational weather station network is much less dense. The lack of an area-wide observational analysis hampers the forecast verification of wind gust warnings. This article develops and compares several approaches to derive a probabilistic analysis of wind gusts for Germany. Such an analysis provides a probability that a wind gust exceeds a certain warning level. To that end we have 5 years of observations of hourly wind maxima at about 140 weather stations of the German weather service at our disposal. The approaches are based on linear statistical modeling using generalized linear models, extreme value theory and quantile regression. Warning level exceedance probabilities are estimated in response to predictor variables such as the observed mean wind or the operational analysis of the wind velocity at a height of 10 m above ground provided by the European Centre for Medium Range Weather Forecasts (ECMWF). The study shows that approaches that apply to the differences between the recorded wind gust and the mean wind perform better in terms of the Brier skill score (which measures the quality of a probability forecast) than those using the gust factor or the wind gusts only. The study points to the benefit from using extreme value theory as the most appropriate and theoretically consistent statistical model. The most informative predictors are the observed mean wind, but also the observed gust velocities recorded at the neighboring stations. Out of the predictors used from the ECMWF analysis, the wind velocity at 10 m above ground is the most informative predictor, whereas the wind shear and the vertical velocity provide no additional skill. For illustration the results for January 2007 and during the winter storm Kyrill are shown. (orig.)
Statistical analysis of personal radiofrequency electromagnetic field measurements with nondetects.
Röösli, Martin; Frei, Patrizia; Mohler, Evelyn; Braun-Fahrländer, Charlotte; Bürgi, Alfred; Fröhlich, Jürg; Neubauer, Georg; Theis, Gaston; Egger, Matthias
2008-09-01
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution. Copyright 2008 Wiley-Liss, Inc.
Smoothing point data into maps using SAS/GRAPH (trade name) software. Forest Service research note
Energy Technology Data Exchange (ETDEWEB)
Chojnacky, D.C.; Rubey, M.E.
1996-01-01
Point of plot data are commonly available for mapping forest landscapes. Because such data are sampled, mapping a complete coverage usually requires some type of interpolation between plots. SAS/GRAPH software includes the G3GRID procedure for interpolating or smoothing this type of data to map with G3D or GCONTOUR procedures. However, the smoothing process in G3GRID is not easily controlled, nor can it be used to display missing data within rectangular grid maps. These shortcomings motivated development of SAS code that prepares point data for display in mapping units. This code links well with the rest of the SAS system to allow for powerful, easily controlled data analysis within mapping units. Examples are given for mapping forest vegetation with the GMAP procedure.
STATISTICAL ANALYSIS OF THE TM- MODEL VIA BAYESIAN APPROACH
Directory of Open Access Journals (Sweden)
Muhammad Aslam
2012-11-01
Full Text Available The method of paired comparisons calls for the comparison of treatments presented in pairs to judges who prefer the better one based on their sensory evaluations. Thurstone (1927 and Mosteller (1951 employ the method of maximum likelihood to estimate the parameters of the Thurstone-Mosteller model for the paired comparisons. A Bayesian analysis of the said model using the non-informative reference (Jeffreys prior is presented in this study. The posterior estimates (means and joint modes of the parameters and the posterior probabilities comparing the two parameters are obtained for the analysis. The predictive probabilities that one treatment (Ti in preferred to any other treatment (Tj in a future single comparison are also computed. In addition, the graphs of the marginal posterior distributions of the individual parameter are drawn. The appropriateness of the model is also tested using the Chi-Square test statistic.
On Understanding Statistical Data Analysis in Higher Education
Montalbano, Vera
2012-01-01
Data analysis is a powerful tool in all experimental sciences. Statistical methods, such as sampling theory, computer technologies necessary for handling large amounts of data, skill in analysing information contained in different types of graphs are all competences necessary for achieving an in-depth data analysis. In higher education, these topics are usually fragmentized in different courses, the interdisciplinary integration can lack, some caution in the use of these topics can missing or be misunderstood. Students are often obliged to acquire these skills by themselves during the preparation of the final experimental thesis. A proposal for a learning path on nuclear phenomena is presented in order to develop these scientific competences in physics courses. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources can be monitored for long time in a physics laboratory. The data are collected and analyzed in a computer lab i...
Statistical analysis of cascading failures in power grids
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Laboratory; Pfitzner, Rene [Los Alamos National Laboratory; Turitsyn, Konstantin [Los Alamos National Laboratory
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete time and sequential resolution of individual failures. The model, in its simplest realization based on the Directed Current description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
Topics in statistical data analysis for high-energy physics
Cowan, G
2013-01-01
These lectures concern two topics that are becoming increasingly important in the analysis of High Energy Physics (HEP) data: Bayesian statistics and multivariate methods. In the Bayesian approach we extend the interpretation of probability to cover not only the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in HEP in recent years: the boosted decision tree (BDT).
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
Energy Technology Data Exchange (ETDEWEB)
Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-29
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
Processes and subdivisions in diogenites, a multivariate statistical analysis
Harriott, T. A.; Hewins, R. H.
1984-01-01
Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.
Statistical learning analysis in neuroscience: aiming for transparency
Directory of Open Access Journals (Sweden)
Michael Hanke
2010-05-01
Full Text Available Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires ``neuroscience-aware'' technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here we review its features and applicability to various neural data modalities.
Statistical learning analysis in neuroscience: aiming for transparency.
Hanke, Michael; Halchenko, Yaroslav O; Haxby, James V; Pollmann, Stefan
2010-01-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires "neuroscience-aware" technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities.
Multivariate Statistical Analysis Applied in Wine Quality Evaluation
Directory of Open Access Journals (Sweden)
Jieling Zou
2015-08-01
Full Text Available This study applies multivariate statistical approaches to wine quality evaluation. With 27 red wine samples, four factors were identified out of 12 parameters by principal component analysis, explaining 89.06% of the total variance of data. As iterative weights calculated by the BP neural network revealed little difference from weights determined by information entropy method, the latter was chosen to measure the importance of indicators. Weighted cluster analysis performs well in classifying the sample group further into two sub-clusters. The second cluster of red wine samples, compared with its first, was lighter in color, tasted thinner and had fainter bouquet. Weighted TOPSIS method was used to evaluate the quality of wine in each sub-cluster. With scores obtained, each sub-cluster was divided into three grades. On the whole, the quality of lighter red wine was slightly better than the darker category. This study shows the necessity and usefulness of multivariate statistical techniques in both wine quality evaluation and parameter selection.
Statistical analysis of magnetically soft particles in magnetorheological elastomers
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2–15 wt% (0.27–2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
Visualization methods for statistical analysis of microarray clusters
Directory of Open Access Journals (Sweden)
Li Kai
2005-05-01
Full Text Available Abstract Background The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gold-standard. Appropriate data visualization tools can aid this analysis process, but existing visualization methods do not specifically address this issue. Results We present several visualization techniques that incorporate meaningful statistics that are noise-robust for the purpose of analyzing the results of clustering algorithms on microarray data. This includes a rank-based visualization method that is more robust to noise, a difference display method to aid assessments of cluster quality and detection of outliers, and a projection of high dimensional data into a three dimensional space in order to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices. This methodology is implemented in GeneVAnD (Genomic Visual ANalysis of Datasets and is available at http://function.princeton.edu/GeneVAnD. Conclusion Incorporating relevant statistical information into data visualizations is key for analysis of large biological datasets, particularly because of high levels of noise and the lack of a gold-standard for comparisons. We developed several new visualization techniques and demonstrated their effectiveness for evaluating cluster quality and relationships between clusters.
Qiao, Renping; Cabral, Gabriela; Lettman, Molly M; Dammermann, Alexander; Dong, Gang
2012-11-14
The centriole is a conserved microtubule-based organelle essential for both centrosome formation and cilium biogenesis. Five conserved proteins for centriole duplication have been identified. Two of them, SAS-5 and SAS-6, physically interact with each other and are codependent for their targeting to procentrioles. However, it remains unclear how these two proteins interact at the molecular level. Here, we demonstrate that the short SAS-5 C-terminal domain (residues 390-404) specifically binds to a narrow central region (residues 275-288) of the SAS-6 coiled coil. This was supported by the crystal structure of the SAS-6 coiled-coil domain (CCD), which, together with mutagenesis studies, indicated that the association is mediated by synergistic hydrophobic and electrostatic interactions. The crystal structure also shows a periodic charge pattern along the SAS-6 CCD, which gives rise to an anti-parallel tetramer. Overall, our findings establish the molecular basis of the specific interaction between SAS-5 and SAS-6, and suggest that both proteins individually adopt an oligomeric conformation that is disrupted upon the formation of the hetero-complex to facilitate the correct assembly of the nine-fold symmetric centriole.
Energy Technology Data Exchange (ETDEWEB)
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of, if the user wishes, as few as three points. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
中国古代的统计分析%Statistical analysis in ancient China
Institute of Scientific and Technical Information of China (English)
莫曰达
2003-01-01
Analyzing social and economic problems through statistics is one of an important aspects of statistics thoughts in ancient China. This paper demonstrates some situations of statistical analysis in ancient China.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Mitrut
2006-03-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
The system for statistical analysis of logistic information
Directory of Open Access Journals (Sweden)
Khayrullin Rustam Zinnatullovich
2015-05-01
Full Text Available The current problem for managers in logistic and trading companies is the task of improving the operational business performance and developing the logistics support of sales. The development of logistics sales supposes development and implementation of a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed, and the timing of their implementation. Logistics engineering of warehouse complex includes such tasks as: determining the number and the types of technological zones, calculation of the required number of loading-unloading places, development of storage structures, development and pre-sales preparation zones, development of specifications of storage types, selection of loading-unloading equipment, detailed planning of warehouse logistics system, creation of architectural-planning decisions, selection of information-processing equipment, etc. The currently used ERP and WMS systems did not allow us to solve the full list of logistics engineering problems. In this regard, the development of specialized software products, taking into account the specifics of warehouse logistics, and subsequent integration of these software with ERP and WMS systems seems to be a current task. In this paper we suggest a system of statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The system is based on the methods of statistical data processing.The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on the methods of statistical data processing, the methods of assessment and prediction of logistics performance, the methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as the methods for planning the reconstruction and development
Simple and flexible SAS and SPSS programs for analyzing lag-sequential categorical data.
O'Connor, B P
1999-11-01
This paper describes simple and flexible programs for analyzing lag-sequential categorical data, using SAS and SPSS. The programs read a stream of codes and produce a variety of lag-sequential statistics, including transitional frequencies, expected transitional frequencies, transitional probabilities, adjusted residuals, z values, Yule's Q values, likelihood ratio tests of stationarity across time and homogeneity across groups or segments, transformed kappas for unidirectional dependence, bidirectional dependence, parallel and nonparallel dominance, and significance levels based on both parametric and randomization tests.
Statistical analysis of the breaking processes of Ni nanowires
Energy Technology Data Exchange (ETDEWEB)
Garcia-Mochales, P [Departamento de Fisica de la Materia Condensada, Facultad de Ciencias, Universidad Autonoma de Madrid, c/ Francisco Tomas y Valiente 7, Campus de Cantoblanco, E-28049-Madrid (Spain); Paredes, R [Centro de Fisica, Instituto Venezolano de Investigaciones CientIficas, Apartado 20632, Caracas 1020A (Venezuela); Pelaez, S; Serena, P A [Instituto de Ciencia de Materiales de Madrid, Consejo Superior de Investigaciones CientIficas, c/ Sor Juana Ines de la Cruz 3, Campus de Cantoblanco, E-28049-Madrid (Spain)], E-mail: pedro.garciamochales@uam.es
2008-06-04
We have performed a massive statistical analysis on the breaking behaviour of Ni nanowires using molecular dynamic simulations. Three stretching directions, five initial nanowire sizes and two temperatures have been studied. We have constructed minimum cross-section histograms and analysed for the first time the role played by monomers and dimers. The shape of such histograms and the absolute number of monomers and dimers strongly depend on the stretching direction and the initial size of the nanowire. In particular, the statistical behaviour of the breakage final stages of narrow nanowires strongly differs from the behaviour obtained for large nanowires. We have analysed the structure around monomers and dimers. Their most probable local configurations differ from those usually appearing in static electron transport calculations. Their non-local environments show disordered regions along the nanowire if the stretching direction is [100] or [110]. Additionally, we have found that, at room temperature, [100] and [110] stretching directions favour the appearance of non-crystalline staggered pentagonal structures. These pentagonal Ni nanowires are reported in this work for the first time. This set of results suggests that experimental Ni conducting histograms could show a strong dependence on the orientation and temperature.
Statistical Scalability Analysis of Communication Operations in Distributed Applications
Energy Technology Data Exchange (ETDEWEB)
Vetter, J S; McCracken, M O
2001-02-27
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for individual communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
Blog Content and User Engagement - An Insight Using Statistical Analysis.
Directory of Open Access Journals (Sweden)
Apoorva Vikrant Kulkarni
2013-06-01
Full Text Available Since the past few years organizations have increasingly realized the value of social media in positioning, propagating and marketing the product/service and organization itself. Today every organization be it small or big has realized the essence of creating a space in the World Wide Web. Social Media through its multifaceted platforms has enabled the organizations to propagate their brands. There are a number of social media networks which are helpful in spreading the message to customers. Many organizations are having full time web analytics teams that are regularly trying to ensure that prospectivecustomers are visiting their organization through various forms of social media. Web analytics is foreseen as a tool for Business Intelligence by organizations and there are a large number of analytics tools available for monitoring the visibility of a particular brand on the web. For example, Google has its ownanalytic tool that is very widely used. There are number of free as well as paid analytical tools available on the internet. The objective of this paper is to study what content in a blog present in the social media creates a greater impact on user engagement. The study statistically analyzes the relation between content of the blog and user engagement. The statistical analysis was carried out on a blog of a reputed management institute in Pune to arrive at conclusions.
Integrating SAS and GIS software to improve habitat-use estimates from radiotelemetry data
Kenow, K.P.; Wright, R.G.; Samuel, M.D.; Rasmussen, P.W.
2001-01-01
Radiotelemetry has been used commonly to remotely determine habitat use by a variety of wildlife species. However, habitat misclassification can occur because the true location of a radiomarked animal can only be estimated. Analytical methods that provide improved estimates of habitat use from radiotelemetry location data using a subsampling approach have been proposed previously. We developed software, based on these methods, to conduct improved habitat-use analyses. A Statistical Analysis System (SAS)-executable file generates a random subsample of points from the error distribution of an estimated animal location and formats the output into ARC/INFO-compatible coordinate and attribute files. An associated ARC/INFO Arc Macro Language (AML) creates a coverage of the random points, determines the habitat type at each random point from an existing habitat coverage, sums the number of subsample points by habitat type for each location, and outputs tile results in ASCII format. The proportion and precision of habitat types used is calculated from the subsample of points generated for each radiotelemetry location. We illustrate the method and software by analysis of radiotelemetry data for a female wild turkey (Meleagris gallopavo).
SPSS and SAS procedures for estimating indirect effects in simple mediation models.
Preacher, Kristopher J; Hayes, Andrew F
2004-11-01
Researchers often conduct mediation analysis in order to indirectly assess the effect of a proposed cause on some outcome through a proposed mediator. The utility of mediation analysis stems from its ability to go beyond the merely descriptive to a more functional understanding of the relationships among variables. A necessary component of mediation is a statistically and practically significant indirect effect. Although mediation hypotheses are frequently explored in psychological research, formal significance tests of indirect effects are rarely conducted. After a brief overview of mediation, we argue the importance of directly testing the significance of indirect effects and provide SPSS and SAS macros that facilitate estimation of the indirect effect with a normal theory approach and a bootstrap approach to obtaining confidence intervals, as well as the traditional approach advocated by Baron and Kenny (1986). We hope that this discussion and the macros will enhance the frequency of formal mediation tests in the psychology literature. Electronic copies of these macros may be downloaded from the Psychonomic Society's Web archive at www.psychonomic.org/archive/.
Statistical models of video structure for content analysis and characterization.
Vasconcelos, N; Lippman, A
2000-01-01
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
Frequency of PSV inspection optmization using statistical data analysis
Directory of Open Access Journals (Sweden)
Alexandre Guimarães Botelho
2015-12-01
Full Text Available The present paper shows how qualitative analytical methodologies can be enhanced by statistical failure data analysis of process equipment in order to select an appropriate and cost-effective maintenance policy to reduce equipment life cycle cost. As such, a case study was carried out with failure and maintenance data from a sample of pressure safety valves (PSV of a PETROBRAS’s oil and gas production unit. Data was classified according to a failure mode and effect analysis— FMEA, and adjusted using a weibull distribution. The results show the possibility of reduction of maintenance frequency representing 29% of event reduction, without increasing risk, as well as evidencing the potential failures which must be blocked by the inspection plan.
Supermarket Analysis Based On Product Discount and Statistics
Directory of Open Access Journals (Sweden)
Komal Kumawat
2014-03-01
Full Text Available E-commerce has been growing rapidly. Its domain can provide all the right ingredients for successful data mining and it is a significant domain of data mining. E commerce refers to buying and selling of products or services over electronic systems such as internet. Various e commerce systems give discount on product and allow user to buy product online. The basic idea used here is to predict the product sale based on discount applied to the product. Our analysis concentrates on how customer behaves when discount is allotted to him. We have developed a model which finds the customer behaviour when discount is applied to the product. This paper elaborates upon how a different technique like session, click stream is used to collect user data online based on discount applied to the product and how statistics is applied to data set to see the variation in the data.
Higher order statistical moment application for solar PV potential analysis
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy could be as alternative energy to fossil fuel, which is depleting and posing a global warming problem. However, this renewable energy is so variable and intermittent to be relied on. Therefore the knowledge of energy potential is very important for any site to build this solar photovoltaic power generation system. Here, the application of higher order statistical moment model is being analyzed using data collected from 5MW grid-connected photovoltaic system. Due to the dynamic changes of skewness and kurtosis of AC power and solar irradiance distributions of the solar farm, Pearson system where the probability distribution is calculated by matching their theoretical moments with that of the empirical moments of a distribution could be suitable for this purpose. On the advantage of the Pearson system in MATLAB, a software programming has been developed to help in data processing for distribution fitting and potential analysis for future projection of amount of AC power and solar irradiance availability.
Statistical analysis of $k$-nearest neighbor collaborative recommendation
Biau, Gérard; Rouvière, Laurent; 10.1214/09-AOS759
2010-01-01
Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of user...
Statistical uncertainty analysis of radon transport in nonisothermal, unsaturated soils
Energy Technology Data Exchange (ETDEWEB)
Holford, D.J.; Owczarski, P.C.; Gee, G.W.; Freeman, H.D.
1990-10-01
To accurately predict radon fluxes soils to the atmosphere, we must know more than the radium content of the soil. Radon flux from soil is affected not only by soil properties, but also by meteorological factors such as air pressure and temperature changes at the soil surface, as well as the infiltration of rainwater. Natural variations in meteorological factors and soil properties contribute to uncertainty in subsurface model predictions of radon flux, which, when coupled with a building transport model, will also add uncertainty to predictions of radon concentrations in homes. A statistical uncertainty analysis using our Rn3D finite-element numerical model was conducted to assess the relative importance of these meteorological factors and the soil properties affecting radon transport. 10 refs., 10 figs., 3 tabs.
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
This paper discusses inference for I(2) variables in a VAR model. The estimation procedure suggested consists of two reduced rank regressions. The asymptotic distribution of the proposed estimators of the cointegrating coefficients is mixed Gaussian, which implies that asymptotic inference can...... be conducted using the ¿ sup2/sup distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...... contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate....
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment; the textual meta-data and numerical spectral data. Results associated with the spectral data stored in the Signature Database1 (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes to understand the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security2 (text encryption/decryption), biomedical3 , and marketing4 applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC-MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Classification of Malaysia aromatic rice using multivariate statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)
2015-05-15
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Substrate Activity Screening (SAS) and Related Approaches in Medicinal Chemistry.
Gladysz, Rafaela; Lambeir, Anne-Marie; Joossens, Jurgen; Augustyns, Koen; Van der Veken, Pieter
2016-03-04
Substrate activity screening (SAS) was presented a decade ago by Ellman and co-workers as a straightforward methodology for the identification of fragment-sized building blocks for enzyme inhibitors. Ever since, SAS and variations derived from it have been successfully applied to the discovery of inhibitors of various families of enzymatically active drug targets. This review covers key achievements and challenges of SAS and related methodologies, including the modified substrate activity screening (MSAS) approach. Special attention is given to the kinetic and thermodynamic aspects of these methodologies, as a thorough understanding thereof is crucial for successfully transforming the identified fragment-sized hits into potent inhibitors.
SOCR: Statistics Online Computational Resource
Directory of Open Access Journals (Sweden)
Ivo D. Dinov
2006-10-01
Full Text Available The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR. This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build student's intuition and enhance their learning.
SOCR: Statistics Online Computational Resource
Directory of Open Access Journals (Sweden)
Ivo D. Dinov
2006-10-01
Full Text Available The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR. This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build student’s intuition and enhance their learning.
Microcomputers: Statistical Analysis Software. Evaluation Guide Number 5.
Gray, Peter J.
This guide discusses six sets of features to examine when purchasing a microcomputer-based statistics program: hardware requirements; data management; data processing; statistical procedures; printing; and documentation. While the current statistical packages have several negative features, they are cost saving and convenient for small to moderate…
Application of Integration of Spatial Statistical Analysis with GIS to Regional Economic Analysis
Institute of Scientific and Technical Information of China (English)
CHEN Fei; DU Daosheng
2004-01-01
This paper summarizes a few spatial statistical analysis methods for to measuring spatial autocorrelation and spatial association, discusses the criteria for the identification of spatial association by the use of global Moran Coefficient, Local Moran and Local Geary. Furthermore, a user-friendly statistical module, combining spatial statistical analysis methods with GIS visual techniques, is developed in Arcview using Avenue. An example is also given to show the usefulness of this module in identifying and quantifying the underlying spatial association patterns between economic units.
Statistical Analysis of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2013-01-31
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide1, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
STATISTICAL ANALYSIS OF TANK 5 FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Shine, E.
2012-03-14
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, radionuclide, inorganic, and anion concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements above their
Statistical Analysis Of Tank 5 Floor Sample Results
Energy Technology Data Exchange (ETDEWEB)
Shine, E. P.
2012-08-01
Sampling has been completed for the characterization of the residual material on the floor of Tank 5 in the F-Area Tank Farm at the Savannah River Site (SRS), near Aiken, SC. The sampling was performed by Savannah River Remediation (SRR) LLC using a stratified random sampling plan with volume-proportional compositing. The plan consisted of partitioning the residual material on the floor of Tank 5 into three non-overlapping strata: two strata enclosed accumulations, and a third stratum consisted of a thin layer of material outside the regions of the two accumulations. Each of three composite samples was constructed from five primary sample locations of residual material on the floor of Tank 5. Three of the primary samples were obtained from the stratum containing the thin layer of material, and one primary sample was obtained from each of the two strata containing an accumulation. This report documents the statistical analyses of the analytical results for the composite samples. The objective of the analysis is to determine the mean concentrations and upper 95% confidence (UCL95) bounds for the mean concentrations for a set of analytes in the tank residuals. The statistical procedures employed in the analyses were consistent with the Environmental Protection Agency (EPA) technical guidance by Singh and others [2010]. Savannah River National Laboratory (SRNL) measured the sample bulk density, nonvolatile beta, gross alpha, and the radionuclide, elemental, and chemical concentrations three times for each of the composite samples. The analyte concentration data were partitioned into three separate groups for further analysis: analytes with every measurement above their minimum detectable concentrations (MDCs), analytes with no measurements above their MDCs, and analytes with a mixture of some measurement results above and below their MDCs. The means, standard deviations, and UCL95s were computed for the analytes in the two groups that had at least some measurements
RFI detection by automated feature extraction and statistical analysis
Winkel, B.; Kerp, J.; Stanko, S.
2007-01-01
In this paper we present an interference detection toolbox consisting of a high dynamic range Digital Fast-Fourier-Transform spectrometer (DFFT, based on FPGA-technology) and data analysis software for automated radio frequency interference (RFI) detection. The DFFT spectrometer allows high speed data storage of spectra on time scales of less than a second. The high dynamic range of the device assures constant calibration even during extremely powerful RFI events. The software uses an algorithm which performs a two-dimensional baseline fit in the time-frequency domain, searching automatically for RFI signals superposed on the spectral data. We demonstrate, that the software operates successfully on computer-generated RFI data as well as on real DFFT data recorded at the Effelsberg 100-m telescope. At 21-cm wavelength RFI signals can be identified down to the 4σ_rms level. A statistical analysis of all RFI events detected in our observational data revealed that: (1) mean signal strength is comparable to the astronomical line emission of the Milky Way, (2) interferences are polarised, (3) electronic devices in the neighbourhood of the telescope contribute significantly to the RFI radiation. We also show that the radiometer equation is no longer fulfilled in presence of RFI signals.
RFI detection by automated feature extraction and statistical analysis
Winkel, B; Stanko, S; Winkel, Benjamin; Kerp, Juergen; Stanko, Stephan
2006-01-01
In this paper we present an interference detection toolbox consisting of a high dynamic range Digital Fast-Fourier-Transform spectrometer (DFFT, based on FPGA-technology) and data analysis software for automated radio frequency interference (RFI) detection. The DFFT spectrometer allows high speed data storage of spectra on time scales of less than a second. The high dynamic range of the device assures constant calibration even during extremely powerful RFI events. The software uses an algorithm which performs a two-dimensional baseline fit in the time-frequency domain, searching automatically for RFI signals superposed on the spectral data. We demonstrate, that the software operates successfully on computer-generated RFI data as well as on real DFFT data recorded at the Effelsberg 100-m telescope. At 21-cm wavelength RFI signals can be identified down to the 4-sigma level. A statistical analysis of all RFI events detected in our observational data revealed that: (1) mean signal strength is comparable to the a...
Statistical analysis of plasma thermograms measured by differential scanning calorimetry.
Fish, Daniel J; Brewood, Greg P; Kim, Jong Sung; Garbett, Nichola C; Chaires, Jonathan B; Benight, Albert S
2010-11-01
Melting curves of human plasma measured by differential scanning calorimetry (DSC), known as thermograms, have the potential to markedly impact diagnosis of human diseases. A general statistical methodology is developed to analyze and classify DSC thermograms to analyze and classify thermograms. Analysis of an acquired thermogram involves comparison with a database of empirical reference thermograms from clinically characterized diseases. Two parameters, a distance metric, P, and correlation coefficient, r, are combined to produce a 'similarity metric,' ρ, which can be used to classify unknown thermograms into pre-characterized categories. Simulated thermograms known to lie within or fall outside of the 90% quantile range around a median reference are also analyzed. Results verify the utility of the methods and establish the apparent dynamic range of the metric ρ. Methods are then applied to data obtained from a collection of plasma samples from patients clinically diagnosed with SLE (lupus). High correspondence is found between curve shapes and values of the metric ρ. In a final application, an elementary classification rule is implemented to successfully analyze and classify unlabeled thermograms. These methods constitute a set of powerful yet easy to implement tools for quantitative classification, analysis and interpretation of DSC plasma melting curves.
A statistical design for testing apomictic diversification through linkage analysis.
Zeng, Yanru; Hou, Wei; Song, Shuang; Feng, Sisi; Shen, Lin; Xia, Guohua; Wu, Rongling
2014-03-01
The capacity of apomixis to generate maternal clones through seed reproduction has made it a useful characteristic for the fixation of heterosis in plant breeding. It has been observed that apomixis displays pronounced intra- and interspecific diversification, but the genetic mechanisms underlying this diversification remains elusive, obstructing the exploitation of this phenomenon in practical breeding programs. By capitalizing on molecular information in mapping populations, we describe and assess a statistical design that deploys linkage analysis to estimate and test the pattern and extent of apomictic differences at various levels from genotypes to species. The design is based on two reciprocal crosses between two individuals each chosen from a hermaphrodite or monoecious species. A multinomial distribution likelihood is constructed by combining marker information from two crosses. The EM algorithm is implemented to estimate the rate of apomixis and test its difference between two plant populations or species as the parents. The design is validated by computer simulation. A real data analysis of two reciprocal crosses between hickory (Carya cathayensis) and pecan (C. illinoensis) demonstrates the utilization and usefulness of the design in practice. The design provides a tool to address fundamental and applied questions related to the evolution and breeding of apomixis.
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis, and maximum likelihood estimation to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on the what these statistics bore out as critical drivers to the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
A Morphological and Statistical Analysis of Ansae in Barred Galaxies
Martinez-Valpuesta, I; Buta, R
2007-01-01
Many barred galaxies show a set of symmetric enhancements at the ends of the stellar bar, called {\\it ansae}, or the ``handles'' of the bar. The ansa bars have been in the literature for some decades, but their origin has still not been specifically addressed, although, they could be related to the growth process of bars. But even though ansae have been known for a long time, no statistical analysis of their relative frequency of occurrence has been performed yet. Similarly, there has been no study of the varieties in morphology of ansae even though significant morphological variations are known to characterise the features. In this paper, we make a quantitative analysis of the occurrence of ansae in barred galaxies, making use of {\\it The de Vaucouleurs Atlas of Galaxies} by Buta and coworkers. We find that $\\sim 40%$ of SB0's show ansae in their bars, thus confirming that ansae are common features in barred lenticulars. The ansa frequency decreases dramatically with later types, and hardly any ansae are fou...
Statistical analysis of the operating parameters which affect cupola emissions
Energy Technology Data Exchange (ETDEWEB)
Davis, J.W.; Draper, A.B.
1977-12-01
A sampling program was undertaken to determine the operating parameters which affected air pollution emission from gray iron foundry cupolas. The experimental design utilized the analysis of variance routine. Four independent variables were selected for examination on the basis of previous work reported in the literature. These were: (1) blast rate; (2) iron-coke ratio; (3) blast temperature; and (4) cupola size. The last variable was chosen since it most directly affects melt rate. Emissions from cupolas for which concern has been expressed are particle matter and carbon monoxide. The dependent variables were, therefore, particle loading, particle size distribution, and carbon monoxide concentration. Seven production foundries were visited and samples taken under conditions prescribed by the experimental plan. The data obtained from these tests were analyzed using the analysis of variance and other statistical techniques where applicable. The results indicated that blast rate, blast temperature, and cupola size affected particle emissions and the latter two also affected the particle size distribution. The particle size information was also unique in that it showed a consistent particle size distribution at all seven foundaries with a sizable fraction of the particles less than 1.0 micrometers in diameter.
Criminal victimization in Ukraine: analysis of statistical data
Directory of Open Access Journals (Sweden)
Serhiy Nezhurbida
2007-12-01
Full Text Available The article is based on the analysis of statistical data provided by law-enforcement, judicial and other bodies of Ukraine. The given analysis allows us to give an accurate quantity of a current status of crime victimization in Ukraine, to characterize its basic features (level, rate, structure, dynamics, and etc.. L’article se concentre sur l’analyse des données statystiques fournies par les institutions de contrôle sociale (forces de police et magistrature et par d’autres organes institutionnels ukrainiens. Les analyses effectuées attirent l'attention sur la situation actuelle des victimes du crime en Ukraine et aident à délinéer leur principales caractéristiques (niveau, taux, structure, dynamiques, etc.L’articolo si basa sull’analisi dei dati statistici forniti dalle agenzie del controllo sociale (forze dell'ordine e magistratura e da altri organi istituzionali ucraini. Le analisi effettuate forniscono molte informazioni sulla situazione attuale delle vittime del crimine in Ucraina e aiutano a delinearne le caratteristiche principali (livello, tasso, struttura, dinamiche, ecc..
Higher order statistical frequency domain decomposition for operational modal analysis
Nita, G. M.; Mahgoub, M. A.; Sharyatpanahi, S. G.; Cretu, N. C.; El-Fouly, T. M.
2017-02-01
Experimental methods based on modal analysis under ambient vibrational excitation are often employed to detect structural damages of mechanical systems. Many of such frequency domain methods, such as Basic Frequency Domain (BFD), Frequency Domain Decomposition (FFD), or Enhanced Frequency Domain Decomposition (EFFD), use as first step a Fast Fourier Transform (FFT) estimate of the power spectral density (PSD) associated with the response of the system. In this study it is shown that higher order statistical estimators such as Spectral Kurtosis (SK) and Sample to Model Ratio (SMR) may be successfully employed not only to more reliably discriminate the response of the system against the ambient noise fluctuations, but also to better identify and separate contributions from closely spaced individual modes. It is shown that a SMR-based Maximum Likelihood curve fitting algorithm may improve the accuracy of the spectral shape and location of the individual modes and, when combined with the SK analysis, it provides efficient means to categorize such individual spectral components according to their temporal dynamics as coherent or incoherent system responses to unknown ambient excitations.
Statistical analysis of emotions and opinions at Digg website
Pohorecki, Piotr; Mitrovic, Marija; Paltoglou, Georgios; Holyst, Janusz A
2012-01-01
We performed statistical analysis on data from the Digg.com website, which enables its users to express their opinion on news stories by taking part in forum-like discussions as well as directly evaluate previous posts and stories by assigning so called "diggs". Owing to fact that the content of each post has been annotated with its emotional value, apart from the strictly structural properties, the study also includes an analysis of the average emotional response of the posts commenting the main story. While analysing correlations at the story level, an interesting relationship between the number of diggs and the number of comments received by a story was found. The correlation between the two quantities is high for data where small threads dominate and consistently decreases for longer threads. However, while the correlation of the number of diggs and the average emotional response tends to grow for longer threads, correlations between numbers of comments and the average emotional response are almost zero. ...
Statistical Power Flow Analysis of an Imperfect Ribbed Cylinder
Blakemore, M.; Woodhouse, J.; Hardie, D. J. W.
1999-05-01
Prediction of the noise transmitted from machinery and flow sources on a submarine to the sonar arrays poses a complex problem. Vibrations in the pressure hull provide the main transmission mechanism. The pressure hull is characterised by a very large number of modes over the frequency range of interest (at least 100,000) and by high modal overlap, both of which place its analysis beyond the scope of finite element or boundary element methods. A method for calculating the transmission is presented, which is broadly based on Statistical Energy Analysis, but extended in two important ways: (1) a novel subsystem breakdown which exploits the particular geometry of a submarine pressure hull; (2) explicit modelling of energy density variation within a subsystem due to damping. The method takes account of fluid-structure interaction, the underlying pass/stop band characteristics resulting from the near-periodicity of the pressure hull construction, the effect of vibration isolators such as bulkheads, and the cumulative effect of irregularities (e.g., attachments and penetrations).
Statistical analysis of cone penetration resistance of railway ballast
Saussine, Gilles; Dhemaied, Amine; Delforge, Quentin; Benfeddoul, Selim
2017-06-01
Dynamic penetrometer tests are widely used in geotechnical studies for soils characterization but their implementation tends to be difficult. The light penetrometer test is able to give information about a cone resistance useful in the field of geotechnics and recently validated as a parameter for the case of coarse granular materials. In order to characterize directly the railway ballast on track and sublayers of ballast, a huge test campaign has been carried out for more than 5 years in order to build up a database composed of 19,000 penetration tests including endoscopic video record on the French railway network. The main objective of this work is to give a first statistical analysis of cone resistance in the coarse granular layer which represents a major component of railway track: the ballast. The results show that the cone resistance (qd) increases with depth and presents strong variations corresponding to layers of different natures identified using the endoscopic records. In the first zone corresponding to the top 30cm, (qd) increases linearly with a slope of around 1MPa/cm for fresh ballast and fouled ballast. In the second zone below 30cm deep, (qd) increases more slowly with a slope of around 0,3MPa/cm and decreases below 50cm. These results show that there is no clear difference between fresh and fouled ballast. Hence, the (qd) sensitivity is important and increases with depth. The (qd) distribution for a set of tests does not follow a normal distribution. In the upper 30cm layer of ballast of track, data statistical treatment shows that train load and speed do not have any significant impact on the (qd) distribution for clean ballast; they increase by 50% the average value of (qd) for fouled ballast and increase the thickness as well. Below the 30cm upper layer, train load and speed have a clear impact on the (qd) distribution.
RNA STRAND: The RNA Secondary Structure and Statistical Analysis Database
Directory of Open Access Journals (Sweden)
Andronescu Mirela
2008-08-01
Full Text Available Abstract Background The ability to access, search and analyse secondary structures of a large set of known RNA molecules is very important for deriving improved RNA energy models, for evaluating computational predictions of RNA secondary structures and for a better understanding of RNA folding. Currently there is no database that can easily provide these capabilities for almost all RNA molecules with known secondary structures. Results In this paper we describe RNA STRAND – the RNA secondary STRucture and statistical ANalysis Database, a curated database containing known secondary structures of any type and organism. Our new database provides a wide collection of known RNA secondary structures drawn from public databases, searchable and downloadable in a common format. Comprehensive statistical information on the secondary structures in our database is provided using the RNA Secondary Structure Analyser, a new tool we have developed to analyse RNA secondary structures. The information thus obtained is valuable for understanding to which extent and with which probability certain structural motifs can appear. We outline several ways in which the data provided in RNA STRAND can facilitate research on RNA structure, including the improvement of RNA energy models and evaluation of secondary structure prediction programs. In order to keep up-to-date with new RNA secondary structure experiments, we offer the necessary tools to add solved RNA secondary structures to our database and invite researchers to contribute to RNA STRAND. Conclusion RNA STRAND is a carefully assembled database of trusted RNA secondary structures, with easy on-line tools for searching, analyzing and downloading user selected entries, and is publicly available at http://www.rnasoft.ca/strand.
Analysis of Statistical Distributions of Energization Overvoltages of EHV Cables
DEFF Research Database (Denmark)
Ohno, Teruo; Ametani, Akihiro; Bak, Claus Leth
Insulation levels of EHV systems have been determined based on the statistical distribution of switching overvoltages since 1970s when the statistical distribution was found for overhead lines. Responding to an increase in the planned and installed EHV cables, the authors have derived the statist......Insulation levels of EHV systems have been determined based on the statistical distribution of switching overvoltages since 1970s when the statistical distribution was found for overhead lines. Responding to an increase in the planned and installed EHV cables, the authors have derived...... the statistical distribution of energization overvoltages for EHV cables and have made clear their characteristics compared with those of the overhead lines. This paper identifies the causes and physical meanings of the characteristics so that it becomes possible to use the obtained statistical distribution...... for the determination of insulation levels of cable systems....
Usual Dietary Intakes: SAS Macros for the NCI Method
SAS macros are currently available to facilitate modeling of a single dietary component, whether consumed daily or episodically; ratios of two dietary components that are consumed nearly every day; multiple dietary components, whether consumed daily or episodically.
Breast Cancer Risk Assessment SAS Macro (Gail Model)
A SAS macro (commonly referred to as the Gail Model) that projects absolute risk of invasive breast cancer according to NCI’s Breast Cancer Risk Assessment Tool (BCRAT) algorithm for specified race/ethnic groups and age intervals.
Islandlased tahavad SAS-i Skandinaavia turult kukutada / Mikk Salu
Salu, Mikk, 1975-
2006-01-01
Islandlastele kuuluv investeerimisfirma FL Grupp koondab lennufirmasid, et haarata liidrirolli Põhjala lennunduses. Lisad: Island; Islandi rikkaim mees; Uus põlvkond. Diagrammid: Islandi lennuärimehed versus SAS
Norberg, Peder; Gaztanaga, Enrique; Croton, Darren J
2008-01-01
We present a test of different error estimators for 2-point clustering statistics, appropriate for present and future large galaxy redshift surveys. Using an ensemble of very large dark matter LambdaCDM N-body simulations, we compare internal error estimators (jackknife and bootstrap) to external ones (Monte-Carlo realizations). For 3-dimensional clustering statistics, we find that none of the internal error methods investigated are able to reproduce neither accurately nor robustly the errors of external estimators on 1 to 25 Mpc/h scales. The standard bootstrap overestimates the variance of xi(s) by ~40% on all scales probed, but recovers, in a robust fashion, the principal eigenvectors of the underlying covariance matrix. The jackknife returns the correct variance on large scales, but significantly overestimates it on smaller scales. This scale dependence in the jackknife affects the recovered eigenvectors, which tend to disagree on small scales with the external estimates. Our results have important implic...
Structure of the SAS-6 cartwheel hub from Leishmania major.
van Breugel, Mark; Wilcken, Rainer; McLaughlin, Stephen H; Rutherford, Trevor J; Johnson, Christopher M
2014-01-01
Centrioles are cylindrical cell organelles with a ninefold symmetric peripheral microtubule array that is essential to template cilia and flagella. They are built around a central cartwheel assembly that is organized through homo-oligomerization of the centriolar protein SAS-6, but whether SAS-6 self-assembly can dictate cartwheel and thereby centriole symmetry is unclear. Here we show that Leishmania major SAS-6 crystallizes as a 9-fold symmetric cartwheel and provide the X-ray structure of this assembly at a resolution of 3.5 Å. We furthermore demonstrate that oligomerization of Leishmania SAS-6 can be inhibited by a small molecule in vitro and provide indications for its binding site. Our results firmly establish that SAS-6 can impose cartwheel symmetry on its own and indicate how this process might occur mechanistically in vivo. Importantly, our data also provide a proof-of-principle that inhibition of SAS-6 oligomerization by small molecules is feasible. DOI: http://dx.doi.org/10.7554/eLife.01812.001.
Statistical Analysis of Data with Non-Detectable Values
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a ''limit of detection''. In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or ''missed'') value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
Full Text Available The technique of the statistical analysis of investment appeal of the region is given in scientific article for direct foreign investments. Definition of a technique of the statistical analysis is given, analysis stages reveal, the mathematico-statistical tools are considered.
Analysis of Statistical Methods Currently used in Toxicology Journals
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-01-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and in...
Analysis of Statistical Distributions of Energization Overvoltages of EHV Cables
DEFF Research Database (Denmark)
Ohno, Teruo; Ametani, Akihiro; Bak, Claus Leth;
Insulation levels of EHV systems have been determined based on the statistical distribution of switching overvoltages since 1970s when the statistical distribution was found for overhead lines. Responding to an increase in the planned and installed EHV cables, the authors have derived...... the statistical distribution of energization overvoltages for EHV cables and have made clear their characteristics compared with those of the overhead lines. This paper identifies the causes and physical meanings of the characteristics so that it becomes possible to use the obtained statistical distribution...... for the determination of insulation levels of cable systems....
Analysis and Evaluation of Statistical Models for Integrated Circuits Design
Directory of Open Access Journals (Sweden)
Sáenz-Noval J.J.
2011-10-01
Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. Actually, Pelgrom is the statistical model most accepted in the industry; however it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.
Significance analysis and statistical mechanics: an application to clustering.
Łuksza, Marta; Lässig, Michael; Berg, Johannes
2010-11-26
This Letter addresses the statistical significance of structures in random data: given a set of vectors and a measure of mutual similarity, how likely is it that a subset of these vectors forms a cluster with enhanced similarity among its elements? The computation of this cluster p value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder and multiple-testing statistics in clustering and related problems. In an application to gene expression data, we find a remarkable link between the statistical significance of a cluster and the functional relationships between its genes.
A Statistical Aggregation Engine for Climatology and Trend Analysis
Chapman, D. R.; Simon, T. A.; Halem, M.
2014-12-01
Fundamental climate data records (FCDRs) from satellite instruments often span tens to hundreds of terabytes or even petabytes in scale. These large volumes make it difficult to aggregate or summarize their climatology and climate trends. It is especially cumbersome to supply the full derivation (provenance) of these aggregate calculations. We present a lightweight and resilient software platform, Gridderama that simplifies the calculation of climatology by exploiting the "Data-Cube" topology often present in earth observing satellite records. By using the large array storage (LAS) paradigm, Gridderama allows the analyst to more easily produce a series of aggregate climate data products at progressively coarser spatial and temporal resolutions. Furthermore, provenance tracking and extensive visualization capabilities allow the analyst to track down and correct for data problems such as missing data and outliers that may impact the scientific results. We have developed and applied Gridderama to calculate a trend analysis of 55 Terabytes of AIRS Level 1b infrared radiances, and show statistically significant trending in the greenhouse gas absorption bands as observed by AIRS over the 2003-2012 decade. We will extend this calculation to show regional changes in CO2 concentration from AIRS over the 2003-2012 decade by using a neural network retrieval algorithm.
Statistical Analysis of Loss of Offsite Power Events
Directory of Open Access Journals (Sweden)
Andrija Volkanovski
2016-01-01
Full Text Available This paper presents the results of the statistical analysis of the loss of offsite power events (LOOP registered in four reviewed databases. The reviewed databases include the IRSN (Institut de Radioprotection et de Sûreté Nucléaire SAPIDE database and the GRS (Gesellschaft für Anlagen- und Reaktorsicherheit mbH VERA database reviewed over the period from 1992 to 2011. The US NRC (Nuclear Regulatory Commission Licensee Event Reports (LERs database and the IAEA International Reporting System (IRS database were screened for relevant events registered over the period from 1990 to 2013. The number of LOOP events in each year in the analysed period and mode of operation are assessed during the screening. The LOOP frequencies obtained for the French and German nuclear power plants (NPPs during critical operation are of the same order of magnitude with the plant related events as a dominant contributor. A frequency of one LOOP event per shutdown year is obtained for German NPPs in shutdown mode of operation. For the US NPPs, the obtained LOOP frequency for critical and shutdown mode is comparable to the one assessed in NUREG/CR-6890. Decreasing trend is obtained for the LOOP events registered in three databases (IRSN, GRS, and NRC.
Statistical Analysis of Resistivity Anomalies Caused by Underground Caves
Frid, V.; Averbach, A.; Frid, M.; Dudkinski, D.; Liskevich, G.
2017-03-01
Geophysical prospecting of underground caves being performed on a construction site is often still a challenging procedure. Estimation of a likelihood level of an anomaly found is frequently a mandatory requirement of a project principal due to necessity of risk/safety assessment. However, the methodology of such estimation is not hitherto developed. Aiming to put forward such a methodology the present study (being performed as a part of an underground caves mapping prior to the land development on the site area) consisted of application of electrical resistivity tomography (ERT) together with statistical analysis utilized for the likelihood assessment of underground anomalies located. The methodology was first verified via a synthetic modeling technique and applied to the in situ collected ERT data and then crossed referenced with intrusive investigations (excavation and drilling) for the data verification. The drilling/excavation results showed that the proper discovering of underground caves can be done if anomaly probability level is not lower than 90 %. Such a probability value was shown to be consistent with the modeling results. More than 30 underground cavities were discovered on the site utilizing the methodology.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis: Preprint
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-12-08
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
Utility green pricing programs: A statistical analysis of program effectiveness
Energy Technology Data Exchange (ETDEWEB)
Wiser, Ryan; Olson, Scott; Bird, Lori; Swezey, Blair
2004-02-01
Development of renewable energy. Such programs have grown in number in recent years. The design features and effectiveness of these programs varies considerably, however, leading a variety of stakeholders to suggest specific marketing and program design features that might improve customer response and renewable energy sales. This report analyzes actual utility green pricing program data to provide further insight into which program features might help maximize both customer participation in green pricing programs and the amount of renewable energy purchased by customers in those programs. Statistical analysis is performed on both the residential and non-residential customer segments. Data comes from information gathered through a questionnaire completed for 66 utility green pricing programs in early 2003. The questionnaire specifically gathered data on residential and non-residential participation, amount of renewable energy sold, program length, the type of renewable supply used, program price/cost premiums, types of consumer research and program evaluation performed, different sign-up options available, program marketing efforts, and ancillary benefits offered to participants.
Statistical analysis of CSP plants by simulating extensive meteorological series
Pavón, Manuel; Fernández, Carlos M.; Silva, Manuel; Moreno, Sara; Guisado, María V.; Bernardos, Ana
2017-06-01
The feasibility analysis of any power plant project needs the estimation of the amount of energy it will be able to deliver to the grid during its lifetime. To achieve this, its feasibility study requires a precise knowledge of the solar resource over a long term period. In Concentrating Solar Power projects (CSP), financing institutions typically requires several statistical probability of exceedance scenarios of the expected electric energy output. Currently, the industry assumes a correlation between probabilities of exceedance of annual Direct Normal Irradiance (DNI) and energy yield. In this work, this assumption is tested by the simulation of the energy yield of CSP plants using as input a 34-year series of measured meteorological parameters and solar irradiance. The results of this work show that, even if some correspondence between the probabilities of exceedance of annual DNI values and energy yields is found, the intra-annual distribution of DNI may significantly affect this correlation. This result highlights the need of standardized procedures for the elaboration of representative DNI time series representative of a given probability of exceedance of annual DNI.
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so an optical power margin can be book kept. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Statistical analysis of the ambiguities in the asteroid period determinations
Butkiewicz-Bąk, M.; Kwiatkowski, T.; Bartczak, P.; Dudziński, G.; Marciniak, A.
2017-09-01
Among asteroids there exist ambiguities in their rotation period determinations. They are due to incomplete coverage of the rotation, noise and/or aliases resulting from gaps between separate lightcurves. To help to remove such uncertainties, basic characteristic of the lightcurves resulting from constraints imposed by the asteroid shapes and geometries of observations should be identified. We simulated light variations of asteroids whose shapes were modelled as Gaussian random spheres, with random orientations of spin vectors and phase angles changed every 5° from 0° to 65°. This produced 1.4 million lightcurves. For each simulated lightcurve, Fourier analysis has been made and the harmonic of the highest amplitude was recorded. From the statistical point of view, all lightcurves observed at phase angles α 0.2 mag, are bimodal. Second most frequently dominating harmonic is the first one, with the 3rd harmonic following right after. For 1 per cent of lightcurves with amplitudes A < 0.1 mag and phase angles α < 40°, 4th harmonic dominates.
Ensemble Solar Forecasting Statistical Quantification and Sensitivity Analysis
Energy Technology Data Exchange (ETDEWEB)
Cheung, WanYin; Zhang, Jie; Florita, Anthony; Hodge, Bri-Mathias; Lu, Siyuan; Hamann, Hendrik F.; Sun, Qian; Lehman, Brad
2015-10-02
Uncertainties associated with solar forecasts present challenges to maintain grid reliability, especially at high solar penetrations. This study aims to quantify the errors associated with the day-ahead solar forecast parameters and the theoretical solar power output for a 51-kW solar power plant in a utility area in the state of Vermont, U.S. Forecasts were generated by three numerical weather prediction (NWP) models, including the Rapid Refresh, the High Resolution Rapid Refresh, and the North American Model, and a machine-learning ensemble model. A photovoltaic (PV) performance model was adopted to calculate theoretical solar power generation using the forecast parameters (e.g., irradiance, cell temperature, and wind speed). Errors of the power outputs were quantified using statistical moments and a suite of metrics, such as the normalized root mean squared error (NRMSE). In addition, the PV model's sensitivity to different forecast parameters was quantified and analyzed. Results showed that the ensemble model yielded forecasts in all parameters with the smallest NRMSE. The NRMSE of solar irradiance forecasts of the ensemble NWP model was reduced by 28.10% compared to the best of the three NWP models. Further, the sensitivity analysis indicated that the errors of the forecasted cell temperature attributed only approximately 0.12% to the NRMSE of the power output as opposed to 7.44% from the forecasted solar irradiance.
Fluorescence correlation spectroscopy: Statistical analysis and biological applications
Saffarian, Saveez
2002-01-01
The experimental design and realization of an apparatus which can be used both for single molecule fluorescence detection and also fluorescence correlation and cross correlation spectroscopy is presented. A thorough statistical analysis of the fluorescence correlation functions including the analysis of bias and errors based on analytical derivations has been carried out. Using the methods developed here, the mechanism of binding and cleavage site recognition of matrix metalloproteinases (MMP) for their substrates has been studied. We demonstrate that two of the MMP family members, Collagenase (MMP-1) and Gelatinase A (MMP-2) exhibit diffusion along their substrates, the importance of this diffusion process and its biological implications are discussed. We show through truncation mutants that the hemopexin domain of the MMP-2 plays and important role in the substrate diffusion of this enzyme. Single molecule diffusion of the collagenase MMP-1 has been observed on collagen fibrils and shown to be biased. The discovered biased diffusion would make the MMP-1 molecule an active motor, thus making it the first active motor that is not coupled to ATP hydrolysis. The possible sources of energy for this enzyme and their implications are discussed. We propose that a possible source of energy for the enzyme can be in the rearrangement of the structure of collagen fibrils. In a separate application, using the methods developed here, we have observed an intermediate in the intestinal fatty acid binding protein folding process through the changes in its hydrodynamic radius also the fluctuations in the structure of the IFABP in solution were measured using FCS.
Statistical analysis and optimization of igbt manufacturing flow
Directory of Open Access Journals (Sweden)
Baranov V. V.
2015-02-01
Full Text Available The use of computer simulation, design and optimization of power electronic devices formation technological processes can significantly reduce development time, improve the accuracy of calculations, choose the best options for implementation based on strict mathematical analysis. One of the most common power electronic devices is isolated gate bipolar transistor (IGBT, which combines the advantages of MOSFET and bipolar transistor. The achievement of high requirements for these devices is only possible by optimizing device design and manufacturing process parameters. Therefore important and necessary step in the modern cycle of IC design and manufacturing is to carry out the statistical analysis. Procedure of the IGBT threshold voltage optimization was realized. Through screening experiments according to the Plackett-Burman design the most important input parameters (factors that have the greatest impact on the output characteristic was detected. The coefficients of the approximation polynomial adequately describing the relationship between the input parameters and investigated output characteristics ware determined. Using the calculated approximation polynomial, a series of multiple, in a cycle of Monte Carlo, calculations to determine the spread of threshold voltage values at selected ranges of input parameters deviation were carried out. Combinations of input process parameters values were determined randomly by a normal distribution within a given range of changes. The procedure of IGBT process parameters optimization consist a mathematical problem of determining the value range of the input significant structural and technological parameters providing the change of the IGBT threshold voltage in a given interval. The presented results demonstrate the effectiveness of the proposed optimization techniques.
Allen, Kirk
The Statistics Concept Inventory (SCI) is a multiple choice test designed to assess students' conceptual understanding of topics typically encountered in an introductory statistics course. This dissertation documents the development of the SCI from Fall 2002 up to Spring 2006. The first phase of the project essentially sought to answer the question: "Can you write a test to assess topics typically encountered in introductory statistics?" Book One presents the results utilized in answering this question in the affirmative. The bulk of the results present the development and evolution of the items, primarily relying on objective metrics to gauge effectiveness but also incorporating student feedback. The second phase boils down to: "Now that you have the test, what else can you do with it?" This includes an exploration of Cronbach's alpha, the most commonly-used measure of test reliability in the literature. An online version of the SCI was designed, and its equivalency to the paper version is assessed. Adding an extra wrinkle to the online SCI, subjects rated their answer confidence. These results show a general positive trend between confidence and correct responses. However, some items buck this trend, revealing potential sources of misunderstandings, with comparisons offered to the extant statistics and probability educational research. The third phase is a re-assessment of the SCI: "Are you sure?" A factor analytic study favored a uni-dimensional structure for the SCI, although maintaining the likelihood of a deeper structure if more items can be written to tap similar topics. A shortened version of the instrument is proposed, demonstrated to be able to maintain a reliability nearly identical to that of the full instrument. Incorporating student feedback and a faculty topics survey, improvements to the items and recommendations for further research are proposed. The state of the concept inventory movement is assessed, to offer a comparison to the work presented
Brandon, Paul R.; Harrison, George M.; Lawton, Brian E.
2013-01-01
When evaluators plan site-randomized experiments, they must conduct the appropriate statistical power analyses. These analyses are most likely to be valid when they are based on data from the jurisdictions in which the studies are to be conducted. In this method note, we provide software code, in the form of a SAS macro, for producing statistical…
Statistics and Analysis of CIAE’s Meteorological Observed Data
Institute of Scientific and Technical Information of China (English)
ZHANG; Liang; CHENG; Wei-ya
2015-01-01
The work analyzes the recent years’meteorological observed data of CIAE site.The suited statistical method is selected for environment condition evaluation.1 Statistical method The data types are stability,wind direction,wind frequency,wind speed,temperature,and
Analysis of room transfer function and reverberant signal statistics
DEFF Research Database (Denmark)
Georganti, Eleftheria; Mourjopoulos, John; Jacobsen, Finn
2008-01-01
smoothing (e.g., as in complex smoothing) with respect to the original RTF statistics. More specifically, the RTF statistics, derived after the complex smoothing calculation, are compared to the original statistics across space inside typical rooms, by varying the source, the receiver position...... and the corresponding ratio of the direct and reverberant signal. In addition, this work examines the statistical quantities for speech and audio signals prior to their reproduction within rooms and when recorded in rooms. Histograms and other statistical distributions are used to compare RTF minima of typical...... “anechoic” and “reverberant” audio speech signals, in order to model the alterations due to room acoustics. The above results are obtained from both in-situ room response measurements and controlled acoustical response simulations....
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
Combined statistical analysis of landslide release and propagation
Mergili, Martin; Rohmaneo, Mohammad; Chu, Hone-Jay
2016-04-01
Statistical methods - often coupled with stochastic concepts - are commonly employed to relate areas affected by landslides with environmental layers, and to estimate spatial landslide probabilities by applying these relationships. However, such methods only concern the release of landslides, disregarding their motion. Conceptual models for mass flow routing are used for estimating landslide travel distances and possible impact areas. Automated approaches combining release and impact probabilities are rare. The present work attempts to fill this gap by a fully automated procedure combining statistical and stochastic elements, building on the open source GRASS GIS software: (1) The landslide inventory is subset into release and deposition zones. (2) We employ a traditional statistical approach to estimate the spatial release probability of landslides. (3) We back-calculate the probability distribution of the angle of reach of the observed landslides, employing the software tool r.randomwalk. One set of random walks is routed downslope from each pixel defined as release area. Each random walk stops when leaving the observed impact area of the landslide. (4) The cumulative probability function (cdf) derived in (3) is used as input to route a set of random walks downslope from each pixel in the study area through the DEM, assigning the probability gained from the cdf to each pixel along the path (impact probability). The impact probability of a pixel is defined as the average impact probability of all sets of random walks impacting a pixel. Further, the average release probabilities of the release pixels of all sets of random walks impacting a given pixel are stored along with the area of the possible release zone. (5) We compute the zonal release probability by increasing the release probability according to the size of the release zone - the larger the zone, the larger the probability that a landslide will originate from at least one pixel within this zone. We
Mascaró, Maite; Sacristán, Ana Isabel; Rufino, Marta M.
2016-01-01
For the past 4 years, we have been involved in a project that aims to enhance the teaching and learning of experimental analysis and statistics, of environmental and biological sciences students, through computational programming activities (using R code). In this project, through an iterative design, we have developed sequences of R-code-based…
Lu, Kung-Wen; Chen, Jung-Chou; Lai, Tung-Yuan; Yang, Jai-Sing; Weng, Shu-Wen; Ma, Yi-Shih; Tang, Nou-Ying; Lu, Pei-Jung; Weng, Jing-Ru; Chung, Jing-Gung
2010-01-01
Gypenosides (Gyp) are the major components of Gynostemma pentaphyllum Makino, a Chinese medical plant. Recently, Gyp has been shown to induce cell cycle arrest and apoptosis in many human cancer cell lines. However, there is no available information to address the effects of Gyp on DNA damage and DNA repair-associated gene expression in human oral cancer cells. Therefore, we investigated whether Gyp induced DNA damage and DNA repair gene expression in human oral cancer SAS cells. The results from flow cytometric assay indicated that Gyp-induced cytotoxic effects led to a decrease in the percentage of viable SAS cells. The results from comet assay revealed that the incubation of SAS cells with Gyp led to a longer DNA migration smear (comet tail) when compared with control and this effect was dose-dependent. The results from real-time PCR analysis indicated that treatment of SAS cells with 180 mug/ml of Gyp for 24 h led to a decrease in 14-3-3sigma, DNA-dependent serine/threonine protein kinase (DNAPK), p53, ataxia telangiectasia mutated (ATM), ataxia-telangiectasia and Rad3-related (ATR) and breast cancer gene 1 (BRCA1) mRNA expression. These observations may explain the cell death caused by Gyp in SAS cells. Taken together, Gyp induced DNA damage and inhibited DNA repair-associated gene expressions in human oral cancer SAS cells in vitro.
Directory of Open Access Journals (Sweden)
Kathy Ahern
2002-09-01
Full Text Available This study investigates triangulation of the findings of a qualitative analysis by applying an exploratory factor analysis to themes identified in a phenomenological study. A questionnaire was developed from a phenomenological analysis of parents' experiences of parenting a child with Developmental Coordination Disorder (DCD. The questionnaire was administered to 114 parents of DCD children and data were analyzed using an exploratory factor analysis. The extracted factors provided support for the validity of the original qualitative analysis, and a commentary on the validity of the process is provided. The emerging description is of the compromises that were necessary to translate qualitative themes into statistical factors, and of the ways in which the statistical analysis suggests further qualitative study.
Probability and Statistics Questions and Tests : a critical analysis
Directory of Open Access Journals (Sweden)
Fabrizio Maturo
2015-06-01
Full Text Available In probability and statistics courses, a popular method for the evaluation of the students is to assess them using multiple choice tests. The use of these tests allows to evaluate certain types of skills such as fast response, short-term memory, mental clarity and ability to compete. In our opinion, the verification through testing can certainly be useful for the analysis of certain aspects, and to speed up the process of assessment, but we should be aware of the limitations of such a standardized procedure and then exclude that the assessments of pupils, classes and schools can be reduced to processing of test results. To prove this thesis, this article argues in detail the main test limits, presents some recent models which have been proposed in the literature and suggests some alternative valuation methods. Quesiti e test di Probabilità e Statistica: un'analisi critica Nei corsi di Probabilità e Statistica, un metodo molto diffuso per la valutazione degli studenti consiste nel sottoporli a quiz a risposta multipla. L'uso di questi test permette di valutare alcuni tipi di abilità come la rapidità di risposta, la memoria a breve termine, la lucidità mentale e l'attitudine a gareggiare. A nostro parere, la verifica attraverso i test può essere sicuramente utile per l'analisi di alcuni aspetti e per velocizzare il percorso di valutazione ma si deve essere consapevoli dei limiti di una tale procedura standardizzata e quindi escludere che le valutazioni di alunni, classi e scuole possano essere ridotte a elaborazioni di risultati di test. A dimostrazione di questa tesi, questo articolo argomenta in dettaglio i limiti principali dei test, presenta alcuni recenti modelli proposti in letteratura e propone alcuni metodi di valutazione alternativi. Parole Chiave: item responce theory, valutazione, test, probabilità
Statistical analysis of simple repeats in the human genome
Piazza, F.; Liò, P.
2005-03-01
The human genome contains repetitive DNA at different level of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich of families of interspersed, mobile elements hundreds of base pairs (bp) long, among which belong the Alu families. A link between homo- and di-polymeric tracts and mobile elements has been recently highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly-significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn affect in a subtle way the combination and diversification of gene expression and the fixation of multigene families.
Emerging Trends and Statistical Analysis in Computational Modeling in Agriculture
Directory of Open Access Journals (Sweden)
Sunil Kumar
2015-03-01
Full Text Available In this paper the authors have tried to describe emerging trend in computational modelling used in the sphere of agriculture. Agricultural computational modelling with the use of intelligence techniques for computing the agricultural output by providing minimum input data to lessen the time through cutting down the multi locational field trials and also the labours and other inputs is getting momentum. Development of locally suitable integrated farming systems (IFS is the utmost need of the day, particularly in India where about 95% farms are under small and marginal holding size. Optimization of the size and number of the various enterprises to the desired IFS model for a particular set of agro-climate is essential components of the research to sustain the agricultural productivity for not only filling the stomach of the bourgeoning population of the country, but also to enhance the nutritional security and farms return for quality life. Review of literature pertaining to emerging trends in computational modelling applied in field of agriculture is done and described below for the purpose of understanding its trends mechanism behavior and its applications. Computational modelling is increasingly effective for designing and analysis of the system. Computa-tional modelling is an important tool to analyses the effect of different scenarios of climate and management options on the farming systems and its interaction among themselves. Further, authors have also highlighted the applications of computational modeling in integrated farming system, crops, weather, soil, climate, horticulture and statistical used in agriculture which can show the path to the agriculture researcher and rural farming community to replace some of the traditional techniques.
A statistical framework for differential network analysis from microarray data
Directory of Open Access Journals (Sweden)
Datta Somnath
2010-02-01
Full Text Available Abstract Background It has been long well known that genes do not act alone; rather groups of genes act in consort during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of following queries: (i whether the overall modular structures of the two networks are different, (ii whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score is provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the
Olive mill wastewater characteristics: modelling and statistical analysis
Directory of Open Access Journals (Sweden)
Martins-Dias, Susete
2004-09-01
Full Text Available A synthesis of the work carried out on Olive Mill Wastewater (OMW characterisation is given, covering articles published over the last 50 years. Data on OMW characterisation found in the literature are summarised and correlations between them and with phenolic compounds content are sought. This permits the characteristics of an OMW to be estimated from one simple measurement: the phenolic compounds concentration. A model based on OMW characterisations accounting 6 countries was developed along with a model for Portuguese OMW. The statistical analysis of the correlations obtained indicates that Chemical Oxygen Demand of a given OMW is a second-degree polynomial function of its phenolic compounds concentration. Tests to evaluate the regressions significance were carried out, based on multivariable ANOVA analysis, on visual standardised residuals distribution and their means for confidence levels of 95 and 99 %, validating clearly these models. This modelling work will help in the future planning, operation and monitoring of an OMW treatment plant.Presentamos una síntesis de los trabajos realizados en los últimos 50 años relacionados con la caracterización del alpechín. Realizamos una recopilación de los datos publicados, buscando correlaciones entre los datos relativos al alpechín y los compuestos fenólicos. Esto permite la determinación de las características del alpechín a partir de una sola medida: La concentración de compuestos fenólicos. Proponemos dos modelos, uno basado en datos relativos a seis países y un segundo aplicado únicamente a Portugal. El análisis estadístico de las correlaciones obtenidas indica que la demanda química de oxígeno de un determinado alpechín es una función polinómica de segundo grado de su concentración de compuestos fenólicos. Se comprobó la significancia de esta correlación mediante la aplicación del análisis multivariable ANOVA, y además se evaluó la distribución de residuos y sus
Statistical Design, Models and Analysis for the Job Change Framework.
Gleser, Leon Jay
1990-01-01
Proposes statistical methodology for testing Loughead and Black's "job change thermostat." Discusses choice of target population; relationship between job satisfaction and values, perceptions, and opportunities; and determinants of job change. (SK)
Detailed statistical analysis plan for the pulmonary protection trial
DEFF Research Database (Denmark)
Buggeskov, Katrine B; Jakobsen, Janus C; Secher, Niels H
2014-01-01
BACKGROUND: Pulmonary dysfunction complicates cardiac surgery that includes cardiopulmonary bypass. The pulmonary protection trial evaluates effect of pulmonary perfusion on pulmonary function in patients suffering from chronic obstructive pulmonary disease. This paper presents the statistical plan...
STATISTICAL ANALYSIS OF SOME EXPERIMENTAL FATIGUE TESTS RESULTS
Adrian Stere PARIS; Gheorghe AMZA; Claudiu BABIŞ; Dan Niţoi
2012-01-01
The paper details the results of processing the fatigue data experiments to find the regression function. Application software for statistical processing like ANOVA and regression calculi are properly utilized, with emphasis on popular software like MSExcel and CurveExpert
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had sample size less than 10, with the median and the mode being 6 and 3 & 6, respectively. Mean (105/113, 93%) was dominantly used to measure central tendency, and standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies provide justifications regarding why the methods being selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being most popular (52/93, 56%), yet few studies conducted either normality or equal variance test. These results suggest that more consistent and appropriate use of statistical method is necessary which may enhance the role of toxicology in public health.
Algebraic Monte Carlo precedure reduces statistical analysis time and cost factors
Africano, R. C.; Logsdon, T. S.
1967-01-01
Algebraic Monte Carlo procedure statistically analyzes performance parameters in large, complex systems. The individual effects of input variables can be isolated and individual input statistics can be changed without having to repeat the entire analysis.
A SAS Package for Logistic Two-Phase Studies
Directory of Open Access Journals (Sweden)
Walter Schill
2014-04-01
Full Text Available Two-phase designs, in which for a large study a dichotomous outcome and partial or proxy information on risk factors is available, whereas precise or complete measurements on covariates have been obtained only in a stratified sub-sample, extend the standard case-control design and have been proven useful in practice. The application of two-phase designs, however, seems to be hampered by the lack of appropriate, easy-to-use software. This paper introduces sas-twophase-package, a collection of SAS-macros, to fulfill this task. sas-twophase-package implements weighted likelihood, pseudo likelihood and semi- parametric maximum likelihood estimation via the EM algorithm and via profile likelihood in two-phase settings with dichotomous outcome and a given stratification.
Social Anxiety Scale for Adolescents (SAS-A) Short Form.
Nelemans, Stefanie A; Meeus, Wim H J; Branje, Susan J T; Van Leeuwen, Karla; Colpin, Hilde; Verschueren, Karine; Goossens, Luc
2017-01-01
In this study, we examined the longitudinal measurement invariance of a 12-item short version of the Social Anxiety Scale for Adolescents (SAS-A) in two 4-year longitudinal community samples ( Nsample 1 = 815, Mage T1 = 13.38 years; Nsample 2 = 551, Mage T1 = 14.82 years). Using confirmatory factor analyses, we found strict longitudinal measurement invariance for the three-factor structure of the SAS-A across adolescence, across samples, and across gender. Some developmental changes in social anxiety were found from early to mid-adolescence, as well as gender differences across adolescence. These findings suggest that the short version of the SAS-A is a developmentally appropriate instrument that can be used effectively to examine adolescent social anxiety development.
Statistics and data analysis for financial engineering with R examples
Ruppert, David
2015-01-01
The new edition of this influential textbook, geared towards graduate or advanced undergraduate students, teaches the statistics necessary for financial engineering. In doing so, it illustrates concepts using financial markets and economic data, R Labs with real-data exercises, and graphical and analytic methods for modeling and diagnosing modeling errors. Financial engineers now have access to enormous quantities of data. To make use of these data, the powerful methods in this book, particularly about volatility and risks, are essential. Strengths of this fully-revised edition include major additions to the R code and the advanced topics covered. Individual chapters cover, among other topics, multivariate distributions, copulas, Bayesian computations, risk management, multivariate volatility and cointegration. Suggested prerequisites are basic knowledge of statistics and probability, matrices and linear algebra, and calculus. There is an appendix on probability, statistics and linear algebra. Practicing fina...
Statistical mechanics analysis of thresholding 1-bit compressed sensing
Xu, Yingying
2016-01-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal by only using the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the L2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter to the quantization process was advanced. In this paper, we analyze the typical behavior of the thresholding 1-bit compressed sensing utilizing the replica method of statistical mechanics, so as to gain an insight for properly setting the threshold value. Our result shows that, fixing the threshold at a constant value yields better performance than varying it randomly when the constant is optimally tuned, statistically. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. In order to handle this inconvenience, we ...
Sensitivity Analysis and Statistical Convergence of a Saltating Particle Model
Maldonado, S
2016-01-01
Saltation models provide considerable insight into near-bed sediment transport. This paper outlines a simple, efficient numerical model of stochastic saltation, which is validated against previously published experimental data on saltation in a channel of nearly horizontal bed. Convergence tests are systematically applied to ensure the model is free from statistical errors emanating from the number of particle hops considered. Two criteria for statistical convergence are derived; according to the first criterion, at least $10^3$ hops appear to be necessary for convergent results, whereas $10^4$ saltations seem to be the minimum required in order to achieve statistical convergence in accordance with the second criterion. Two empirical formulae for lift force are considered: one dependent on the slip (relative) velocity of the particle multiplied by the vertical gradient of the horizontal flow velocity component; the other dependent on the difference between the squares of the slip velocity components at the to...
Analysis of Alignment Influence on 3-D Anthropometric Statistics
Institute of Scientific and Technical Information of China (English)
CAI Xiuwen; LI Zhizhong; CHANG Chien-Chi; DEMPSEY Patrick
2005-01-01
Three-dimensional (3-D) surface anthropometry can provide much more useful information for many applications such as ergonomic product design than traditional individual body dimension measurements. However, the traditional definition of the percentile calculation is designed only for 1-D anthropometric data estimates. The same approach cannot be applied directly to 3-D anthropometric statistics otherwise it could lead to misinterpretations. In this paper, the influence of alignment references on 3-D anthropometric statistics is analyzed mathematically, which shows that different alignment reference points (for example, landmarks) for translation alignment could result in different object shapes if 3-D anthropometric data are processed for percentile values based on coordinates and that dimension percentile calculations based on coordinate statistics are incompatible with those traditionally based on individual dimensions.
Statistical analysis of natural disasters and related losses
Pisarenko, VF
2014-01-01
The study of disaster statistics and disaster occurrence is a complicated interdisciplinary field involving the interplay of new theoretical findings from several scientific fields like mathematics, physics, and computer science. Statistical studies on the mode of occurrence of natural disasters largely rely on fundamental findings in the statistics of rare events, which were derived in the 20th century. With regard to natural disasters, it is not so much the fact that the importance of this problem for mankind was recognized during the last third of the 20th century - the myths one encounters in ancient civilizations show that the problem of disasters has always been recognized - rather, it is the fact that mankind now possesses the necessary theoretical and practical tools to effectively study natural disasters, which in turn supports effective, major practical measures to minimize their impact. All the above factors have resulted in considerable progress in natural disaster research. Substantial accrued ma...
Analysis of linear weighted order statistics CFAR algorithm
Institute of Scientific and Technical Information of China (English)
孟祥伟; 关键; 何友
2004-01-01
CFAR technique is widely used in radar targets detection fields. Traditional algorithm is cell averaging (CA),which can give a good detection performance in a relatively ideal environment. Recently, censoring technique is adopted to make the detector perform robustly. Ordered statistic (OS) and trimmed mean (TM) methods are proposed. TM methods treat the reference samples which participate in clutter power estimates equally, but this processing will not realize the effective estimates of clutter power. Therefore, in this paper a quasi best weighted (Q BW) order statistics algorithm is presented. In special cases, QBW reduces to CA and the censored mean level detector (CMLD).
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Want independent validation and assurance? Ask for a SAS-70.
Boutin, Christopher C
2008-08-01
The AICPA's Statement on Auditing Standards No.70, Service Organizations addresses CPA audits of service providers conducted to verify that a provider has adequate controls over its operations. Hospitals should request a SAS-70, the report produced by such an audit, from all of their third-party service providers. SAS-70s can be issued for a specific date or for a six-month period, and they typically consist of three sections: a CPA opinion, a description of controls, and information about the design of the controls.
Observations of A0620-00 by SAS-3
Bradt, H.; Matilsky, T.
1976-01-01
The transient X-ray source A0620-00 was observed by the SAS-3 group with the SAS-3 X-ray observatory. At maximum X-ray luminosity limits of 2% were placed on periodic variations from 0.2 ms - 2,000 sec. A precise position was obtained with the rotating modulation collimator. This led directly to radio and optical identification by groups at the NRAO, Arecibo, and McGraw Hill Observatories. The low energy (0.15-0.9 keV) system was pointed at the source, and a spectrum was derived. Hardness ratios are presented, as well as detailed light curves.
Bayesian statistical analysis of censored data in geotechnical engineering
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager; Tarp-Johansen, Niels Jacob; Denver, Hans
2000-01-01
The geotechnical engineer is often faced with the problem ofhow to assess the statistical properties of a soil parameter on the basis ofa sample measured in-situ or in the laboratory with the defect that somevalues have been replaced by interval bounds because the corresponding soilparameter values...
Statistical analysis of lightning electric field measured under Malaysian condition
Salimi, Behnam; Mehranzamir, Kamyar; Abdul-Malek, Zulkurnain
2014-02-01
Lightning is an electrical discharge during thunderstorms that can be either within clouds (Inter-Cloud), or between clouds and ground (Cloud-Ground). The Lightning characteristics and their statistical information are the foundation for the design of lightning protection system as well as for the calculation of lightning radiated fields. Nowadays, there are various techniques to detect lightning signals and to determine various parameters produced by a lightning flash. Each technique provides its own claimed performances. In this paper, the characteristics of captured broadband electric fields generated by cloud-to-ground lightning discharges in South of Malaysia are analyzed. A total of 130 cloud-to-ground lightning flashes from 3 separate thunderstorm events (each event lasts for about 4-5 hours) were examined. Statistical analyses of the following signal parameters were presented: preliminary breakdown pulse train time duration, time interval between preliminary breakdowns and return stroke, multiplicity of stroke, and percentages of single stroke only. The BIL model is also introduced to characterize the lightning signature patterns. Observations on the statistical analyses show that about 79% of lightning signals fit well with the BIL model. The maximum and minimum of preliminary breakdown time duration of the observed lightning signals are 84 ms and 560 us, respectively. The findings of the statistical results show that 7.6% of the flashes were single stroke flashes, and the maximum number of strokes recorded was 14 multiple strokes per flash. A preliminary breakdown signature in more than 95% of the flashes can be identified.
Statistical Lineament Analysis in South Greenland Based on Landsat Imagery
DEFF Research Database (Denmark)
Conradsen, Knut; Nilsson, Gert; Thyrsted, Tage
1986-01-01
Linear features, mapped visually from MSS channel-7 photoprints (1: 1 000 000) of Landsat images from South Greenland, were digitized and analyzed statistically. A sinusoidal curve was fitted to the frequency distribution which was then divided into ten significant classes of azimuthal trends. Maps...
Did Tanzania Achieve the Second Millennium Development Goal? Statistical Analysis
Magoti, Edwin
2016-01-01
Development Goal "Achieve universal primary education", the challenges faced, along with the way forward towards achieving the fourth Sustainable Development Goal "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all". Statistics show that Tanzania has made very promising steps…
STATISTICAL ANALYSIS OF SOME EXPERIMENTAL FATIGUE TESTS RESULTS
Directory of Open Access Journals (Sweden)
Adrian Stere PARIS
2012-05-01
Full Text Available The paper details the results of processing the fatigue data experiments to find the regression function. Application software for statistical processing like ANOVA and regression calculi are properly utilized, with emphasis on popular software like MSExcel and CurveExpert
Statistical simulation and counterfactual analysis in social sciences
Directory of Open Access Journals (Sweden)
François Gélineau
2012-06-01
Full Text Available In this paper, we present statistical simulation techniques of interest in substantial interpretation of regression results. Taking stock of recent literature on causality, we argue that such techniques can operate within a counterfactual framework. To illustrate, we report findings using post-electoral data on voter turnout.
Statistical Analysis of Human Reliability of Armored Equipment
Institute of Scientific and Technical Information of China (English)
LIU Wei-ping; CAO Wei-guo; REN Jing
2007-01-01
Human errors of seven types of armored equipment, which occur during the course of field test, are statistically analyzed. The human error-to-armored equipment failure ratio is obtained. The causes of human errors are analyzed. The distribution law of human errors is acquired. The ratio of human errors and human reliability index are also calculated.
Traitement des données par le logiciel SAS: introduction au module de base
Carletti, Isabelle; Prevot, Hugues
2006-01-01
Cette note constitue une introduction au module de base SAS. Après un bref aperçu de l'environnement de travail SAS et des principes du langage SAS, les étapes DATA et PROC d'un programme SAS sont exposées en plus en détails.
Bapanpally, Chandra; Maganty, Gayatri; Khan, Shah; Kasra, Akif; Morey, Amit
2014-01-01
The SAS Molecular Tests method for the detection of E. coli O157 in various food matrixes has been certified by the AOAC Research Institute and designated Performance Tested Method No. 031203. The current method modification includes the optional immunomagnetic separation (IMS) to enrich the bacteria as well as optional visual fluorescence readout without the use of a turbidimeter. The following study was conducted to validate the proposed modifications against the U.S. Department of Agriculture/Food Safety and Inspection Service Microbiology Laboratory Guidebook (MLG) and the U.S. Food and Drug Administration Bacteriological Analytical Manual (BAM) reference methods. E. coli O157:H7 ATCC 35150 and NM (non-motile) 700377 strains were used to inoculate the ground beef, beef trim, and spinach to obtain 20 low (0.23-2 CFU/test portion) and five high levels (3-5 CFU/test portion) of inoculations. Enriched samples were tested directly and subjected to anti-E. coli IMS prior to the SAS Molecular Tests. Results were determined via visual fluorescence and via turbidity using a turbidimeter. All the replicates, irrespective of the results, were confirmed using MLG 5.05 or BAM Chapter 4A methods. Results indicated that there were no significant differences in the detection of fractional positives (5-15 positives out of 20 replicates) with any of the methods tested above as compared to the reference methods. No false positives or negatives were detected except for two in the ground beef with IMS+Turbidity method. No false-negative samples were detected. Statistical analysis indicated that the modified methods were equivalent to the reference methods in detecting E. coli O157:H7 in the food matrixes tested. SAS Molecular Tests E. coli O157 Detection Kit can be used to detect E. coli O157:H7 in ground beef, beef trim, and spinach. The inclusion of IMS in the modified method improved the detection rate of E. coli O157:H7 in spinach and showed comparable detection rate in ground
Interfaces between statistical analysis packages and the ESRI geographic information system
Masuoka, E.
1980-01-01
Interfaces between ESRI's geographic information system (GIS) data files and real valued data files written to facilitate statistical analysis and display of spatially referenced multivariable data are described. An example of data analysis which utilized the GIS and the statistical analysis system is presented to illustrate the utility of combining the analytic capability of a statistical package with the data management and display features of the GIS.
Quartiles in Elementary Statistics
Langford, Eric
2006-01-01
The calculation of the upper and lower quartile values of a data set in an elementary statistics course is done in at least a dozen different ways, depending on the text or computer/calculator package being used (such as SAS, JMP, MINITAB, "Excel," and the TI-83 Plus). In this paper, we examine the various methods and offer a suggestion for a new…
Statistical traffic modeling of MPEG frame size: Experiments and Analysis
Directory of Open Access Journals (Sweden)
Haniph A. Latchman
2009-12-01
Full Text Available For guaranteed quality of service (QoS and sufficient bandwidth in a communication network which provides an integrated multimedia service, it is important to obtain an analytical and tractable model of the compressed MPEG data. This paper presents a statistical approach to a group of picture (GOP MPEG frame size model to increase network traffic performance in a communication network. We extract MPEG frame data from commercial DVD movies and make probability histograms to analyze the statistical characteristics of MPEG frame data. Six candidates of probability distributions are considered here and their parameters are obtained from the empirical data using the maximum likelihood estimation (MLE. This paper shows that the lognormal distribution is the best fitting model of MPEG-2 total frame data.
PERFORMANCE ANALYSIS OF SECOND-ORDER STATISTICS FOR CYCLOSTATIONARY SIGNALS
Institute of Scientific and Technical Information of China (English)
姜鸣; 陈进
2002-01-01
The second-order statistics for cyclostationary signals were introduced, and their performance were discussed. It especially researched the time lag characteristic of the cyclic autocorrelation function and spectral correlation characteristic of spectral correlation density function. It was pointed out that those functions can be available to extract the time-vary information of the kind of non-stationary signals. Using the relations of time lag-cyclic frequency and frequency-cyclic frequency independently, vibration signals of a rolling element bearing measured on test bed were analyzed. The results indicate that the second-order cyclostationary statistics might provide a powerful tool for the feature extracting and fault diagnosis of rolling element bearing.
A Statistical Framework for the Functional Analysis of Metagenomes
Energy Technology Data Exchange (ETDEWEB)
Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.
2008-10-01
Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
STATISTICAL ANALYSYS OF THE SCFE OF A BRAZILAN MINERAL COAL
Directory of Open Access Journals (Sweden)
DARIVA Cláudio
1997-01-01
Full Text Available The influence of some process variables on the productivity of the fractions (liquid yield times fraction percent obtained from SCFE of a Brazilian mineral coal using isopropanol and ethanol as primary solvents is analyzed using statistical techniques. A full factorial 23 experimental design was adopted to investigate the effects of process variables (temperature, pressure and cosolvent concentration on the extraction products. The extracts were analyzed by the Preparative Liquid Chromatography-8 fractions method (PLC-8, a reliable, non destructive solvent fractionation method, especially developed for coal-derived liquids. Empirical statistical modeling was carried out in order to reproduce the experimental data. Correlations obtained were always greater than 0.98. Four specific process criteria were used to allow process optimization. Results obtained show that it is not possible to maximize both extract productivity and purity (through the minimization of heavy fraction content simultaneously by manipulating the mentioned process variables.
Statistical approaches for the analysis of DNA methylation microarray data.
Siegmund, Kimberly D
2011-06-01
Following the rapid development and adoption in DNA methylation microarray assays, we are now experiencing a growth in the number of statistical tools to analyze the resulting large-scale data sets. As is the case for other microarray applications, biases caused by technical issues are of concern. Some of these issues are old (e.g., two-color dye bias and probe- and array-specific effects), while others are new (e.g., fragment length bias and bisulfite conversion efficiency). Here, I highlight characteristics of DNA methylation that suggest standard statistical tools developed for other data types may not be directly suitable. I then describe the microarray technologies most commonly in use, along with the methods used for preprocessing and obtaining a summary measure. I finish with a section describing downstream analyses of the data, focusing on methods that model percentage DNA methylation as the outcome, and methods for integrating DNA methylation with gene expression or genotype data.
Ambiguity and nonidentifiability in the statistical analysis of neural codes
Amarasingham, Asohan; Geman, Stuart; Harrison, Matthew T.
2015-01-01
Many experimental studies of neural coding rely on a statistical interpretation of the theoretical notion of the rate at which a neuron fires spikes. For example, neuroscientists often ask, “Does a population of neurons exhibit more synchronous spiking than one would expect from the covariability of their instantaneous firing rates?” For another example, “How much of a neuron’s observed spiking variability is caused by the variability of its instantaneous firing rate, and how much is caused by spike timing variability?” However, a neuron’s theoretical firing rate is not necessarily well-defined. Consequently, neuroscientific questions involving the theoretical firing rate do not have a meaning in isolation but can only be interpreted in light of additional statistical modeling choices. Ignoring this ambiguity can lead to inconsistent reasoning or wayward conclusions. We illustrate these issues with examples drawn from the neural-coding literature. PMID:25934918
Statistical analysis of motion contrast in optical coherence tomography angiography
Cheng, Yuxuan; Pan, Cong; Lu, Tongtong; Hong, Tianyu; Ding, Zhihua; Li, Peng
2015-01-01
Optical coherence tomography angiography (Angio-OCT), mainly based on the temporal dynamics of OCT scattering signals, has found a range of potential applications in clinical and scientific researches. In this work, based on the model of random phasor sums, temporal statistics of the complex-valued OCT signals are mathematically described. Statistical distributions of the amplitude differential (AD) and complex differential (CD) Angio-OCT signals are derived. The theories are validated through the flow phantom and live animal experiments. Using the model developed in this work, the origin of the motion contrast in Angio-OCT is mathematically explained, and the implications in the improvement of motion contrast are further discussed, including threshold determination and its residual classification error, averaging method, and scanning protocol. The proposed mathematical model of Angio-OCT signals can aid in the optimal design of the system and associated algorithms.
Common misconceptions about data analysis and statistics1
Motulsky, Harvey J
2015-01-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word “significant”. (4) Overreliance on standard errors, which are often misunderstood. PMID:25692012
Statistics Analysis: FDI Obsorbing in Jan-Jun
Institute of Scientific and Technical Information of China (English)
Jovi Shu
2008-01-01
@@ According to China's Customs' statistics,from January to May 2008,the import and export volume of foreign-invested enterprises totaled US＄563.612 billion,an increase of 21.28 percent over the same period of last year,5.02 percent lower than the growth rate of the country(26.30 percent)in the same period,accounting for 55.69 percent of the total import and export of the country.(See Chart 1)
Statistical Mechanical Analysis of Compressed Sensing Utilizing Correlated Compression Matrix
Takeda, Koujin
2010-01-01
We investigate a reconstruction limit of compressed sensing for a reconstruction scheme based on the L1-norm minimization utilizing a correlated compression matrix with a statistical mechanics method. We focus on the compression matrix modeled as the Kronecker-type random matrix studied in research on multi-input multi-output wireless communication systems. We found that strong one-dimensional correlations between expansion bases of original information slightly degrade reconstruction performance.
Statistical Mechanics Analysis of LDPC Coding in MIMO Gaussian Channels
Alamino, Roberto C.; Saad, David
2007-01-01
Using analytical methods of statistical mechanics, we analyse the typical behaviour of a multiple-input multiple-output (MIMO) Gaussian channel with binary inputs under LDPC network coding and joint decoding. The saddle point equations for the replica symmetric solution are found in particular realizations of this channel, including a small and large number of transmitters and receivers. In particular, we examine the cases of a single transmitter, a single receiver and the symmetric and asymm...
Learning to Translate: A Statistical and Computational Analysis
Directory of Open Access Journals (Sweden)
Marco Turchi
2012-01-01
Full Text Available We present an extensive experimental study of Phrase-based Statistical Machine Translation, from the point of view of its learning capabilities. Very accurate Learning Curves are obtained, using high-performance computing, and extrapolations of the projected performance of the system under different conditions are provided. Our experiments confirm existing and mostly unpublished beliefs about the learning capabilities of statistical machine translation systems. We also provide insight into the way statistical machine translation learns from data, including the respective influence of translation and language models, the impact of phrase length on performance, and various unlearning and perturbation analyses. Our results support and illustrate the fact that performance improves by a constant amount for each doubling of the data, across different language pairs, and different systems. This fundamental limitation seems to be a direct consequence of Zipf law governing textual data. Although the rate of improvement may depend on both the data and the estimation method, it is unlikely that the general shape of the learning curve will change without major changes in the modeling and inference phases. Possible research directions that address this issue include the integration of linguistic rules or the development of active learning procedures.
The Digital Divide in Romania – A Statistical Analysis
Directory of Open Access Journals (Sweden)
Daniela BORISOV
2012-06-01
Full Text Available The digital divide is a subject of major importance in the current economic circumstances in which Information and Communication Technologies (ICT are seen as a significant determinant of increasing the domestic competitiveness and contribute to better life quality. Latest international reports regarding various aspects of ICT usage in modern society reveal a decrease of overall digital disparity towards the average trends of the worldwide ITC’s sector – this relates to latest advances of mobile and computer penetration rates, both for personal use and for households/ business. In Romania, the low starting point in the development of economy and society in the ICT direction was, in some extent, compensated by the rapid annual growth of the last decade. Even with these dynamic developments, the statistical data still indicate poor positions in European Union hierarchy; in this respect, the prospects of a rapid recovery of the low performance of the Romanian ICT endowment and usage and the issue continue to be regarded as a challenge for progress in economic and societal terms. The paper presents several methods for assessing the current state of ICT related aspects in terms of Internet usage based on the latest data provided by international databases. The current position of Romanian economy is judged according to several economy using statistical methods based on variability measurements: the descriptive statistics indicators, static measures of disparities and distance metrics.
Statistical mechanics analysis of thresholding 1-bit compressed sensing
Xu, Yingying; Kabashima, Yoshiyuki
2016-08-01
The one-bit compressed sensing framework aims to reconstruct a sparse signal by only using the sign information of its linear measurements. To compensate for the loss of scale information, past studies in the area have proposed recovering the signal by imposing an additional constraint on the l 2-norm of the signal. Recently, an alternative strategy that captures scale information by introducing a threshold parameter to the quantization process was advanced. In this paper, we analyze the typical behavior of thresholding 1-bit compressed sensing utilizing the replica method of statistical mechanics, so as to gain an insight for properly setting the threshold value. Our result shows that fixing the threshold at a constant value yields better performance than varying it randomly when the constant is optimally tuned, statistically. Unfortunately, the optimal threshold value depends on the statistical properties of the target signal, which may not be known in advance. In order to handle this inconvenience, we develop a heuristic that adaptively tunes the threshold parameter based on the frequency of positive (or negative) values in the binary outputs. Numerical experiments show that the heuristic exhibits satisfactory performance while incurring low computational cost.
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
Using multivariate statistical analysis to assess changes in water ...
African Journals Online (AJOL)
analysis (CCA) showed that the environmental variables used in the analysis, discharge and month of ... International studies with regard to impacts on aquatic systems .... frequently used to assess for the impact of acidic deposition on.
SAS Radisson esitab Tallinnale väljakutse / Tõnis Arnover
Arnover, Tõnis, 1952-
2000-01-01
Tallinna Radisson SAS hotelli juht T. Bodin on kindel, et suudab pakkuda tipphotellile kohast teenindus. Vt. samas Mägus, Feliks. Hinnasõda hotellide vahel jääb ära. Ilmunud ka: Delovõje Vedomosti 1. nov. lk. 20-21
Tubulin nucleotide status controls Sas-4-dependent pericentriolar material recruitment.
Gopalakrishnan, Jayachandran; Chim, Yiu-Cheung Frederick; Ha, Andrew; Basiri, Marcus L; Lerit, Dorothy A; Rusan, Nasser M; Avidor-Reiss, Tomer
2012-08-01
Regulated centrosome biogenesis is required for accurate cell division and for maintaining genome integrity. Centrosomes consist of a centriole pair surrounded by a protein network known as pericentriolar material (PCM). PCM assembly is a tightly regulated, critical step that determines the size and capability of centrosomes. Here, we report a role for tubulin in regulating PCM recruitment through the conserved centrosomal protein Sas-4. Tubulin directly binds to Sas-4; together they are components of cytoplasmic complexes of centrosomal proteins. A Sas-4 mutant, which cannot bind tubulin, enhances centrosomal protein complex formation and has abnormally large centrosomes with excessive activity. These results suggest that tubulin negatively regulates PCM recruitment. Whereas tubulin-GTP prevents Sas-4 from forming protein complexes, tubulin-GDP promotes it. Thus, the regulation of PCM recruitment by tubulin depends on its GTP/GDP-bound state. These results identify a role for tubulin in regulating PCM recruitment independent of its well-known role as a building block of microtubules. On the basis of its guanine-bound state, tubulin can act as a molecular switch in PCM recruitment.
Using SAS PROC MCMC for Item Response Theory Models
Ames, Allison J.; Samonte, Kelli
2015-01-01
Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…
Mazumdar, Madhu; Banerjee, Samprit; Van Epps, Heather L
2010-01-01
A majority of original articles published in biomedical journals include some form of statistical analysis. Unfortunately, many of the articles contain errors in statistical design and/or analysis. These errors are worrisome, as the misuse of statistics jeopardizes the process of scientific discovery and the accumulation of scientific knowledge. To help avoid these errors and improve statistical reporting, four approaches are suggested: (1) development of guidelines for statistical reporting that could be adopted by all journals, (2) improvement in statistics curricula in biomedical research programs with an emphasis on hands-on teaching by biostatisticians, (3) expansion and enhancement of biomedical science curricula in statistics programs, and (4) increased participation of biostatisticians in the peer review process along with the adoption of more rigorous journal editorial policies regarding statistics. In this chapter, we provide an overview of these issues with emphasis to the field of molecular biology and highlight the need for continuing efforts on all fronts.
Kotula, Paul G; Keenan, Michael R
2006-12-01
Multivariate statistical analysis methods have been applied to scanning transmission electron microscopy (STEM) energy-dispersive X-ray spectral images. The particular application of the multivariate curve resolution (MCR) technique provides a high spectral contrast view of the raw spectral image. The power of this approach is demonstrated with a microelectronics failure analysis. Specifically, an unexpected component describing a chemical contaminant was found, as well as a component consistent with a foil thickness change associated with the focused ion beam specimen preparation process. The MCR solution is compared with a conventional analysis of the same spectral image data set.
A COMPARATIVE STATISTICAL ANALYSIS OF RICE CULTIVARS DATA
Directory of Open Access Journals (Sweden)
Mugemangango Cyprien
2012-12-01
Full Text Available In this paper, rice cultivars data have been analysed by three different statisticaltechniques viz. Split-plot analysis in RBD, two-factor factorial analysis in RBD and analysis oftwo-way classified data with several observations per cell. The powers of the tests under differentmethods of analysis have been calculated. The method of two-way classified data with severalobservations per cell is found better followed by two-factor factorial technique in RBD and splitplot analysis for analyzing the given data.
Research on the integrative strategy of spatial statistical analysis of GIS
Xie, Zhong; Han, Qi Juan; Wu, Liang
2008-12-01
Presently, the spacial social and natural phenomenon is studied by both the GIS technique and statistics methods. However, plenty of complex practical applications restrict these research methods. The data models and technologies exploited are full of special localization. This paper firstly sums up the requirement of spacial statistical analysis. On the base of the requirement, the universal spatial statistical models are transformed into the function tools in statistical GIS system. A pyramidal structure of three layers is brought forward. Therefore, it is feasible to combine the techniques of spacial dada management, searches and visualization in GIS with the methods of processing data in the statistic analysis. It will form an integrative statistical GIS environment with the management, analysis, application and assistant decision-making of spacial statistical information.
Statistical analysis of questionnaires a unified approach based on R and Stata
Bartolucci, Francesco; Gnaldi, Michela
2015-01-01
Statistical Analysis of Questionnaires: A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields. It is suitable for graduate students in applied statistics and psychometrics and practitioners in education, health, and marketing.The book covers the foundations of classical test theory (CTT), test reliability, va
JAWS data collection, analysis highlights, and microburst statistics
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project initiated in relation to the microburst phenomenon are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Symbolic Data Analysis Conceptual Statistics and Data Mining
Billard, Lynne
2012-01-01
With the advent of computers, very large datasets have become routine. Standard statistical methods don't have the power or flexibility to analyse these efficiently, and extract the required knowledge. An alternative approach is to summarize a large dataset in such a way that the resulting summary dataset is of a manageable size and yet retains as much of the knowledge in the original dataset as possible. One consequence of this is that the data may no longer be formatted as single values, but be represented by lists, intervals, distributions, etc. The summarized data have their own internal s
Statistical analysis of the metrological properties of float glass
Yates, Brian W.; Duffy, Alan M.
2008-08-01
The radius of curvature, slope error, surface roughness and associated height distribution and power spectral density of uncoated commercial float glass samples have been measured in our Canadian Light Source Optical Metrology Facility, using our Micromap-570 surface profiler and long trace profilometer. The statistical differences in these parameters have been investigated between the tin and air sides of float glass. The effect of soaking the float glass in sulfuric acid to try to dissolve the tin contamination has also been investigated, and untreated and post-treatment surface roughness measurements compared. We report the results of our studies on these float glass samples.
Introduction to statistical data analysis for the life sciences
Ekstrom, Claus Thorn
2014-01-01
This text provides a computational toolbox that enables students to analyze real datasets and gain the confidence and skills to undertake more sophisticated analyses. Although accessible with any statistical software, the text encourages a reliance on R. For those new to R, an introduction to the software is available in an appendix. The book also includes end-of-chapter exercises as well as an entire chapter of case exercises that help students apply their knowledge to larger datasets and learn more about approaches specific to the life sciences.