Sensitivity analysis and related analysis : A survey of statistical techniques
Kleijnen, J.P.C.
1995-01-01
This paper reviews the state of the art in five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main question is: when should which type of analysis be applied; which statistical
Metrology Optical Power Budgeting in SIM Using Statistical Analysis Techniques
Kuan, Gary M
2008-01-01
The Space Interferometry Mission (SIM) is a space-based stellar interferometry instrument, consisting of up to three interferometers, which will be capable of micro-arc second resolution. Alignment knowledge of the three interferometer baselines requires a three-dimensional, 14-leg truss with each leg being monitored by an external metrology gauge. In addition, each of the three interferometers requires an internal metrology gauge to monitor the optical path length differences between the two sides. Both external and internal metrology gauges are interferometry based, operating at a wavelength of 1319 nanometers. Each gauge has fiber inputs delivering measurement and local oscillator (LO) power, split into probe-LO and reference-LO beam pairs. These beams experience power loss due to a variety of mechanisms including, but not restricted to, design efficiency, material attenuation, element misalignment, diffraction, and coupling efficiency. Since the attenuation due to these sources may degrade over time, an accounting of the range of expected attenuation is needed so an optical power margin can be book kept. A method of statistical optical power analysis and budgeting, based on a technique developed for deep space RF telecommunications, is described in this paper and provides a numerical confidence level for having sufficient optical power relative to mission metrology performance requirements.
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION
Directory of Open Access Journals (Sweden)
А. А. Vershinina
2014-01-01
Full Text Available The technique of the statistical analysis of investment appeal of the region is given in scientific article for direct foreign investments. Definition of a technique of the statistical analysis is given, analysis stages reveal, the mathematico-statistical tools are considered.
Stalked protozoa identification by image analysis and multivariable statistical techniques.
Amaral, A L; Ginoris, Y P; Nicolau, A; Coelho, M A Z; Ferreira, E C
2008-06-01
Protozoa are considered good indicators of the treatment quality in activated sludge systems as they are sensitive to physical, chemical and operational processes. Therefore, it is possible to correlate the predominance of certain species or groups and several operational parameters of the plant. This work presents a semiautomatic image analysis procedure for the recognition of the stalked protozoa species most frequently found in wastewater treatment plants by determining the geometrical, morphological and signature data and subsequent processing by discriminant analysis and neural network techniques. Geometrical descriptors were found to be responsible for the best identification ability and the identification of the crucial Opercularia and Vorticella microstoma microorganisms provided some degree of confidence to establish their presence in wastewater treatment plants.
A COMPARISON OF SOME STATISTICAL TECHNIQUES FOR ROAD ACCIDENT ANALYSIS
OPPE, S INST ROAD SAFETY RES, SWOV
1992-01-01
At the TRRL/SWOV Workshop on Accident Analysis Methodology, heldin Amsterdam in 1988, the need to establish a methodology for the analysis of road accidents was firmly stated by all participants. Data from different countries cannot be compared because there is no agreement on research methodology,
Stalked protozoa identification by image analysis and multivariable statistical techniques
Amaral, A.L.; Ginoris, Y. P.; Nicolau, Ana; M.A.Z. Coelho; Ferreira, E. C.
2008-01-01
Protozoa are considered good indicators of the treatment quality in activated sludge systems as they are sensitive to physical, chemical and operational processes. Therefore, it is possible to correlate the predominance of certain species or groups and several operational parameters of the plant. This work presents a semiautomatic image analysis procedure for the recognition of the stalked protozoa species most frequently found in wastewater treatment plants by determinin...
Analysis of compressive fracture in rock using statistical techniques
Energy Technology Data Exchange (ETDEWEB)
Blair, S.C.
1994-12-01
Fracture of rock in compression is analyzed using a field-theory model, and the processes of crack coalescence and fracture formation and the effect of grain-scale heterogeneities on macroscopic behavior of rock are studied. The model is based on observations of fracture in laboratory compression tests, and incorporates assumptions developed using fracture mechanics analysis of rock fracture. The model represents grains as discrete sites, and uses superposition of continuum and crack-interaction stresses to create cracks at these sites. The sites are also used to introduce local heterogeneity. Clusters of cracked sites can be analyzed using percolation theory. Stress-strain curves for simulated uniaxial tests were analyzed by studying the location of cracked sites, and partitioning of strain energy for selected intervals. Results show that the model implicitly predicts both development of shear-type fracture surfaces and a strength-vs-size relation that are similar to those observed for real rocks. Results of a parameter-sensitivity analysis indicate that heterogeneity in the local stresses, attributed to the shape and loading of individual grains, has a first-order effect on strength, and that increasing local stress heterogeneity lowers compressive strength following an inverse power law. Peak strength decreased with increasing lattice size and decreasing mean site strength, and was independent of site-strength distribution. A model for rock fracture based on a nearest-neighbor algorithm for stress redistribution is also presented and used to simulate laboratory compression tests, with promising results.
Application of Multivariable Statistical Techniques in Plant-wide WWTP Control Strategies Analysis
DEFF Research Database (Denmark)
Flores Alsina, Xavier; Comas, J.; Rodríguez-Roda, I.
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant...
Optimization techniques in statistics
Rustagi, Jagdish S
1994-01-01
Statistics help guide us to optimal decisions under uncertainty. A large variety of statistical problems are essentially solutions to optimization problems. The mathematical techniques of optimization are fundamentalto statistical theory and practice. In this book, Jagdish Rustagi provides full-spectrum coverage of these methods, ranging from classical optimization and Lagrange multipliers, to numerical techniques using gradients or direct search, to linear, nonlinear, and dynamic programming using the Kuhn-Tucker conditions or the Pontryagin maximal principle. Variational methods and optimiza
Donges, Jonathan F; Loew, Alexander; Marwan, Norbert; Kurths, Jürgen
2013-01-01
Eigen techniques such as empirical orthogonal function (EOF) or coupled pattern (CP) analysis have been frequently used for detecting patterns in multivariate climatological data sets. Recently, statistical methods originating from the theory of complex networks have been employed for the very same purpose of spatio-temporal analysis. This climate network analysis is usually based on the same set of similarity matrices as is used in classical EOF or CP analysis, e.g., the correlation matrix of a single climatological field or the cross-correlation matrix between two distinct climatological fields. In this study, formal relationships between both eigen and network approaches are derived and illustrated using exemplary data sets. These results allow to pinpoint that climate network analysis can complement classical eigen techniques and provides substantial additional information on the higher-order structure of statistical interrelationships in climatological data sets. Hence, climate networks are a valuable su...
The statistical analysis techniques to support the NGNP fuel performance experiments
Pham, Binh T.; Einerson, Jeffrey J.
2013-10-01
This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the fuel temperature within a given range.
Statistical Techniques for Project Control
Badiru, Adedeji B
2012-01-01
A project can be simple or complex. In each case, proven project management processes must be followed. In all cases of project management implementation, control must be exercised in order to assure that project objectives are achieved. Statistical Techniques for Project Control seamlessly integrates qualitative and quantitative tools and techniques for project control. It fills the void that exists in the application of statistical techniques to project control. The book begins by defining the fundamentals of project management then explores how to temper quantitative analysis with qualitati
Mathematical and statistical analysis
Houston, A. Glen
1988-01-01
The goal of the mathematical and statistical analysis component of RICIS is to research, develop, and evaluate mathematical and statistical techniques for aerospace technology applications. Specific research areas of interest include modeling, simulation, experiment design, reliability assessment, and numerical analysis.
Design of U-Geometry Parameters Using Statistical Analysis Techniques in the U-Bending Process
Wiriyakorn Phanitwong; Untika Boochakul; Sutasn Thipprakmas
2017-01-01
The various U-geometry parameters in the U-bending process result in processing difficulties in the control of the spring-back characteristic. In this study, the effects of U-geometry parameters, including channel width, bend angle, material thickness, tool radius, as well as workpiece length, and their design, were investigated using a combination of finite element method (FEM) simulation, and statistical analysis techniques. Based on stress distribution analyses, the FEM simulation results ...
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
Energy Technology Data Exchange (ETDEWEB)
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
Ratner, Bruce
2011-01-01
The second edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The first edition, titled Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, contained 17 chapters of innovative and practical statistical data mining techniques. In this second edition, renamed to reflect the increased coverage of machine-learning data mining techniques, the author has
Beginning statistics with data analysis
Mosteller, Frederick; Rourke, Robert EK
2013-01-01
This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.
Design of U-Geometry Parameters Using Statistical Analysis Techniques in the U-Bending Process
Directory of Open Access Journals (Sweden)
Wiriyakorn Phanitwong
2017-06-01
Full Text Available The various U-geometry parameters in the U-bending process result in processing difficulties in the control of the spring-back characteristic. In this study, the effects of U-geometry parameters, including channel width, bend angle, material thickness, tool radius, as well as workpiece length, and their design, were investigated using a combination of finite element method (FEM simulation, and statistical analysis techniques. Based on stress distribution analyses, the FEM simulation results clearly identified the different bending mechanisms and effects of U-geometry parameters on the spring-back characteristic in the U-bending process, with and without pressure pads. The statistical analyses elucidated that the bend angle and channel width have a major influence in cases with and without pressure pads, respectively. The experiments were carried out to validate the FEM simulation results. Additionally, the FEM simulation results were in agreement with the experimental results, in terms of the bending forces and bending angles.
Computer program uses Monte Carlo techniques for statistical system performance analysis
Wohl, D. P.
1967-01-01
Computer program with Monte Carlo sampling techniques determines the effect of a component part of a unit upon the overall system performance. It utilizes the full statistics of the disturbances and misalignments of each component to provide unbiased results through simulated random sampling.
Energy Technology Data Exchange (ETDEWEB)
Halim, Zakiah Abd [Universiti Teknikal Malaysia Melaka (Malaysia); Jamaludin, Nordin; Junaidi, Syarif [Faculty of Engineering and Built, Universiti Kebangsaan Malaysia, Bangi (Malaysia); Yahya, Syed Yusainee Syed [Universiti Teknologi MARA, Shah Alam (Malaysia)
2015-04-15
Current steel tubes inspection techniques are invasive, and the interpretation and evaluation of inspection results are manually done by skilled personnel. This paper presents a statistical analysis of high frequency stress wave signals captured from a newly developed noninvasive, non-destructive tube inspection technique known as the vibration impact acoustic emission (VIAE) technique. Acoustic emission (AE) signals have been introduced into the ASTM A179 seamless steel tubes using an impact hammer, and the AE wave propagation was captured using an AE sensor. Specifically, a healthy steel tube as the reference tube and four steel tubes with through-hole artificial defect at different locations were used in this study. The AE features extracted from the captured signals are rise time, peak amplitude, duration and count. The VIAE technique also analysed the AE signals using statistical features such as root mean square (r.m.s.), energy, and crest factor. It was evident that duration, count, r.m.s., energy and crest factor could be used to automatically identify the presence of defect in carbon steel tubes using AE signals captured using the non-invasive VIAE technique.
21 CFR 820.250 - Statistical techniques.
2010-04-01
... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Statistical techniques. 820.250 Section 820.250...) MEDICAL DEVICES QUALITY SYSTEM REGULATION Statistical Techniques § 820.250 Statistical techniques. (a... statistical techniques required for establishing, controlling, and verifying the acceptability of process...
Bonetti, Jennifer; Quarino, Lawrence
2014-05-01
This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2 , and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.
Development of Evaluation Technique of GMAW Welding Quality Based on Statistical Analysis
Institute of Scientific and Technical Information of China (English)
FENG Shengqiang; TERASAKI Hidenri; KOMIZO Yuichi; HU Shengsun; CHEN Donggao; MA Zhihua
2014-01-01
Nondestructive techniques for appraising gas metal arc welding(GMAW) faults plays a very important role in on-line quality controllability and prediction of the GMAW process. On-line welding quality controllability and prediction have several disadvantages such as high cost, low efficiency, complication and greatly being affected by the environment. An enhanced, efficient evaluation technique for evaluating welding faults based on Mahalanobis distance(MD) and normal distribution is presented. In addition, a new piece of equipment, designated the weld quality tester(WQT), is developed based on the proposed evaluation technique. MD is superior to other multidimensional distances such as Euclidean distance because the covariance matrix used for calculating MD takes into account correlations in the data and scaling. The values of MD obtained from welding current and arc voltage are assumed to follow a normal distribution. The normal distribution has two parameters: the meanm and standard deviations of the data. In the proposed evaluation technique used by the WQT, values of MD located in the range from zero tom+3s are regarded as “good”. Two experiments which involve changing the flow of shielding gas and smearing paint on the surface of the substrate are conducted in order to verify the sensitivity of the proposed evaluation technique and the feasibility of using WQT. The experimental results demonstrate the usefulness of the WQT for evaluating welding quality. The proposed technique can be applied to implement the on-line welding quality controllability and prediction, which is of great importance to design some novel equipment for weld quality detection.
Downscaling Statistical Model Techniques for Climate Change Analysis Applied to the Amazon Region
Directory of Open Access Journals (Sweden)
David Mendes
2014-01-01
Full Text Available The Amazon is an area covered predominantly by dense tropical rainforest with relatively small inclusions of several other types of vegetation. In the last decades, scientific research has suggested a strong link between the health of the Amazon and the integrity of the global climate: tropical forests and woodlands (e.g., savannas exchange vast amounts of water and energy with the atmosphere and are thought to be important in controlling local and regional climates. Consider the importance of the Amazon biome to the global climate changes impacts and the role of the protected area in the conservation of biodiversity and state-of-art of downscaling model techniques based on ANN Calibrate and run a downscaling model technique based on the Artificial Neural Network (ANN that is applied to the Amazon region in order to obtain regional and local climate predicted data (e.g., precipitation. Considering the importance of the Amazon biome to the global climate changes impacts and the state-of-art of downscaling techniques for climate models, the shower of this work is presented as follows: the use of ANNs good similarity with the observation in the cities of Belém and Manaus, with correlations of approximately 88.9% and 91.3%, respectively, and spatial distribution, especially in the correction process, representing a good fit.
Noshadi, Masoud; Ghafourian, Amir
2016-07-01
This research investigated the quality of groundwater of 298 wells during 10 years, in Fars province, southern Iran, to survey spatial variation of groundwater quality and also major sources of hydro-chemical components for drinking and agricultural uses. To classify the sampling stations in each year, hierarchical cluster analysis, using the Euclidean distances and "Ward" method, was used. According to the results of cluster analysis, there were three quality groups in groundwater of the research area: first group of 170 wells with type of Ca-HCO3, second group of 98 wells with type of Ca-HCO3, and third group of 30 wells with type of Na-Cl. Hydro-chemical parameters were increased from the first to the third group, and on the basis of Schoeller and USSL diagrams, the water of wells of the third group was considered unsuitable for irrigation and drinking. Principal component (PC) analysis and factor analysis reduced the complex and voluminous data matrix into three main components, accounting for more than 80 % of the total variance. The first PC contained TDS, EC, TH, Na(+), Cl(-), Mg(2+), SO4 (2-), Ca(2+), and SAR parameters. Therefore, the first dominant factor was salinity. In PC2, HCO3 and pH were the dominant parameters, which may indicate weathering of silicate minerals. The PC3 contained high loadings for NO2 (2-) and NO3 (-). This factor indicates anthropogenic contaminants that may be caused by improper disposal of domestic wastes or the use of chemical fertilizers in agriculture and leaching of them.
Tools & techniques--statistics: propensity score techniques.
da Costa, Bruno R; Gahl, Brigitta; Jüni, Peter
2014-10-01
Propensity score (PS) techniques are useful if the number of potential confounding pretreatment variables is large and the number of analysed outcome events is rather small so that conventional multivariable adjustment is hardly feasible. Only pretreatment characteristics should be chosen to derive PS, and only when they are probably associated with outcome. A careful visual inspection of PS will help to identify areas of no or minimal overlap, which suggests residual confounding, and trimming of the data according to the distribution of PS will help to minimise residual confounding. Standardised differences in pretreatment characteristics provide a useful check of the success of the PS technique employed. As with conventional multivariable adjustment, PS techniques cannot account for confounding variables that are not or are only imperfectly measured, and no PS technique is a substitute for an adequately designed randomised trial.
Deconstructing Statistical Analysis
Snell, Joel
2014-01-01
Using a very complex statistical analysis and research method for the sake of enhancing the prestige of an article or making a new product or service legitimate needs to be monitored and questioned for accuracy. 1) The more complicated the statistical analysis, and research the fewer the number of learned readers can understand it. This adds a…
Quest for HI Turbulence Statistics New Techniques
Lazarian, A; Esquivel, A; Esquivel, Alejandro
2001-01-01
HI data cubes are sources of unique information on interstellar turbulence. Doppler shifts due to supersonic motions contain information on turbulent velocity field which is otherwise difficult to obtain. However, the problem of separation of velocity and density fluctuations within HI data cubes is far from being trivial. Analytical description of the emissivity statistics of channel maps (velocity slices) in Lazarian & Pogosyan (2000) showed that the relative contribution of the density and velocity fluctuations depends on the thickness of the velocity slice. In particular, power-law assymptotics of the emissivity fluctuations change when the dispersion of the velocity at the scale under study becomes of the order of the velocity slice thickness (integrated width of the channel map). These results are the foundations of the Velocity-Channel Analysis (VCA) technique which allows to determine velocity and density statistics using 21-cm data cubes. The VCA has been successfully tested using data cubes obta...
Common pitfalls in statistical analysis: Logistic regression.
Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh
2017-01-01
Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
Asfahani, Jamal
2014-02-01
Factor analysis technique is proposed in this research for interpreting the combination of nuclear well logging, including natural gamma ray, density and neutron-porosity, and the electrical well logging of long and short normal, in order to characterize the large extended basaltic areas in southern Syria. Kodana well logging data are used for testing and applying the proposed technique. The four resulting score logs enable to establish the lithological score cross-section of the studied well. The established cross-section clearly shows the distribution and the identification of four kinds of basalt which are hard massive basalt, hard basalt, pyroclastic basalt and the alteration basalt products, clay. The factor analysis technique is successfully applied on the Kodana well logging data in southern Syria, and can be used efficiently when several wells and huge well logging data with high number of variables are required to be interpreted.
Statistical and Computational Techniques in Manufacturing
2012-01-01
In recent years, interest in developing statistical and computational techniques for applied manufacturing engineering has been increased. Today, due to the great complexity of manufacturing engineering and the high number of parameters used, conventional approaches are no longer sufficient. Therefore, in manufacturing, statistical and computational techniques have achieved several applications, namely, modelling and simulation manufacturing processes, optimization manufacturing parameters, monitoring and control, computer-aided process planning, etc. The present book aims to provide recent information on statistical and computational techniques applied in manufacturing engineering. The content is suitable for final undergraduate engineering courses or as a subject on manufacturing at the postgraduate level. This book serves as a useful reference for academics, statistical and computational science researchers, mechanical, manufacturing and industrial engineers, and professionals in industries related to manu...
Directory of Open Access Journals (Sweden)
Cláudio Roberto Rosário
2012-07-01
Full Text Available The purpose of this research is to improve the practice on customer satisfaction analysis The article presents an analysis model to analyze the answers of a customer satisfaction evaluation in a systematic way with the aid of multivariate statistical techniques, specifically, exploratory analysis with PCA – Partial Components Analysis with HCA - Hierarchical Cluster Analysis. It was tried to evaluate the applicability of the model to be used by the issue company as a tool to assist itself on identifying the value chain perceived by the customer when applied the questionnaire of customer satisfaction. It was found with the assistance of multivariate statistical analysis that it was observed similar behavior among customers. It also allowed the company to conduct reviews on questions of the questionnaires, using analysis of the degree of correlation between the questions that was not a company’s practice before this research.
Modern Multivariate Statistical Techniques Regression, Classification, and Manifold Learning
Izenman, Alan Julian
2006-01-01
Describes the advances in computation and data storage that led to the introduction of many statistical tools for high-dimensional data analysis. Focusing on multivariate analysis, this book discusses nonlinear methods as well as linear methods. It presents an integrated mixture of classical and modern multivariate statistical techniques.
Energy Technology Data Exchange (ETDEWEB)
Hahn, A.A.
1994-11-01
The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques.
Associative Analysis in Statistics
Directory of Open Access Journals (Sweden)
Mihaela Muntean
2015-03-01
Full Text Available In the last years, the interest in technologies such as in-memory analytics and associative search has increased. This paper explores how you can use in-memory analytics and an associative model in statistics. The word “associative” puts the emphasis on understanding how datasets relate to one another. The paper presents the main characteristics of “associative” data model. Also, the paper presents how to design an associative model for labor market indicators analysis. The source is the EU Labor Force Survey. Also, this paper presents how to make associative analysis.
Per Object statistical analysis
DEFF Research Database (Denmark)
2008-01-01
Variable. This procedure was developed in order to be able to export objects as ESRI shape data with the 90-percentile of the Hue of each object's pixels as an item in the shape attribute table. This procedure uses a sub-level single pixel chessboard segmentation, loops for each of the objects......This RS code is to do Object-by-Object analysis of each Object's sub-objects, e.g. statistical analysis of an object's individual image data pixels. Statistics, such as percentiles (so-called "quartiles") are derived by the process, but the return of that can only be a Scene Variable, not an Object...... of a specific class in turn, and uses as pair of PPO stages to derive the statistics and then assign them to the objects' Object Variables. It may be that this could all be done in some other, simply way, but several other ways that were tried did not succeed. The procedure ouptut has been tested against...
A survey of statistical downscaling techniques
Energy Technology Data Exchange (ETDEWEB)
Zorita, E.; Storch, H. von [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Hydrophysik
1997-12-31
The derivation of regional information from integrations of coarse-resolution General Circulation Models (GCM) is generally referred to as downscaling. The most relevant statistical downscaling techniques are described here and some particular examples are worked out in detail. They are classified into three main groups: linear methods, classification methods and deterministic non-linear methods. Their performance in a particular example, winter rainfall in the Iberian peninsula, is compared to a simple downscaling analog method. It is found that the analog method performs equally well than the more complicated methods. Downscaling analysis can be also used as a tool to validate regional performance of global climate models by analyzing the covariability of the simulated large-scale climate and the regional climates. (orig.) [Deutsch] Die Ableitung regionaler Information aus Integrationen grob aufgeloester Klimamodelle wird als `Regionalisierung` bezeichnet. Dieser Beitrag beschreibt die wichtigsten statistischen Regionalisierungsverfahren und gibt darueberhinaus einige detaillierte Beispiele. Regionalisierungsverfahren lassen sich in drei Hauptgruppen klassifizieren: lineare Verfahren, Klassifikationsverfahren und nicht-lineare deterministische Verfahren. Diese Methoden werden auf den Niederschlag auf der iberischen Halbinsel angewandt und mit den Ergebnissen eines einfachen Analog-Modells verglichen. Es wird festgestellt, dass die Ergebnisse der komplizierteren Verfahren im wesentlichen auch mit der Analog-Methode erzielt werden koennen. Eine weitere Anwendung der Regionalisierungsmethoden besteht in der Validierung globaler Klimamodelle, indem die simulierte und die beobachtete Kovariabilitaet zwischen dem grosskaligen und dem regionalen Klima miteinander verglichen wird. (orig.)
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
The maximum entropy technique. System's statistical description
Belashev, B Z
2002-01-01
The maximum entropy technique (MENT) is applied for searching the distribution functions of physical values. MENT takes into consideration the demand of maximum entropy, the characteristics of the system and the connection conditions, naturally. It is allowed to apply MENT for statistical description of closed and open systems. The examples in which MENT had been used for the description of the equilibrium and nonequilibrium states and the states far from the thermodynamical equilibrium are considered
Boisrobert, Loic; Laclaustra, Martin; Bossa, Matias; Frangi, Andres G.; Frangi, Alejandro F.
2005-04-01
Clinical studies report that impaired endothelial function is associated with Cardio-Vascular Diseases (CVD) and their risk factors. One commonly used mean for assessing endothelial function is Flow-Mediated Dilation (FMD). Classically, FMD is quantified using local indexes e.g. maximum peak dilation. Although such parameters have been successfully linked to CVD risk factors and other clinical variables, this description does not consider all the information contained in the complete vasodilation curve. Moreover, the relation between flow impulse and the vessel vasodilation response to this stimulus, although not clearly known, seems to be important and is not taken into account in the majority of studies. In this paper we propose a novel global parameterization for the vasodilation and the flow curves of a FMD test. This parameterization uses Principal Component Analysis (PCA) to describe independently and jointly the variability of flow and FMD curves. These curves are obtained using computerized techniques (based on edge detection and image registration, respectively) to analyze the ultrasound image sequences. The global description obtained through PCA yields a detailed characterization of the morphology of such curves allowing the extraction of intuitive quantitative information of the vasodilation process and its interplay with flow changes. This parameterization is consistent with traditional measurements and, in a database of 177 subjects, seems to correlate more strongly (and with more clinical parameters) than classical measures to CVD risk factors and clinical parameters such as LDL- and HDL-Cholesterol.
Material analysis on engineering statistics
Energy Technology Data Exchange (ETDEWEB)
Lee, Seung Hun
2008-03-15
This book is about material analysis on engineering statistics using mini tab, which includes technical statistics and seven tools of QC, probability distribution, presumption and checking, regression analysis, tim series analysis, control chart, process capacity analysis, measurement system analysis, sampling check, experiment planning, response surface analysis, compound experiment, Taguchi method, and non parametric statistics. It is good for university and company to use because it deals with theory first and analysis using mini tab on 6 sigma BB and MBB.
Karthik, M. N.; Davis, Moshe
2004-01-01
Searching techniques for Case Based Reasoning systems involve extensive methods of elimination. In this paper, we look at a new method of arriving at the right solution by performing a series of transformations upon the data. These involve N-gram based comparison and deduction of the input data with the case data, using Morphemes and Phonemes as the deciding parameters. A similar technique for eliminating possible errors using a noise removal function is performed. The error tracking and elim...
Velasco-Tapia, Fernando
2014-01-01
Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures).
Directory of Open Access Journals (Sweden)
Fernando Velasco-Tapia
2014-01-01
Full Text Available Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC volcanic range (Mexican Volcanic Belt. In this locality, the volcanic activity (3.7 to 0.5 Ma was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward’s linkage rule was carried out to classify the SC volcanic rocks geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas in the comingled lavas (binary mixtures.
Statistical Analysis and validation
Hoefsloot, H.C.J.; Horvatovich, P.; Bischoff, R.
2013-01-01
In this chapter guidelines are given for the selection of a few biomarker candidates from a large number of compounds with a relative low number of samples. The main concepts concerning the statistical validation of the search for biomarkers are discussed. These complicated methods and concepts are
Statistical principles and techniques in scientific and social research
Krzanowski, Wojtek J
2007-01-01
This text provides a clear discussion of the basic statistical concepts and methods frequently encountered in statistical research. Assuming only a basic level of Mathematics, and with numerous examples and illustrations, this text is a valuable resource for students and researchers in the Sciences and Social Sciences. - ;This graduate-level text provides a survey of the logic and reasoning underpinning statistical analysis, as well as giving a broad-brush overview of the various statistical techniques that play a major role in scientific and social investigations. Arranged in rough historical order, the text starts with the ideas of probability that underpin statistical methods and progresses through the developments of the nineteenth and twentieth centuries to modern concerns and solutions. Assuming only a basic level of Mathematics and with numerous examples and illustrations, this text presents a valuable resource not only to the experienced researcher but also to the student, by complementing courses in ...
Statistical analysis of management data
Gatignon, Hubert
2013-01-01
This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.
Statistical and Economic Techniques for Site-specific Nematode Management.
Liu, Zheng; Griffin, Terry; Kirkpatrick, Terrence L
2014-03-01
Recent advances in precision agriculture technologies and spatial statistics allow realistic, site-specific estimation of nematode damage to field crops and provide a platform for the site-specific delivery of nematicides within individual fields. This paper reviews the spatial statistical techniques that model correlations among neighboring observations and develop a spatial economic analysis to determine the potential of site-specific nematicide application. The spatial econometric methodology applied in the context of site-specific crop yield response contributes to closing the gap between data analysis and realistic site-specific nematicide recommendations and helps to provide a practical method of site-specifically controlling nematodes.
Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik
2016-04-01
Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, is plausible to exist, since both variables to some extent are driven by common meteorological conditions. Aiming to overcome this limitation, multivariate statistical techniques has the potential to combine different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, which is a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extract extreme value series were applied (Annual Maximum and Partial Duration Series), and different probability distribution functions were fit to the observed sample. Results obtained by using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, first an investigation regarding the asymptotic properties of extremal dependence was carried out. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the Conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable to model bivariate extreme values, which are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during a one-hour intensity extreme precipitation event (or vice versa) can be twice as great as what would occur when assuming independent events. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
Statistical Analysis of Random Simulations : Bootstrap Tutorial
Deflandre, D.; Kleijnen, J.P.C.
2002-01-01
The bootstrap is a simple but versatile technique for the statistical analysis of random simulations.This tutorial explains the basics of that technique, and applies it to the well-known M/M/1 queuing simulation.In that numerical example, different responses are studied.For some responses, bootstrap
Statistical methods for astronomical data analysis
Chattopadhyay, Asis Kumar
2014-01-01
This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...
DEFF Research Database (Denmark)
Ris Hansen, Inge; Søgaard, Karen; Gram, Bibi
2015-01-01
This is the analysis plan for the multicentre randomised control study looking at the effect of training and exercises in chronic neck pain patients that is being conducted in Jutland and Funen, Denmark. This plan will be used as a work description for the analyses of the data collected....
Lightweight and Statistical Techniques for Petascale PetaScale Debugging
Energy Technology Data Exchange (ETDEWEB)
Miller, Barton
2014-06-30
This project investigated novel techniques for debugging scientific applications on petascale architectures. In particular, we developed lightweight tools that narrow the problem space when bugs are encountered. We also developed techniques that either limit the number of tasks and the code regions to which a developer must apply a traditional debugger or that apply statistical techniques to provide direct suggestions of the location and type of error. We extend previous work on the Stack Trace Analysis Tool (STAT), that has already demonstrated scalability to over one hundred thousand MPI tasks. We also extended statistical techniques developed to isolate programming errors in widely used sequential or threaded applications in the Cooperative Bug Isolation (CBI) project to large scale parallel applications. Overall, our research substantially improved productivity on petascale platforms through a tool set for debugging that complements existing commercial tools. Previously, Office Of Science application developers relied either on primitive manual debugging techniques based on printf or they use tools, such as TotalView, that do not scale beyond a few thousand processors. However, bugs often arise at scale and substantial effort and computation cycles are wasted in either reproducing the problem in a smaller run that can be analyzed with the traditional tools or in repeated runs at scale that use the primitive techniques. New techniques that work at scale and automate the process of identifying the root cause of errors were needed. These techniques significantly reduced the time spent debugging petascale applications, thus leading to a greater overall amount of time for application scientists to pursue the scientific objectives for which the systems are purchased. We developed a new paradigm for debugging at scale: techniques that reduced the debugging scenario to a scale suitable for traditional debuggers, e.g., by narrowing the search for the root-cause analysis
Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques
Mishra, D.; Goyal, P.
2014-12-01
Urban air pollution forecasting has emerged as an acute problem in recent years because there are sever environmental degradation due to increase in harmful air pollutants in the ambient atmosphere. In this study, there are different types of statistical as well as artificial intelligence techniques are used for forecasting and analysis of air pollution over Delhi urban area. These techniques are principle component analysis (PCA), multiple linear regression (MLR) and artificial neural network (ANN) and the forecasting are observed in good agreement with the observed concentrations through Central Pollution Control Board (CPCB) at different locations in Delhi. But such methods suffers from disadvantages like they provide limited accuracy as they are unable to predict the extreme points i.e. the pollution maximum and minimum cut-offs cannot be determined using such approach. Also, such methods are inefficient approach for better output forecasting. But with the advancement in technology and research, an alternative to the above traditional methods has been proposed i.e. the coupling of statistical techniques with artificial Intelligence (AI) can be used for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used for forecasting of air pollutant over Delhi urban area. The statistical measures e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA) of the proposed model are observed in better agreement with the all other models. Hence, the coupling of statistical and artificial intelligence can be use for the forecasting of air pollutant over urban area.
Predicting radiotherapy outcomes using statistical learning techniques
Energy Technology Data Exchange (ETDEWEB)
El Naqa, Issam; Bradley, Jeffrey D; Deasy, Joseph O [Washington University, Saint Louis, MO (United States); Lindsay, Patricia E; Hope, Andrew J [Department of Radiation Oncology, Princess Margaret Hospital, Toronto, ON (Canada)
2009-09-21
Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among
Predicting radiotherapy outcomes using statistical learning techniques*
El Naqa, Issam; Bradley, Jeffrey D; Lindsay, Patricia E; Hope, Andrew J; Deasy, Joseph O
2013-01-01
Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for ‘generalizabilty’ validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model
Predicting radiotherapy outcomes using statistical learning techniques
El Naqa, Issam; Bradley, Jeffrey D.; Lindsay, Patricia E.; Hope, Andrew J.; Deasy, Joseph O.
2009-09-01
Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model
Research design and statistical analysis
Myers, Jerome L; Lorch Jr, Robert F
2013-01-01
Research Design and Statistical Analysis provides comprehensive coverage of the design principles and statistical concepts necessary to make sense of real data. The book's goal is to provide a strong conceptual foundation to enable readers to generalize concepts to new research situations. Emphasis is placed on the underlying logic and assumptions of the analysis and what it tells the researcher, the limitations of the analysis, and the consequences of violating assumptions. Sampling, design efficiency, and statistical models are emphasized throughout. As per APA recommendations
Statistical Analysis by Statistical Physics Model for the STOCK Markets
Wang, Tiansong; Wang, Jun; Fan, Bingli
A new stochastic stock price model of stock markets based on the contact process of the statistical physics systems is presented in this paper, where the contact model is a continuous time Markov process, one interpretation of this model is as a model for the spread of an infection. Through this model, the statistical properties of Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE) are studied. In the present paper, the data of SSE Composite Index and the data of SZSE Component Index are analyzed, and the corresponding simulation is made by the computer computation. Further, we investigate the statistical properties, fat-tail phenomena, the power-law distributions, and the long memory of returns for these indices. The techniques of skewness-kurtosis test, Kolmogorov-Smirnov test, and R/S analysis are applied to study the fluctuation characters of the stock price returns.
Zimmerman, G. A.; Olsen, E. T.
1992-01-01
Noise power estimation in the High-Resolution Microwave Survey (HRMS) sky survey element is considered as an example of a constant false alarm rate (CFAR) signal detection problem. Order-statistic-based noise power estimators for CFAR detection are considered in terms of required estimator accuracy and estimator dynamic range. By limiting the dynamic range of the value to be estimated, the performance of an order-statistic estimator can be achieved by simpler techniques requiring only a single pass of the data. Simple threshold-and-count techniques are examined, and it is shown how several parallel threshold-and-count estimation devices can be used to expand the dynamic range to meet HRMS system requirements with minimal hardware complexity. An input/output (I/O) efficient limited-precision order-statistic estimator with wide but limited dynamic range is also examined.
Spatial analysis statistics, visualization, and computational methods
Oyana, Tonny J
2015-01-01
An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...
Statistical shape analysis with applications in R
Dryden, Ian L
2016-01-01
A thoroughly revised and updated edition of this introduction to modern statistical methods for shape analysis Shape analysis is an important tool in the many disciplines where objects are compared using geometrical features. Examples include comparing brain shape in schizophrenia; investigating protein molecules in bioinformatics; and describing growth of organisms in biology. This book is a significant update of the highly-regarded `Statistical Shape Analysis’ by the same authors. The new edition lays the foundations of landmark shape analysis, including geometrical concepts and statistical techniques, and extends to include analysis of curves, surfaces, images and other types of object data. Key definitions and concepts are discussed throughout, and the relative merits of different approaches are presented. The authors have included substantial new material on recent statistical developments and offer numerous examples throughout the text. Concepts are introduced in an accessible manner, while reta...
Statistical Analysis with Missing Data
Little, Roderick J A
2002-01-01
Praise for the First Edition of Statistical Analysis with Missing Data ""An important contribution to the applied statistics literature.... I give the book high marks for unifying and making accessible much of the past and current work in this important area."" Ã¢ÂÂWilliam E. Strawderman, Rutgers University ""This book...provide[s] interesting real-life examples, stimulating end-of-chapter exercises, and up-to-date references. It should be on every applied statisticianÃ¢ÂÂs bookshelf."" Ã¢ÂÂThe Statistician ""The book should be studied in the statistical methods d
Bayesian Methods for Statistical Analysis
Puza, Borek
2015-01-01
Bayesian methods for statistical analysis is a book on statistical methods for analysing a wide variety of data. The book consists of 12 chapters, starting with basic concepts and covering numerous topics, including Bayesian estimation, decision theory, prediction, hypothesis testing, hierarchical models, Markov chain Monte Carlo methods, finite population inference, biased sampling and nonignorable nonresponse. The book contains many exercises, all with worked solutions, including complete c...
Comparison of three Statistical Classification Techniques for Maser Identification
Manning, Ellen M; Ellingsen, Simon P; Breen, Shari L; Chen, Xi; Humphries, Melissa
2016-01-01
We applied three statistical classification techniques - linear discriminant analysis (LDA), logistic regression and random forests - to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the ease, or otherwise, with which the results of each classification technique can be interpreted. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods logistic regression and LDA performed best, for the largest dataset the non-parametric method of random forests performed with comparable accuracy to parametric techniques, rather than any significant improvement. This suggests that at least for the specific examples investigated here accuracy of the predictions obtained ...
Common pitfalls in statistical analysis: Clinical versus statistical significance
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In clinical research, study results, which are statistically significant are often interpreted as being clinically important. While statistical significance indicates the reliability of the study results, clinical significance reflects its impact on clinical practice. The third article in this series exploring pitfalls in statistical analysis clarifies the importance of differentiating between statistical significance and clinical significance. PMID:26229754
Statistical normalization techniques for magnetic resonance imaging
Directory of Open Access Journals (Sweden)
Russell T. Shinohara
2014-01-01
Full Text Available While computed tomography and other imaging techniques are measured in absolute units with physical meaning, magnetic resonance images are expressed in arbitrary units that are difficult to interpret and differ between study visits and subjects. Much work in the image processing literature on intensity normalization has focused on histogram matching and other histogram mapping techniques, with little emphasis on normalizing images to have biologically interpretable units. Furthermore, there are no formalized principles or goals for the crucial comparability of image intensities within and across subjects. To address this, we propose a set of criteria necessary for the normalization of images. We further propose simple and robust biologically motivated normalization techniques for multisequence brain imaging that have the same interpretation across acquisitions and satisfy the proposed criteria. We compare the performance of different normalization methods in thousands of images of patients with Alzheimer's disease, hundreds of patients with multiple sclerosis, and hundreds of healthy subjects obtained in several different studies at dozens of imaging centers.
Statistical methods for bioimpedance analysis
Directory of Open Access Journals (Sweden)
Christian Tronstad
2014-04-01
Full Text Available This paper gives a basic overview of relevant statistical methods for the analysis of bioimpedance measurements, with an aim to answer questions such as: How do I begin with planning an experiment? How many measurements do I need to take? How do I deal with large amounts of frequency sweep data? Which statistical test should I use, and how do I validate my results? Beginning with the hypothesis and the research design, the methodological framework for making inferences based on measurements and statistical analysis is explained. This is followed by a brief discussion on correlated measurements and data reduction before an overview is given of statistical methods for comparison of groups, factor analysis, association, regression and prediction, explained in the context of bioimpedance research. The last chapter is dedicated to the validation of a new method by different measures of performance. A flowchart is presented for selection of statistical method, and a table is given for an overview of the most important terms of performance when evaluating new measurement technology.
Regularized Statistical Analysis of Anatomy
DEFF Research Database (Denmark)
Sjöstrand, Karl
2007-01-01
This thesis presents the application and development of regularized methods for the statistical analysis of anatomical structures. Focus is on structure-function relationships in the human brain, such as the connection between early onset of Alzheimer’s disease and shape changes of the corpus cal...
Statistical Analysis for Performance Comparison
Directory of Open Access Journals (Sweden)
Priyanka Dutta
2013-07-01
Full Text Available Performance responsiveness and scalability is a make-or-break quality for software. Nearly everyone runsinto performance problems at one time or another. This paper discusses about performance issues facedduring Pre Examination Process Automation System (PEPAS implemented in java technology. Thechallenges faced during the life cycle of the project and the mitigation actions performed. It compares 3java technologies and shows how improvements are made through statistical analysis in response time ofthe application. The paper concludes with result analysis.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Tools for Basic Statistical Analysis
Luz, Paul L.
2005-01-01
Statistical Analysis Toolset is a collection of eight Microsoft Excel spreadsheet programs, each of which performs calculations pertaining to an aspect of statistical analysis. These programs present input and output data in user-friendly, menu-driven formats, with automatic execution. The following types of calculations are performed: Descriptive statistics are computed for a set of data x(i) (i = 1, 2, 3 . . . ) entered by the user. Normal Distribution Estimates will calculate the statistical value that corresponds to cumulative probability values, given a sample mean and standard deviation of the normal distribution. Normal Distribution from two Data Points will extend and generate a cumulative normal distribution for the user, given two data points and their associated probability values. Two programs perform two-way analysis of variance (ANOVA) with no replication or generalized ANOVA for two factors with four levels and three repetitions. Linear Regression-ANOVA will curvefit data to the linear equation y=f(x) and will do an ANOVA to check its significance.
Application of multivariate statistical techniques in microbial ecology.
Paliy, O; Shankar, V
2016-03-01
Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. In particular, noticeable effect has been attained in the field of microbial ecology, where new experimental approaches provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull....... The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...... fits to the data available, especially if tail fits are used whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors...
Comparison of Three Statistical Classification Techniques for Maser Identification
Manning, Ellen M.; Holland, Barbara R.; Ellingsen, Simon P.; Breen, Shari L.; Chen, Xi; Humphries, Melissa
2016-04-01
We applied three statistical classification techniques-linear discriminant analysis (LDA), logistic regression, and random forests-to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the interpretability of the results of each classification technique. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods logistic regression and LDA performed best, for the largest dataset the non-parametric method of random forests performed with comparable accuracy to parametric techniques, rather than any significant improvement. This suggests that at least for the specific examples investigated here accuracy of the predictions obtained is not being limited by the use of parametric models. We also found that for LDA, transformation of the data to match a normal distribution led to a significant improvement in accuracy. The different classification techniques had significant overlap in their predictions; further astronomical observations will enable the accuracy of these predictions to be tested.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2017-01-01
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked,to improve the readabili...
The Importance of Introductory Statistics Students Understanding Appropriate Sampling Techniques
Menil, Violeta C.
2005-01-01
In this paper the author discusses the meaning of sampling, the reasons for sampling, the Central Limit Theorem, and the different techniques of sampling. Practical and relevant examples are given to make the appropriate sampling techniques understandable to students of Introductory Statistics courses. With a thorough knowledge of sampling…
Statistical Analysis of Protein Ensembles
Máté, Gabriell; Heermann, Dieter
2014-04-01
As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Statistical Analysis of Protein Ensembles
Directory of Open Access Journals (Sweden)
Gabriell eMáté
2014-04-01
Full Text Available As 3D protein-configuration data is piling up, there is an ever-increasing need for well-defined, mathematically rigorous analysis approaches, especially that the vast majority of the currently available methods rely heavily on heuristics. We propose an analysis framework which stems from topology, the field of mathematics which studies properties preserved under continuous deformations. First, we calculate a barcode representation of the molecules employing computational topology algorithms. Bars in this barcode represent different topological features. Molecules are compared through their barcodes by statistically determining the difference in the set of their topological features. As a proof-of-principle application, we analyze a dataset compiled of ensembles of different proteins, obtained from the Ensemble Protein Database. We demonstrate that our approach correctly detects the different protein groupings.
Statistical analysis of next generation sequencing data
Nettleton, Dan
2014-01-01
Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...
INTERNAL ENVIRONMENT ANALYSIS TECHNIQUES
National Research Council Canada - National Science Library
Caescu Stefan Claudiu; Popescu Andrei; Ploesteanu Mara Gabriela
2011-01-01
.... Objectives of the Research The main purpose of the study of the analysis techniques of the internal environment is to provide insight on those aspects that are of strategic importance to the organization...
Parametric statistical change point analysis
Chen, Jie
2000-01-01
This work is an in-depth study of the change point problem from a general point of view and a further examination of change point analysis of the most commonly used statistical models Change point problems are encountered in such disciplines as economics, finance, medicine, psychology, signal processing, and geology, to mention only several The exposition is clear and systematic, with a great deal of introductory material included Different models are presented in each chapter, including gamma and exponential models, rarely examined thus far in the literature Other models covered in detail are the multivariate normal, univariate normal, regression, and discrete models Extensive examples throughout the text emphasize key concepts and different methodologies are used, namely the likelihood ratio criterion, and the Bayesian and information criterion approaches A comprehensive bibliography and two indices complete the study
Blondeau-Patissier, David; Gower, James F. R.; Dekker, Arnold G.; Phinn, Stuart R.; Brando, Vittorio E.
2014-04-01
The need for more effective environmental monitoring of the open and coastal ocean has recently led to notable advances in satellite ocean color technology and algorithm research. Satellite ocean color sensors' data are widely used for the detection, mapping and monitoring of phytoplankton blooms because earth observation provides a synoptic view of the ocean, both spatially and temporally. Algal blooms are indicators of marine ecosystem health; thus, their monitoring is a key component of effective management of coastal and oceanic resources. Since the late 1970s, a wide variety of operational ocean color satellite sensors and algorithms have been developed. The comprehensive review presented in this article captures the details of the progress and discusses the advantages and limitations of the algorithms used with the multi-spectral ocean color sensors CZCS, SeaWiFS, MODIS and MERIS. Present challenges include overcoming the severe limitation of these algorithms in coastal waters and refining detection limits in various oceanic and coastal environments. To understand the spatio-temporal patterns of algal blooms and their triggering factors, it is essential to consider the possible effects of environmental parameters, such as water temperature, turbidity, solar radiation and bathymetry. Hence, this review will also discuss the use of statistical techniques and additional datasets derived from ecosystem models or other satellite sensors to characterize further the factors triggering or limiting the development of algal blooms in coastal and open ocean waters.
Statistical technique for analysing functional connectivity of multiple spike trains.
Masud, Mohammad Shahed; Borisyuk, Roman
2011-03-15
A new statistical technique, the Cox method, used for analysing functional connectivity of simultaneously recorded multiple spike trains is presented. This method is based on the theory of modulated renewal processes and it estimates a vector of influence strengths from multiple spike trains (called reference trains) to the selected (target) spike train. Selecting another target spike train and repeating the calculation of the influence strengths from the reference spike trains enables researchers to find all functional connections among multiple spike trains. In order to study functional connectivity an "influence function" is identified. This function recognises the specificity of neuronal interactions and reflects the dynamics of postsynaptic potential. In comparison to existing techniques, the Cox method has the following advantages: it does not use bins (binless method); it is applicable to cases where the sample size is small; it is sufficiently sensitive such that it estimates weak influences; it supports the simultaneous analysis of multiple influences; it is able to identify a correct connectivity scheme in difficult cases of "common source" or "indirect" connectivity. The Cox method has been thoroughly tested using multiple sets of data generated by the neural network model of the leaky integrate and fire neurons with a prescribed architecture of connections. The results suggest that this method is highly successful for analysing functional connectivity of simultaneously recorded multiple spike trains.
Montecarlo Techniques as a tool for teaching statistics
Bueno, FM Alexander
2014-01-01
Probability Theory and Statistics are two of the most useful mathematical fields, and also two of the most difficult to learn. In other science fields, as Physics, experimentation is an useful tool to develop students intuition, but the application of this tool to Statistics is much more difficult. In this paper we show how Monte Carlo techniques can be used to perform numerical experiments, by the use of pseudorandom numbers, and how these experiments can help to the understanding of Statistics and Physics. Monte Carlo techniques are broadly used in scientific research, but they are learnt usually in very specific curses of higher education. By the use of computer simulation these techniques can also be taught at elementary school and they can help to understand and visualise concepts as variance, mean or a probability distribution function. Finally, the use of new technologies, as Javascript and HTML is discussed.
Statistical models and methods for reliability and survival analysis
Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Huber -Carol, Catherine; Limnios, Nikolaos; Gerville-Reache, Leo
2013-01-01
Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical
Statistical Analysis of Tsunami Variability
Zolezzi, Francesca; Del Giudice, Tania; Traverso, Chiara; Valfrè, Giulio; Poggi, Pamela; Parker, Eric J.
2010-05-01
similar to that seen in ground motion attenuation correlations used for seismic hazard assessment. The second issue was intra-event variability. This refers to the differences in tsunami wave run-up along a section of coast during a single event. Intra-event variability investigated directly considering field observations. The tsunami events used in the statistical evaluation were selected on the basis of the completeness and reliability of the available data. Tsunami considered for the analysis included the recent and well surveyed tsunami of Boxing Day 2004 (Great Indian Ocean Tsunami), Java 2006, Okushiri 1993, Kocaeli 1999, Messina 1908 and a case study of several historic events in Hawaii. Basic statistical analysis was performed on the field observations from these tsunamis. For events with very wide survey regions, the run-up heights have been grouped in order to maintain a homogeneous distance from the source. Where more than one survey was available for a given event, the original datasets were maintained separately to avoid combination of non-homogeneous data. The observed run-up measurements were used to evaluate the minimum, maximum, average, standard deviation and coefficient of variation for each data set. The minimum coefficient of variation was 0.12 measured for the 2004 Boxing Day tsunami at Nias Island (7 data) while the maximum is 0.98 for the Okushiri 1993 event (93 data). The average coefficient of variation is of the order of 0.45.
Advanced Statistical Signal Processing Techniques for Landmine Detection Using GPR
2014-07-12
Processing Techniques for Landmine Detection Using GPR The views, opinions and/or findings contained in this report are those of the author(s) and should not...AGENCY NAME(S) AND ADDRESS (ES) U.S. Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 landmine Detection, Signal...310 Jesse Hall Columbia, MO 65211 -1230 654808 633606 ABSTRACT Advanced Statistical Signal Processing Techniques for Landmine Detection Using GPR Report
Multiscale statistical analysis of coronal solar activity
Gamborino, Diana; Martinell, Julio J
2016-01-01
Multi-filter images from the solar corona are used to obtain temperature maps which are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions we show that the multiscale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport is also be extracted from the analysis.
Directory of Open Access Journals (Sweden)
R. Soundararajan
2015-01-01
Full Text Available Artificial Neural Network (ANN approach was used for predicting and analyzing the mechanical properties of A413 aluminum alloy produced by squeeze casting route. The experiments are carried out with different controlled input variables such as squeeze pressure, die preheating temperature, and melt temperature as per Full Factorial Design (FFD. The accounted absolute process variables produce a casting with pore-free and ideal fine grain dendritic structure resulting in good mechanical properties such as hardness, ultimate tensile strength, and yield strength. As a primary objective, a feed forward back propagation ANN model has been developed with different architectures for ensuring the definiteness of the values. The developed model along with its predicted data was in good agreement with the experimental data, inferring the valuable performance of the optimal model. From the work it was ascertained that, for castings produced by squeeze casting route, the ANN is an alternative method for predicting the mechanical properties and appropriate results can be estimated rather than measured, thereby reducing the testing time and cost. As a secondary objective, quantitative and statistical analysis was performed in order to evaluate the effect of process parameters on the mechanical properties of the castings.
Validation of Models : Statistical Techniques and Data Availability
Kleijnen, J.P.C.
1999-01-01
This paper shows which statistical techniques can be used to validate simulation models, depending on which real-life data are available. Concerning this availability three situations are distinguished (i) no data, (ii) only output data, and (iii) both input and output data. In case (i) - no real
Statistical Techniques for Efficient Indexing and Retrieval of Document Images
Bhardwaj, Anurag
2010-01-01
We have developed statistical techniques to improve the performance of document image search systems where the intermediate step of OCR based transcription is not used. Previous research in this area has largely focused on challenges pertaining to generation of small lexicons for processing handwritten documents and enhancement of poor quality…
Communication Analysis modelling techniques
España, Sergio; Pastor, Óscar; Ruiz, Marcela
2012-01-01
This report describes and illustrates several modelling techniques proposed by Communication Analysis; namely Communicative Event Diagram, Message Structures and Event Specification Templates. The Communicative Event Diagram is a business process modelling technique that adopts a communicational perspective by focusing on communicative interactions when describing the organizational work practice, instead of focusing on physical activities1; at this abstraction level, we refer to business activities as communicative events. Message Structures is a technique based on structured text that allows specifying the messages associated to communicative events. Event Specification Templates are a means to organise the requirements concerning a communicative event. This report can be useful to analysts and business process modellers in general, since, according to our industrial experience, it is possible to apply many Communication Analysis concepts, guidelines and criteria to other business process modelling notation...
Statistically tuned Gaussian background subtraction technique for UAV videos
Indian Academy of Sciences (India)
R Athi Lingam; K Senthil Kumar
2014-08-01
Background subtraction is one of the efficient techniques to segment the targets from non-informative background of a video. The traditional background subtraction technique suits for videos with static background whereas the video obtained from unmanned aerial vehicle has dynamic background. Here, we propose an algorithm with tuning factor and Gaussian update for surveillance videos that suits effectively for aerial videos. The tuning factor is optimized by extracting the statistical features of the input frames.With the optimized tuning factor and Gaussian update an adaptive Gaussian-based background subtraction technique is proposed. The algorithm involves modelling, update and subtraction phases. This running Gaussian average based background subtraction technique uses updation at both model generation phase and subtraction phase. The resultant video extracts the moving objects from the dynamic background. Sample videos of various properties such as cluttered background, small objects, moving background and multiple objects are considered for evaluation. The technique is statistically compared with frame differencing technique, temporal median method and mixture of Gaussian model and performance evaluation is done to check the effectiveness of the proposed technique after optimization for both static and dynamic videos.
Experimental Data Mining Techniques(Using Multiple Statistical Methods
Directory of Open Access Journals (Sweden)
Mustafa Zaidi
2012-05-01
Full Text Available This paper discusses the possible solutions of non-linear multivariable by experimental Data mining techniques using on orthogonal array. Taguchi method is a very useful technique to reduce the time and cost of the experiment but the ignoring all kind of interaction effects. The results are not much encouraging and motivate to study Laser cutting process of non-linear multivariable is modeled by one and two way analysis of variance also linear and non linear regression analysis. These techniques are used to explore better analysis techniques and improve the laser cutting quality by reducing process variations caused by controllable process parameters. The size of data set causes difficulties in modeling and simulation of the problem such as decision tree is useful technique but it is not able to predict better results. The results of analysis of variance are encouraging. Taguchi and regression normally optimizes input process parameters for single characteristics.
Statistical quality control through overall vibration analysis
Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, an evaluation of the quality results of the finished parts under different combinations of process variables is assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions such as chi squared, Shapiro-Wilks, symmetry, Kurtosis, Cochran, Hartlett, and Hartley and Krushal-Wallis. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence
Method for statistical data analysis of multivariate observations
Gnanadesikan, R
1997-01-01
A practical guide for multivariate statistical techniques-- now updated and revised In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte
A Survey on Statistical Based Single Channel Speech Enhancement Techniques
Directory of Open Access Journals (Sweden)
Sunnydayal. V
2014-11-01
Full Text Available Speech enhancement is a long standing problem with various applications like hearing aids, automatic recognition and coding of speech signals. Single channel speech enhancement technique is used for enhancement of the speech degraded by additive background noises. The background noise can have an adverse impact on our ability to converse without hindrance or smoothly in very noisy environments, such as busy streets, in a car or cockpit of an airplane. Such type of noises can affect quality and intelligibility of speech. This is a survey paper and its object is to provide an overview of speech enhancement algorithms so that enhance the noisy speech signal which is corrupted by additive noise. The algorithms are mainly based on statistical based approaches. Different estimators are compared. Challenges and Opportunities of speech enhancement are also discussed. This paper helps in choosing the best statistical based technique for speech enhancement
Vapor Pressure Data Analysis and Statistics
2016-12-01
VAPOR PRESSURE DATA ANALYSIS AND STATISTICS ECBC-TR-1422 Ann Brozena RESEARCH AND TECHNOLOGY DIRECTORATE...DATE XX-12-2016 2. REPORT TYPE Final 3. DATES COVERED (From - To) Nov 2015 – Apr 2016 4. TITLE Vapor Pressure Data Analysis and Statistics 5a...1 VAPOR PRESSURE DATA ANALYSIS AND STATISTICS 1. INTRODUCTION Knowledge of the vapor pressure of materials as a function of temperature is
Web Usage Statistics: Measurement Issues and Analytical Techniques.
Bertot, John Carlo; McClure, Charles R.; Moen, William E.; Rubin, Jeffrey
1997-01-01
One means of Web use evaluation is through analysis of server-generated log files. Various log file analysis techniques and issues are presented that are related to the interpretation of log file data. Study findings indicate a number of problems; recommendations and areas needing further research are outlined. (AEF)
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard
2003-01-01
. The statistical fits have generally been made using all data and the lower tail of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. The results show that the 2-parameter Weibull distribution gives the best...
Combining heuristic and statistical techniques in landslide hazard assessments
Cepeda, Jose; Schwendtner, Barbara; Quan, Byron; Nadim, Farrokh; Diaz, Manuel; Molina, Giovanni
2014-05-01
As a contribution to the Global Assessment Report 2013 - GAR2013, coordinated by the United Nations International Strategy for Disaster Reduction - UNISDR, a drill-down exercise for landslide hazard assessment was carried out by entering the results of both heuristic and statistical techniques into a new but simple combination rule. The data available for this evaluation included landslide inventories, both historical and event-based. In addition to the application of a heuristic method used in the previous editions of GAR, the availability of inventories motivated the use of statistical methods. The heuristic technique is largely based on the Mora & Vahrson method, which estimates hazard as the product of susceptibility and triggering factors, where classes are weighted based on expert judgment and experience. Two statistical methods were also applied: the landslide index method, which estimates weights of the classes for the susceptibility and triggering factors based on the evidence provided by the density of landslides in each class of the factors; and the weights of evidence method, which extends the previous technique to include both positive and negative evidence of landslide occurrence in the estimation of weights for the classes. One key aspect during the hazard evaluation was the decision on the methodology to be chosen for the final assessment. Instead of opting for a single methodology, it was decided to combine the results of the three implemented techniques using a combination rule based on a normalization of the results of each method. The hazard evaluation was performed for both earthquake- and rainfall-induced landslides. The country chosen for the drill-down exercise was El Salvador. The results indicate that highest hazard levels are concentrated along the central volcanic chain and at the centre of the northern mountains.
INTERNAL ENVIRONMENT ANALYSIS TECHNIQUES
Directory of Open Access Journals (Sweden)
Caescu Stefan Claudiu
2011-12-01
Full Text Available Theme The situation analysis, as a separate component of the strategic planning, involves collecting and analysing relevant types of information on the components of the marketing environment and their evolution on the one hand and also on the organization’s resources and capabilities on the other. Objectives of the Research The main purpose of the study of the analysis techniques of the internal environment is to provide insight on those aspects that are of strategic importance to the organization. Literature Review The marketing environment consists of two distinct components, the internal environment that is made from specific variables within the organization and the external environment that is made from variables external to the organization. Although analysing the external environment is essential for corporate success, it is not enough unless it is backed by a detailed analysis of the internal environment of the organization. The internal environment includes all elements that are endogenous to the organization, which are influenced to a great extent and totally controlled by it. The study of the internal environment must answer all resource related questions, solve all resource management issues and represents the first step in drawing up the marketing strategy. Research Methodology The present paper accomplished a documentary study of the main techniques used for the analysis of the internal environment. Results The special literature emphasizes that the differences in performance from one organization to another is primarily dependant not on the differences between the fields of activity, but especially on the differences between the resources and capabilities and the ways these are capitalized on. The main methods of analysing the internal environment addressed in this paper are: the analysis of the organizational resources, the performance analysis, the value chain analysis and the functional analysis. Implications Basically such
Institute of Scientific and Technical Information of China (English)
唐哲
2011-01-01
通过对19届世界杯相关技术的统计分析,探讨现代足球进攻、防守技术环节优秀的数据及足球比赛胜负相关因素的内在本质,旨在为足球教学、训练提供有参考价值的依据。%The article discusses the excellent data of offensive and defend technique link in modern football,and studies football game the victory or defeat related factor of inside essence of thing,through technique statistics and analysis in the 19th World Football Cup.It will become the database of the history for the World Football Cup,for the football teaching and training to provide teaching materials that have reference value.
Multivariate statistical analysis of wildfires in Portugal
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980 - 2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al, 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many multiple advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011 This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
A Statistical Analysis of Cryptocurrencies
Directory of Open Access Journals (Sweden)
Stephen Chan
2017-05-01
Full Text Available We analyze statistical properties of the largest cryptocurrencies (determined by market capitalization, of which Bitcoin is the most prominent example. We characterize their exchange rates versus the U.S. Dollar by fitting parametric distributions to them. It is shown that returns are clearly non-normal, however, no single distribution fits well jointly to all the cryptocurrencies analysed. We find that for the most popular currencies, such as Bitcoin and Litecoin, the generalized hyperbolic distribution gives the best fit, while for the smaller cryptocurrencies the normal inverse Gaussian distribution, generalized t distribution, and Laplace distribution give good fits. The results are important for investment and risk management purposes.
Asymptotic modal analysis and statistical energy analysis
Dowell, Earl H.
1988-07-01
Statistical Energy Analysis (SEA) is defined by considering the asymptotic limit of Classical Modal Analysis, an approach called Asymptotic Modal Analysis (AMA). The general approach is described for both structural and acoustical systems. The theoretical foundation is presented for structural systems, and experimental verification is presented for a structural plate responding to a random force. Work accomplished subsequent to the grant initiation focusses on the acoustic response of an interior cavity (i.e., an aircraft or spacecraft fuselage) with a portion of the wall vibrating in a large number of structural modes. First results were presented at the ASME Winter Annual Meeting in December, 1987, and accepted for publication in the Journal of Vibration, Acoustics, Stress and Reliability in Design. It is shown that asymptotically as the number of acoustic modes excited becomes large, the pressure level in the cavity becomes uniform except at the cavity boundaries. However, the mean square pressure at the cavity corner, edge and wall is, respectively, 8, 4, and 2 times the value in the cavity interior. Also it is shown that when the portion of the wall which is vibrating is near a cavity corner or edge, the response is significantly higher.
Statistical optimisation techniques in fatigue signal editing problem
Energy Technology Data Exchange (ETDEWEB)
Nopiah, Z. M.; Osman, M. H. [Fundamental Engineering Studies Unit Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM (Malaysia); Baharin, N.; Abdullah, S. [Department of Mechanical and Materials Engineering Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM (Malaysia)
2015-02-03
Success in fatigue signal editing is determined by the level of length reduction without compromising statistical constraints. A great reduction rate can be achieved by removing small amplitude cycles from the recorded signal. The long recorded signal sometimes renders the cycle-to-cycle editing process daunting. This has encouraged researchers to focus on the segment-based approach. This paper discusses joint application of the Running Damage Extraction (RDE) technique and single constrained Genetic Algorithm (GA) in fatigue signal editing optimisation.. In the first section, the RDE technique is used to restructure and summarise the fatigue strain. This technique combines the overlapping window and fatigue strain-life models. It is designed to identify and isolate the fatigue events that exist in the variable amplitude strain data into different segments whereby the retention of statistical parameters and the vibration energy are considered. In the second section, the fatigue data editing problem is formulated as a constrained single optimisation problem that can be solved using GA method. The GA produces the shortest edited fatigue signal by selecting appropriate segments from a pool of labelling segments. Challenges arise due to constraints on the segment selection by deviation level over three signal properties, namely cumulative fatigue damage, root mean square and kurtosis values. Experimental results over several case studies show that the idea of solving fatigue signal editing within a framework of optimisation is effective and automatic, and that the GA is robust for constrained segment selection.
Towards a Judgement-Based Statistical Analysis
Gorard, Stephen
2006-01-01
There is a misconception among social scientists that statistical analysis is somehow a technical, essentially objective, process of decision-making, whereas other forms of data analysis are judgement-based, subjective and far from technical. This paper focuses on the former part of the misconception, showing, rather, that statistical analysis…
Multistructure Statistical Model Applied To Factor Analysis
Bentler, Peter M.
1976-01-01
A general statistical model for the multivariate analysis of mean and covariance structures is described. Matrix calculus is used to develop the statistical aspects of one new special case in detail. This special case separates the confounding of principal components and factor analysis. (DEP)
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Statistical Power in Meta-Analysis
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Seasonal drought predictability in Portugal using statistical-dynamical techniques
Ribeiro, A. F. S.; Pires, C. A. L.
2016-08-01
Atmospheric forecasting and predictability are important to promote adaption and mitigation measures in order to minimize drought impacts. This study estimates hybrid (statistical-dynamical) long-range forecasts of the regional drought index SPI (3-months) over homogeneous regions from mainland Portugal, based on forecasts from the UKMO operational forecasting system, with lead-times up to 6 months. ERA-Interim reanalysis data is used for the purpose of building a set of SPI predictors integrating recent past information prior to the forecast launching. Then, the advantage of combining predictors with both dynamical and statistical background in the prediction of drought conditions at different lags is evaluated. A two-step hybridization procedure is performed, in which both forecasted and observed 500 hPa geopotential height fields are subjected to a PCA in order to use forecasted PCs and persistent PCs as predictors. A second hybridization step consists on a statistical/hybrid downscaling to the regional SPI, based on regression techniques, after the pre-selection of the statistically significant predictors. The SPI forecasts and the added value of combining dynamical and statistical methods are evaluated in cross-validation mode, using the R2 and binary event scores. Results are obtained for the four seasons and it was found that winter is the most predictable season, and that most of the predictive power is on the large-scale fields from past observations. The hybridization improves the downscaling based on the forecasted PCs, since they provide complementary information (though modest) beyond that of persistent PCs. These findings provide clues about the predictability of the SPI, particularly in Portugal, and may contribute to the predictability of crops yields and to some guidance on users (such as farmers) decision making process.
Statistical analysis with Excel for dummies
Schmuller, Joseph
2013-01-01
Take the mystery out of statistical terms and put Excel to work! If you need to create and interpret statistics in business or classroom settings, this easy-to-use guide is just what you need. It shows you how to use Excel's powerful tools for statistical analysis, even if you've never taken a course in statistics. Learn the meaning of terms like mean and median, margin of error, standard deviation, and permutations, and discover how to interpret the statistics of everyday life. You'll learn to use Excel formulas, charts, PivotTables, and other tools to make sense of everything fro
Directory of Open Access Journals (Sweden)
Hammad Dabo Baba
2014-01-01
Full Text Available One of the most significant step in building structure maintenance decision is the physical inspection of the facility to be maintained. The physical inspection involved cursory assessment of the structure and ratings of the identified defects based on expert evaluation. The objective of this paper is to describe present a novel approach to prioritizing the criticality of physical defects in a residential building system using multi criteria decision analysis approach. A residential building constructed in 1985 was considered in this study. Four criteria which includes; Physical Condition of the building system (PC, Effect on Asset (EA, effect on Occupants (EO and Maintenance Cost (MC are considered in the inspection. The building was divided in to nine systems regarded as alternatives. Expert's choice software was used in comparing the importance of the criteria against the main objective, whereas structured Proforma was used in quantifying the defects observed on all building systems against each criteria. The defects severity score of each building system was identified and later multiplied by the weight of the criteria and final hierarchy was derived. The final ranking indicates that, electrical system was considered the most critical system with a risk value of 0.134 while ceiling system scored the lowest risk value of 0.066. The technique is often used in prioritizing mechanical equipment for maintenance planning. However, result of this study indicates that the technique could be used in prioritizing building systems for maintenance planning
Bruck, Aaron D.
Qualitative research methods were used in a previous study to discover the goals of faculty members teaching undergraduate laboratories. Assertions about the goals and the unique characteristics of innovative lab programs were developed from categories that emerged from the interviews. The purpose of the present research was to create a survey instrument to measure the prevalence of these themes and faculty goals for undergraduate laboratories with a national sample. This was achieved through a two-stage process that utilized a pilot survey to determine the factor structure and reduce the number of survey items to a manageable size. Once the number of survey questions was reduced, the full survey was given to a national sample of undergraduate laboratory faculty. The 312 responses to the survey were then analyzed using factor analysis. Comparative analyses were conducted using analysis of variance (ANOVA). This dissertation focuses on the processes involved in the creation of this survey and the subsequent analyses of the data the survey produced. The results of these analyses and the implications of this research will also be discussed.
Asfahani, J.; Al-Hent, R.; Aissa, M.
2016-02-01
A scored lithological map including 10 radiometric units is established through applying factor analysis approach to aerial spectrometric data of Area-1, Syrian desert, which includes Ur, eU, eTh, K%, eU/eTh, eU/K%, and eTh/K%. A model of four rotated factors F1, F2, F3, and F4 is adapted for representing 234,829 data measured points in Area-1, where 86% of total data variance is interpreted. A geological scored pseudo-section derived from the lithological scored map is established and analyzed in order to show the possible stratigraphic and structural traps for uranium occurrences associated with phosphate deposits in the studied Area-1. These identified traps presented in this paper need detailed investigation and must be necessarily followed and checked by ground validations and subsurface well logging, in order to locate the anomalous uranium occurrences and explore with more confidence and certitude their characteristics as a function of depth.
Indian Academy of Sciences (India)
J Asfahani; R Al-Hent; M Aissa
2016-02-01
A scored lithological map including 10 radiometric units is established through applying factor analysis approach to aerial spectrometric data of Area-1, Syrian desert, which includes Ur, eU, eTh, K%, eU/eTh, eU/K%, and eTh/K%. A model of four rotated factors F1, F2, F3, and F4 is adapted for representing 234,829 data measured points in Area-1, where 86% of total data variance is interpreted. A geological scored pseudo-section derived from the lithological scored map is established and analyzed in order to show the possible stratigraphic and structural traps for uranium occurrences associated with phosphate deposits in the studied Area-1. These identified traps presented in this paper need detailed investigation and must be necessarily followed and checked by ground validations and subsurface well logging, in order to locate the anomalous uranium occurrences and explore with more confidence and certitude their characteristics as a function of depth.
Collados-Lara, Antonio-Juan; Pulido-Velazquez, David; Pardo-Iguzquiza, Eulogio
2017-04-01
and drought statistic of the historical data. A multi-objective analysis using basic statistics (mean, standard deviation and asymmetry coefficient) and droughts statistics (duration, magnitude and intensity) has been performed to identify which models are better in terms of goodness of fit to reproduce the historical series. The drought statistics have been obtained from the Standard Precipitation index (SPI) series using the Theory of Runs. This analysis allows discriminate the best RCM and the best combination of model and correction technique in the bias-correction method. We have also analyzed the possibilities of using different Stochastic Weather Generators to approximate the basic and droughts statistics of the historical series. These analyses have been performed in our case study in a lumped and in a distributed way in order to assess its sensibility to the spatial scale. The statistic of the future temperature series obtained with different ensemble options are quite homogeneous, but the precipitation shows a higher sensibility to the adopted method and spatial scale. The global increment in the mean temperature values are 31.79 %, 31.79 %, 31.03 % and 31.74 % for the distributed bias-correction, distributed delta-change, lumped bias-correction and lumped delta-change ensembles respectively and in the precipitation they are -25.48 %, -28.49 %, -26.42 % and -27.35% respectively. Acknowledgments: This research work has been partially supported by the GESINHIMPADAPT project (CGL2013-48424-C2-2-R) with Spanish MINECO funds. We would also like to thank Spain02 and CORDEX projects for the data provided for this study and the R package qmap.
Explorations in Statistics: The Analysis of Change
Curran-Everett, Douglas; Williams, Calvin L.
2015-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…
Analysis of Preference Data Using Intermediate Test Statistic Abstract
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... Intermediate statistic is a link between Friedman test statistic and the multinomial statistic. The statistic is ... The null hypothesis Ho .... [7] Taplin, R.H., The Statistical Analysis of Preference Data, Applied Statistics, No. 4, pp.
Hypothesis testing and statistical analysis of microbiome
Directory of Open Access Journals (Sweden)
Yinglin Xia
2017-09-01
Full Text Available After the initiation of Human Microbiome Project in 2008, various biostatistic and bioinformatic tools for data analysis and computational methods have been developed and applied to microbiome studies. In this review and perspective, we discuss the research and statistical hypotheses in gut microbiome studies, focusing on mechanistic concepts that underlie the complex relationships among host, microbiome, and environment. We review the current available statistic tools and highlight recent progress of newly developed statistical methods and models. Given the current challenges and limitations in biostatistic approaches and tools, we discuss the future direction in developing statistical methods and models for the microbiome studies.
Statistical Analysis of English Noun Suffixes
Institute of Scientific and Technical Information of China (English)
王惠灵
2014-01-01
This study discusses the origin of English noun suffixes,including Old English,Latin,French and Greek.A statistical analysis of some typical noun suffixes is followed in two corpora,which focuses on their frequency and distribution.
Statistical Analysis Of Reconnaissance Geochemical Data From ...
African Journals Online (AJOL)
Statistical Analysis Of Reconnaissance Geochemical Data From Orle District, ... The univariate methods used include frequency distribution and cumulative ... The possible mineral potential of the area include base metals (Pb, Zn, Cu, Mo, etc.) ...
Statistical evaluation of diagnostic performance topics in ROC analysis
Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E
2016-01-01
Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be in used in the same document. (2) Documents can be prepared using OpenOffice or ......This paper describes the system for making reproducible statistical analyses. differs from other systems for reproducible analysis in several ways. The two main differences are: (1) Several statistics programs can be in used in the same document. (2) Documents can be prepared using Open......Office or \\LaTeX. The main part of this paper is an example showing how to use and together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics....
Advances in statistical models for data analysis
Minerva, Tommaso; Vichi, Maurizio
2015-01-01
This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.
Statistical Analysis of Data for Timber Strengths
DEFF Research Database (Denmark)
Sørensen, John Dalsgaard; Hoffmeyer, P.
Statistical analyses are performed for material strength parameters from approximately 6700 specimens of structural timber. Non-parametric statistical analyses and fits to the following distributions types have been investigated: Normal, Lognormal, 2 parameter Weibull and 3-parameter Weibull....... The statistical fits have generally been made using all data (100%) and the lower tail (30%) of the data. The Maximum Likelihood Method and the Least Square Technique have been used to estimate the statistical parameters in the selected distributions. 8 different databases are analysed. The results show that 2......-parameter Weibull (and Normal) distributions give the best fits to the data available, especially if tail fits are used whereas the LogNormal distribution generally gives poor fit and larger coefficients of variation, especially if tail fits are used....
Schmidt decomposition and multivariate statistical analysis
Bogdanov, Yu. I.; Bogdanova, N. A.; Fastovets, D. V.; Luckichev, V. F.
2016-12-01
The new method of multivariate data analysis based on the complements of classical probability distribution to quantum state and Schmidt decomposition is presented. We considered Schmidt formalism application to problems of statistical correlation analysis. Correlation of photons in the beam splitter output channels, when input photons statistics is given by compound Poisson distribution is examined. The developed formalism allows us to analyze multidimensional systems and we have obtained analytical formulas for Schmidt decomposition of multivariate Gaussian states. It is shown that mathematical tools of quantum mechanics can significantly improve the classical statistical analysis. The presented formalism is the natural approach for the analysis of both classical and quantum multivariate systems and can be applied in various tasks associated with research of dependences.
Statistics and analysis of scientific data
Bonamente, Massimiliano
2013-01-01
Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasheets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...
Foundation of statistical energy analysis in vibroacoustics
Le Bot, A
2015-01-01
This title deals with the statistical theory of sound and vibration. The foundation of statistical energy analysis is presented in great detail. In the modal approach, an introduction to random vibration with application to complex systems having a large number of modes is provided. For the wave approach, the phenomena of propagation, group speed, and energy transport are extensively discussed. Particular emphasis is given to the emergence of diffuse field, the central concept of the theory.
Basics, common errors and essentials of statistical tools and techniques in anesthesiology research.
Bajwa, Sukhminder Jit Singh
2015-01-01
The statistical portion is a vital component of any research study. The research methodology and the application of statistical tools and techniques have evolved over the years and have significantly helped the research activities throughout the globe. The results and inferences are not accurately possible without proper validation with various statistical tools and tests. The evidencebased anesthesia research and practice has to incorporate statistical tools in the methodology right from the planning stage of the study itself. Though the medical fraternity is well acquainted with the significance of statistics in research, there is a lack of in-depth knowledge about the various statistical concepts and principles among majority of the researchers. The clinical impact and consequences can be serious as the incorrect analysis, conclusions, and false results may construct an artificial platform on which future research activities are replicated. The present tutorial is an attempt to make anesthesiologists aware of the various aspects of statistical methods used in evidence-based research and also to highlight the common areas where maximum number of statistical errors are committed so as to adopt better statistical practices.
Basics, common errors and essentials of statistical tools and techniques in anesthesiology research
Bajwa, Sukhminder Jit Singh
2015-01-01
The statistical portion is a vital component of any research study. The research methodology and the application of statistical tools and techniques have evolved over the years and have significantly helped the research activities throughout the globe. The results and inferences are not accurately possible without proper validation with various statistical tools and tests. The evidencebased anesthesia research and practice has to incorporate statistical tools in the methodology right from the planning stage of the study itself. Though the medical fraternity is well acquainted with the significance of statistics in research, there is a lack of in-depth knowledge about the various statistical concepts and principles among majority of the researchers. The clinical impact and consequences can be serious as the incorrect analysis, conclusions, and false results may construct an artificial platform on which future research activities are replicated. The present tutorial is an attempt to make anesthesiologists aware of the various aspects of statistical methods used in evidence-based research and also to highlight the common areas where maximum number of statistical errors are committed so as to adopt better statistical practices. PMID:26702217
Statistical and machine learning approaches for network analysis
Dehmer, Matthias
2012-01-01
Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation
The use of statistical techniques in par-level management.
Klee, W B
1994-02-01
The total quality management movement has allowed the reintroduction of statistics in the materials management workplace. Statistical methods can be applied to the par level management process with significant results.
Instant Replay: Investigating statistical Analysis in Sports
Sidhu, Gagan
2011-01-01
Technology has had an unquestionable impact on the way people watch sports. As technology has evolved, so too has the knowledge of a casual sports fan. A direct result of this evolution is the amount of statistical analysis in sport. The goal of statistical analysis in sports is a simple one: to eliminate subjective analysis. Over the past four decades, statistics have slowly pervaded the viewing experience of sports. In this paper, we analyze previous work that proposed metrics and models that seek to evaluate various aspects of sports. The unifying goal of these works is an accurate representation of either the player or sport. We also look at work that investigates certain situations and their impact on the outcome a game. We conclude this paper with the discussion of potential future work in certain areas of sport..
Statistical mechanics of sensing and communications: Insights and techniques
Energy Technology Data Exchange (ETDEWEB)
Murayama, T; Davis, P [NTT Communication Science Laboratories, NIPPON TELEGRAPH AND TELEPHONE CORPORATION, 2-4, Hikaridai, Seika-cho, ' Keihanna Science City' , Kyoto 619-0237 (Japan)], E-mail: murayama@cslab.kecl.ntt.co.jp, E-mail: davis@cslab.kecl.ntt.co.jp
2008-01-15
In this article we review a basic model for analysis of large sensor networks from the point of view of collective estimation under bandwidth constraints. We compare different sensing aggregation levels as alternative 'strategies' for collective estimation: moderate aggregation from a moderate number of sensors for which communication bandwidth is enough that data encoding can be reversible, and large scale aggregation from very many sensors - in which case communication bandwidth constraints require the use of nonreversible encoding. We show the non-trivial trade-off between sensing quality, which can be increased by increasing the number of sensors, and communication quality under bandwidth constraints, which decreases if the number of sensors is too large. From a practical standpoint, we verify that such a trade-off exists in constructively defined communications schemes. We introduce a probabilistic encoding scheme and define rate distortion models that are suitable for analysis of the large network limit. Our description shows that the methods and ideas from statistical physics can play an important role in formulating effective models for such schemes.
Statistical analysis of network data with R
Kolaczyk, Eric D
2014-01-01
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Directory of Open Access Journals (Sweden)
Adrion Christine
2012-09-01
Full Text Available Abstract Background A statistical analysis plan (SAP is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. To write a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on and justification of many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex and not easily implemented and have several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP. Methods We focus on generalized linear mixed models (GLMMs for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLAs. The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC or probability integral transform (PIT, and by using proper scoring rules (e.g. the logarithmic score. Results The instruments under study
About Statistical Analysis of Qualitative Survey Data
Directory of Open Access Journals (Sweden)
Stefan Loehnert
2010-01-01
Full Text Available Gathered data is frequently not in a numerical form allowing immediate appliance of the quantitative mathematical-statistical methods. In this paper are some basic aspects examining how quantitative-based statistical methodology can be utilized in the analysis of qualitative data sets. The transformation of qualitative data into numeric values is considered as the entrance point to quantitative analysis. Concurrently related publications and impacts of scale transformations are discussed. Subsequently, it is shown how correlation coefficients are usable in conjunction with data aggregation constrains to construct relationship modelling matrices. For illustration, a case study is referenced at which ordinal type ordered qualitative survey answers are allocated to process defining procedures as aggregation levels. Finally options about measuring the adherence of the gathered empirical data to such kind of derived aggregation models are introduced and a statistically based reliability check approach to evaluate the reliability of the chosen model specification is outlined.
The fuzzy approach to statistical analysis
Coppi, Renato; Gil, Maria A.; Kiers, Henk A. L.
2006-01-01
For the last decades, research studies have been developed in which a coalition of Fuzzy Sets Theory and Statistics has been established with different purposes. These namely are: (i) to introduce new data analysis problems in which the objective involves either fuzzy relationships or fuzzy terms;
Bayesian Statistics for Biological Data: Pedigree Analysis
Stanfield, William D.; Carlton, Matthew A.
2004-01-01
The use of Bayes' formula is applied to the biological problem of pedigree analysis to show that the Bayes' formula and non-Bayesian or "classical" methods of probability calculation give different answers. First year college students of biology can be introduced to the Bayesian statistics.
Selected papers on analysis, probability, and statistics
Nomizu, Katsumi
1994-01-01
This book presents papers that originally appeared in the Japanese journal Sugaku. The papers fall into the general area of mathematical analysis as it pertains to probability and statistics, dynamical systems, differential equations and analytic function theory. Among the topics discussed are: stochastic differential equations, spectra of the Laplacian and Schrödinger operators, nonlinear partial differential equations which generate dissipative dynamical systems, fractal analysis on self-similar sets and the global structure of analytic functions.
Multivariate statistical analysis of precipitation chemistry in Northwestern Spain
Energy Technology Data Exchange (ETDEWEB)
Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)
1993-07-01
149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs.
Institute of Scientific and Technical Information of China (English)
巩庆波; 徐兰君
2014-01-01
Based on the statistical analysis on attack and defense techniques of men ’ s basketball in Beijing and London Olympic Games, the overall defense and attack quality has been improved in London Olympics comparing with the Olympics in Beijing , but the behavior of Chinese men ’ s basketball team fell significantly .Using factor and cluster analysis , the technical statistic index can be divided into 4 factors:offense factor , response factor , psychological factor and physical factor .The conclusion that those 4 fac-tors are the main factors determining the scores of a team can be made .In terms of the cluster analysis , the US team can be classi-fied as a group .However , the grouping of other teams varies in those two Olympics .Chinese team with some top European teams like Spanish team will be classifies as the same group because of their apparent similarities .Therefore, the urgent problems are that we should increase our communications with these top European teams , improve our own members ’ strength quality , and raise our technical and tactical capabilities under strong antagonistic conditions .%统计分析北京、伦敦两届奥运会男篮比赛相关攻防技术指标，伦敦奥运会男篮整体攻防质量较北京奥运会有了提高，中国男篮下滑明显。运用因子与聚类分析法检验分析所得数据，把相关技术指标分为：进攻因子、反应因子、心理因子和身体因子四个因子，得出这四个因子是决定球队成绩主要因子的结论。聚类分析结果：美国队独归为一类；其他队伍两届奥运会归类有所变化；中国队与以西班牙为代表的欧洲强队归为一类，相似特征明显。加强与欧洲强队学习交流，提高队员力量素质及在强对抗条件下自身的技战术发挥能力是我们急需解决的问题。
Statistical Analysis of Thermal Analysis Margin
Garrison, Matthew B.
2011-01-01
NASA Goddard Space Flight Center requires that each project demonstrate a minimum of 5 C margin between temperature predictions and hot and cold flight operational limits. The bounding temperature predictions include worst-case environment and thermal optical properties. The purpose of this work is to: assess how current missions are performing against their pre-launch bounding temperature predictions and suggest any possible changes to the thermal analysis margin rules
Spatial Analysis Along Networks Statistical and Computational Methods
Okabe, Atsuyuki
2012-01-01
In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Al-Tufail, M; Akram, M; Haq, A
1999-03-01
The method previously used in the Toxicology Laboratories of King Faisal Specialist Hospital and Research Center for determining the zinc concentration in serum by Zeeman atomic absorption spectrometer was improved by modifying the matrix modifier and by changing the heated graphite furnace atomization (HGA) program. After trying several methods we failed to achieve the required precision and the accuracy of methods for serum zinc determination. We changed the matrix modifier to a fifty percent mixture (v/v) of 3.90 grams per liter of ammonium phosphate in Type 1 water with 0.2% nitric acid and 1.0 gram per liter of magnesium nitrate in acidic water (0.2% HNO3) with 0.1% triton X-100 was used as matrix modifier. A twenty-five fold dilution of the sample in matrix modifier was injected on the L'vov's platform of the furnace. In order to reduce the high sensitivity of Zn the furnace program was modified. The method is found very robust. The average reproducibility between inter-runs and intra-run is less than 1.59% with a high degree of accuracy. We used two levels of controls i.e. normal or low level and abnormal or high level. The linearity and the detection limit of the assay were 0.9992 and 0.010 micromol/L respectively. Average recovery of the analyte was 98.65%. The X-Bar and R charts were constructed by using Shewhart's statistical analysis technique to assess the test methodology. It was found that the assay is capable and stable for routine clinical and research analysis. The capability index (C(P)) of the assay, an indicator of the precision, was calculated.
Statistical Tools for Forensic Analysis of Toolmarks
Energy Technology Data Exchange (ETDEWEB)
David Baldwin; Max Morris; Stan Bajic; Zhigang Zhou; James Kreiser
2004-04-22
Recovery and comparison of toolmarks, footprint impressions, and fractured surfaces connected to a crime scene are of great importance in forensic science. The purpose of this project is to provide statistical tools for the validation of the proposition that particular manufacturing processes produce marks on the work-product (or tool) that are substantially different from tool to tool. The approach to validation involves the collection of digital images of toolmarks produced by various tool manufacturing methods on produced work-products and the development of statistical methods for data reduction and analysis of the images. The developed statistical methods provide a means to objectively calculate a ''degree of association'' between matches of similarly produced toolmarks. The basis for statistical method development relies on ''discriminating criteria'' that examiners use to identify features and spatial relationships in their analysis of forensic samples. The developed data reduction algorithms utilize the same rules used by examiners for classification and association of toolmarks.
Statistical Analysis of Iberian Peninsula Megaliths Orientations
González-García, A. C.
2009-08-01
Megalithic monuments have been intensively surveyed and studied from the archaeoastronomical point of view in the past decades. We have orientation measurements for over one thousand megalithic burial monuments in the Iberian Peninsula, from several different periods. These data, however, lack a sound understanding. A way to classify and start to understand such orientations is by means of statistical analysis of the data. A first attempt is done with simple statistical variables and a mere comparison between the different areas. In order to minimise the subjectivity in the process a further more complicated analysis is performed. Some interesting results linking the orientation and the geographical location will be presented. Finally I will present some models comparing the orientation of the megaliths in the Iberian Peninsula with the rising of the sun and the moon at several times of the year.
Multivariate analysis: A statistical approach for computations
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a type of multivariate statistical approach commonly used in, automotive diagnosis, education evaluating clusters in finance etc and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval method and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult due to the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomaly behaviors in the network include the various attacks on the network like DDOs attacks and network scanning.
Statistical tools for the calibration of traffic conflicts techniques.
Oppe, S.
1982-01-01
To compare the results of various conflict techniques from different countries, an international experiment took place in Rouen in 1979. The experiment showed that, in general, with each technique the same conclusions were reached with regard to the problems of safety at two intersections in Rouen.
Surface analysis the principal techniques
Vickerman, John C
2009-01-01
This completely updated and revised second edition of Surface Analysis: The Principal Techniques, deals with the characterisation and understanding of the outer layers of substrates, how they react, look and function which are all of interest to surface scientists. Within this comprehensive text, experts in each analysis area introduce the theory and practice of the principal techniques that have shown themselves to be effective in both basic research and in applied surface analysis. Examples of analysis are provided to facilitate the understanding of this topic and to show readers how they c
Data analysis using the Gnu R system for statistical computation
Energy Technology Data Exchange (ETDEWEB)
Simone, James; /Fermilab
2011-07-01
R is a language system for statistical computation. It is widely used in statistics, bioinformatics, machine learning, data mining, quantitative finance, and the analysis of clinical drug trials. Among the advantages of R are: it has become the standard language for developing statistical techniques, it is being actively developed by a large and growing global user community, it is open source software, it is highly portable (Linux, OS-X and Windows), it has a built-in documentation system, it produces high quality graphics and it is easily extensible with over four thousand extension library packages available covering statistics and applications. This report gives a very brief introduction to R with some examples using lattice QCD simulation results. It then discusses the development of R packages designed for chi-square minimization fits for lattice n-pt correlation functions.
Wavelet analysis in ecology and epidemiology: impact of statistical tests.
Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario
2014-02-06
Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the 'beta-surrogate' method.
Correlation techniques and measurements of wave-height statistics
Guthart, H.; Taylor, W. C.; Graf, K. A.; Douglas, D. G.
1972-01-01
Statistical measurements of wave height fluctuations have been made in a wind wave tank. The power spectral density function of temporal wave height fluctuations evidenced second-harmonic components and an f to the minus 5th power law decay beyond the second harmonic. The observations of second harmonic effects agreed very well with a theoretical prediction. From the wave statistics, surface drift currents were inferred and compared to experimental measurements with satisfactory agreement. Measurements were made of the two dimensional correlation coefficient at 15 deg increments in angle with respect to the wind vector. An estimate of the two-dimensional spatial power spectral density function was also made.
Hamer, Harold A.; Mayer, John P.; Huston, Wilber B.
1961-01-01
Results of a statistical analysis of horizontal-tail loads on a fighter airplane are presented. The data were obtained from a number of operational training missions with flight at altitudes up to about 50,000 feet and at Mach numbers up to 1.22. The analysis was performed to determine the feasibility of calculating horizontal-tail load from data on the flight conditions and airplane motions. In the analysis the calculated loads are compared with the measured loads for the different types of missions performed. The loads were calculated by two methods: a direct approach and a Monte Carlo technique. The procedures used and some of the problems associated with the data analysis are discussed. frequencies of occurrence of tail loads of given magnitudes are derived from statistical information on the flight quantities. In the direct method, a time history of tail load is calculated from time-history measurements of the flight quantities. The Monte Carlo method could be useful for extending loads information for design of prospective airplanes . For the Monte Carlo method, the The results indicate that the accuracy of loads, regardless of the method used for calculation, is largely dependent on the knowledge of the pertinent airplane aerodynamic characteristics and center-of-gravity location. In addition, reliable Monte Carlo results require an adequate sample of statistical data and a knowledge of the more important statistical dependencies between the various flight conditions and airplane motions.
Energy Technology Data Exchange (ETDEWEB)
Ren, Qingguo, E-mail: renqg83@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Dewan, Sheilesh Kumar, E-mail: sheilesh_d1@hotmail.com [Department of Geriatrics, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Li, Ming, E-mail: minli77@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Li, Jianying, E-mail: Jianying.Li@med.ge.com [CT Imaging Research Center, GE Healthcare China, Beijing (China); Mao, Dingbiao, E-mail: maodingbiao74@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Wang, Zhenglei, E-mail: Williswang_doc@yahoo.com.cn [Department of Radiology, Shanghai Electricity Hospital, Shanghai 200050 (China); Hua, Yanqing, E-mail: cjr.huayanqing@vip.163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China)
2012-10-15
Purpose: To compare image quality and visualization of normal structures and lesions in brain computed tomography (CT) with adaptive statistical iterative reconstruction (ASIR) and filtered back projection (FBP) reconstruction techniques in different X-ray tube current–time products. Materials and methods: In this IRB-approved prospective study, forty patients (nineteen men, twenty-one women; mean age 69.5 ± 11.2 years) received brain scan at different tube current–time products (300 and 200 mAs) in 64-section multi-detector CT (GE, Discovery CT750 HD). Images were reconstructed with FBP and four levels of ASIR-FBP blending. Two radiologists (please note that our hospital is renowned for its geriatric medicine department, and these two radiologists are more experienced in chronic cerebral vascular disease than in neoplastic disease, so this research did not contain cerebral tumors but as a discussion) assessed all the reconstructed images for visibility of normal structures, lesion conspicuity, image contrast and diagnostic confidence in a blinded and randomized manner. Volume CT dose index (CTDI{sub vol}) and dose-length product (DLP) were recorded. All the data were analyzed by using SPSS 13.0 statistical analysis software. Results: There was no statistically significant difference between the image qualities at 200 mAs with 50% ASIR blending technique and 300 mAs with FBP technique (p > .05). While between the image qualities at 200 mAs with FBP and 300 mAs with FBP technique a statistically significant difference (p < .05) was found. Conclusion: ASIR provided same image quality and diagnostic ability in brain imaging with greater than 30% dose reduction compared with FBP reconstruction technique.
Statistical techniques for sampling and monitoring natural resources
Hans T. Schreuder; Richard Ernst; Hugo Ramirez-Maldonado
2004-01-01
We present the statistical theory of inventory and monitoring from a probabilistic point of view. We start with the basics and show the interrelationships between designs and estimators illustrating the methods with a small artificial population as well as with a mapped realistic population. For such applications, useful open source software is given in Appendix 4....
Velocity field statistics and tessellation techniques : Unbiased estimators of Omega
Van de Weygaert, R; Bernardeau, F; Muller,; Gottlober, S; Mucket, JP; Wambsganss, J
1998-01-01
We describe two new - stochastic-geometrical - methods to obtain reliable velocity field statistics from N-body simulations and from any general density and velocity fluctuation field sampled at a discrete set of locations. These methods, the Voronoi tessellation method and Delaunay tessellation met
GNSS Spoofing Detection Based on Signal Power Measurements: Statistical Analysis
Directory of Open Access Journals (Sweden)
V. Dehghanian
2012-01-01
Full Text Available A threat to GNSS receivers is posed by a spoofing transmitter that emulates authentic signals but with randomized code phase and Doppler values over a small range. Such spoofing signals can result in large navigational solution errors that are passed onto the unsuspecting user with potentially dire consequences. An effective spoofing detection technique is developed in this paper, based on signal power measurements and that can be readily applied to present consumer grade GNSS receivers with minimal firmware changes. An extensive statistical analysis is carried out based on formulating a multihypothesis detection problem. Expressions are developed to devise a set of thresholds required for signal detection and identification. The detection processing methods developed are further manipulated to exploit incidental antenna motion arising from user interaction with a GNSS handheld receiver to further enhance the detection performance of the proposed algorithm. The statistical analysis supports the effectiveness of the proposed spoofing detection technique under various multipath conditions.
Statistical mechanics analysis of sparse data.
Habeck, Michael
2011-03-01
Inferential structure determination uses Bayesian theory to combine experimental data with prior structural knowledge into a posterior probability distribution over protein conformational space. The posterior distribution encodes everything one can say objectively about the native structure in the light of the available data and additional prior assumptions and can be searched for structural representatives. Here an analogy is drawn between the posterior distribution and the canonical ensemble of statistical physics. A statistical mechanics analysis assesses the complexity of a structure calculation globally in terms of ensemble properties. Analogs of the free energy and density of states are introduced; partition functions evaluate the consistency of prior assumptions with data. Critical behavior is observed with dwindling restraint density, which impairs structure determination with too sparse data. However, prior distributions with improved realism ameliorate the situation by lowering the critical number of observations. An in-depth analysis of various experimentally accessible structural parameters and force field terms will facilitate a statistical approach to protein structure determination with sparse data that avoids bias as much as possible.
Statistical analysis of sleep spindle occurrences.
Panas, Dagmara; Malinowska, Urszula; Piotrowski, Tadeusz; Żygierewicz, Jarosław; Suffczyński, Piotr
2013-01-01
Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
Statistical error analysis of reactivity measurement
Energy Technology Data Exchange (ETDEWEB)
Thammaluckan, Sithisak; Hah, Chang Joo [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)
2013-10-15
After statistical analysis, it was confirmed that each group were sampled from same population. It is observed in Table 7 that the mean error decreases as core size increases. Application of bias factor obtained from this research reduces mean error further. The point kinetic model had been used to measure control rod worth without 3D spatial information of neutron flux or power distribution, which causes inaccurate result. Dynamic Control rod Reactivity Measurement (DCRM) was employed to take into account of 3D spatial information of flux in the point kinetics model. The measured bank worth probably contains some uncertainty such as methodology uncertainty and measurement uncertainty. Those uncertainties may varies with size of core and magnitude of reactivity. The goal of this research is to investigate the effect of core size and magnitude of control rod worth on the error of reactivity measurement using statistics.
Adaptive Steganography: A survey of Recent Statistical Aware Steganography Techniques
Directory of Open Access Journals (Sweden)
Manish Mahajan
2012-09-01
Full Text Available Steganography is the science that deals with hiding of secret data in some carrier media which may be image, audio, formatted text or video. The main idea behind this is to conceal the very existence of data. We will be dealing here with image steganography. Many algorithms have been proposed for this purpose in spatial & frequency domain. But in almost all the algorithms it has been noticed that as we embed the secret data in the image the certain characteristics or statistics of the image get disturbed. Based on these disturbed statistics steganalysts can get the reflection about the existence of secret data which they further decode with the help of available steganalytic tools. Steganalysis is a science of attacking the hidden data to get an authorized access. Although steganalysis is not a part of this work but it may be sometimes discussed as a part of literature. Even in steganography we are not purely concerned with spatial or frequency domain rather our main emphasis is on adaptive steganography or model based steganography. Adaptive steganography is not entirely a new branch of steganography rather it is based upon spatial & frequency domain with an additional layer of mathematical model. So here we will be dealing with adaptive steganography which take care about the important characteristics & statistics of the cover image well in advance to the embedding of secret data so that the disturbance of image statistics as mentioned earlier, which attracts the forgery or unauthorized access, can be minimized. In this survey we will analyze the various steganography algorithms which are based upon certain mathematical model or in other words algorithms which come under the category of model based steganography.
Ajorlo, Majid; Abdullah, Ramdzani B; Yusoff, Mohd Kamil; Halim, Ridzwan Abd; Hanif, Ahmad Husni Mohd; Willms, Walter D; Ebrahimian, Mahboubeh
2013-10-01
This study investigates the applicability of multivariate statistical techniques including cluster analysis (CA), discriminant analysis (DA), and factor analysis (FA) for the assessment of seasonal variations in the surface water quality of tropical pastures. The study was carried out in the TPU catchment, Kuala Lumpur, Malaysia. The dataset consisted of 1-year monitoring of 14 parameters at six sampling sites. The CA yielded two groups of similarity between the sampling sites, i.e., less polluted (LP) and moderately polluted (MP) at temporal scale. Fecal coliform (FC), NO3, DO, and pH were significantly related to the stream grouping in the dry season, whereas NH3, BOD, Escherichia coli, and FC were significantly related to the stream grouping in the rainy season. The best predictors for distinguishing clusters in temporal scale were FC, NH3, and E. coli, respectively. FC, E. coli, and BOD with strong positive loadings were introduced as the first varifactors in the dry season which indicates the biological source of variability. EC with a strong positive loading and DO with a strong negative loading were introduced as the first varifactors in the rainy season, which represents the physiochemical source of variability. Multivariate statistical techniques were effective analytical techniques for classification and processing of large datasets of water quality and the identification of major sources of water pollution in tropical pastures.
Design, data analysis and sampling techniques for clinical research
Karthik Suresh; Sanjeev V Thomas; Geetha Suresh
2011-01-01
Statistical analysis is an essential technique that enables a medical research practitioner to draw meaningful inference from their data analysis. Improper application of study design and data analysis may render insufficient and improper results and conclusion. Converting a medical problem into a statistical hypothesis with appropriate methodological and logical design and then back-translating the statistical results into relevant medical knowledge is a real challenge. This article explains...
Statistical techniques for the characterization of partially observed epidemics.
Energy Technology Data Exchange (ETDEWEB)
Safta, Cosmin; Ray, Jaideep; Crary, David (Applied Research Associates, Inc, Arlington, VA); Cheng, Karen (Applied Research Associates, Inc, Arlington, VA)
2010-11-01
Techniques appear promising to construct and integrate automated detect-and-characterize technique for epidemics - Working off biosurveillance data, and provides information on the particular/ongoing outbreak. Potential use - in crisis management and planning, resource allocation - Parameter estimation capability ideal for providing the input parameters into an agent-based model, Index Cases, Time of Infection, infection rate. Non-communicable diseases are easier than communicable ones - Small anthrax can be characterized well with 7-10 days of data, post-detection; plague takes longer, Large attacks are very easy.
On quantum statistics in data analysis
Pavlovic, Dusko
2008-01-01
Originally, quantum probability theory was developed to analyze statistical phenomena in quantum systems, where classical probability theory does not apply, because the lattice of measurable sets is not necessarily distributive. On the other hand, it is well known that the lattices of concepts, that arise in data analysis, are in general also non-distributive, albeit for completely different reasons. In his recent book, van Rijsbergen argues that many of the logical tools developed for quantum systems are also suitable for applications in information retrieval. I explore the mathematical support for this idea on an abstract vector space model, covering several forms of data analysis (information retrieval, data mining, collaborative filtering, formal concept analysis...), and roughly based on an idea from categorical quantum mechanics. It turns out that quantum (i.e., noncommutative) probability distributions arise already in this rudimentary mathematical framework. Moreover, a Bell-type inequality is formula...
Statistical Theory of the Vector Random Decrement Technique
DEFF Research Database (Denmark)
Asmussen, J. C.; Brincker, Rune; Ibrahim, S. R.
1999-01-01
The Vector Random Decrement technique has previously been introduced as an effcient method to transform ambient responses of linear structures into Vector Random Decrement functions which are equivalent to free decays of the current structure. The modal parameters can be extracted from the free d...
Statistical Theory of the Vector Random Decrement Technique
DEFF Research Database (Denmark)
Asmussen, J. C.; Brincker, Rune; Ibrahim, S. R.
1999-01-01
The Vector Random Decrement technique has previously been introduced as an effcient method to transform ambient responses of linear structures into Vector Random Decrement functions which are equivalent to free decays of the current structure. The modal parameters can be extracted from the free d...
Quantitative Techniques in Volumetric Analysis
Zimmerman, John; Jacobsen, Jerrold J.
1996-12-01
Quantitative Techniques in Volumetric Analysis is a visual library of techniques used in making volumetric measurements. This 40-minute VHS videotape is designed as a resource for introducing students to proper volumetric methods and procedures. The entire tape, or relevant segments of the tape, can also be used to review procedures used in subsequent experiments that rely on the traditional art of quantitative analysis laboratory practice. The techniques included are: Quantitative transfer of a solid with a weighing spoon Quantitative transfer of a solid with a finger held weighing bottle Quantitative transfer of a solid with a paper strap held bottle Quantitative transfer of a solid with a spatula Examples of common quantitative weighing errors Quantitative transfer of a solid from dish to beaker to volumetric flask Quantitative transfer of a solid from dish to volumetric flask Volumetric transfer pipet A complete acid-base titration Hand technique variations The conventional view of contemporary quantitative chemical measurement tends to focus on instrumental systems, computers, and robotics. In this view, the analyst is relegated to placing standards and samples on a tray. A robotic arm delivers a sample to the analysis center, while a computer controls the analysis conditions and records the results. In spite of this, it is rare to find an analysis process that does not rely on some aspect of more traditional quantitative analysis techniques, such as careful dilution to the mark of a volumetric flask. Figure 2. Transfer of a solid with a spatula. Clearly, errors in a classical step will affect the quality of the final analysis. Because of this, it is still important for students to master the key elements of the traditional art of quantitative chemical analysis laboratory practice. Some aspects of chemical analysis, like careful rinsing to insure quantitative transfer, are often an automated part of an instrumental process that must be understood by the
Statistical Hot Channel Analysis for the NBSR
Energy Technology Data Exchange (ETDEWEB)
Cuadra A.; Baek J.
2014-05-27
A statistical analysis of thermal limits has been carried out for the research reactor (NBSR) at the National Institute of Standards and Technology (NIST). The objective of this analysis was to update the uncertainties of the hot channel factors with respect to previous analysis for both high-enriched uranium (HEU) and low-enriched uranium (LEU) fuels. Although uncertainties in key parameters which enter into the analysis are not yet known for the LEU core, the current analysis uses reasonable approximations instead of conservative estimates based on HEU values. Cumulative distribution functions (CDFs) were obtained for critical heat flux ratio (CHFR), and onset of flow instability ratio (OFIR). As was done previously, the Sudo-Kaminaga correlation was used for CHF and the Saha-Zuber correlation was used for OFI. Results were obtained for probability levels of 90%, 95%, and 99.9%. As an example of the analysis, the results for both the existing reactor with HEU fuel and the LEU core show that CHFR would have to be above 1.39 to assure with 95% probability that there is no CHF. For the OFIR, the results show that the ratio should be above 1.40 to assure with a 95% probability that OFI is not reached.
MacLean, Adam L; Harrington, Heather A; Stumpf, Michael P H; Byrne, Helen M
2016-01-01
The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine.
MacLean, Adam L.
2015-12-16
The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine.
Statistical energy analysis of similarly coupled systems
Institute of Scientific and Technical Information of China (English)
ZHANG Jian
2002-01-01
Based on the principle of Statistical Energy Analysis (SEA) for non-conservatively coupled dynamical systems under non-correlative or correlative excitations, energy relationship between two similar SEA systems is established in the paper. The energy relationship is verified theoretically and experimentally from two similar SEA systems i.e., the structure of a coupled panel-beam and that of a coupled panel-sideframe, in the cases of conservative coupling and non-conservative coupling respectively. As an application of the method, relationship between noise power radiated from two similar cutting systems is studied. Results show that there are good agreements between the theory and the experiments, and the method is valuable to analysis of dynamical problems associated with a complicated system from that with a simple one.
STATISTICAL ANALYSIS OF PUBLIC ADMINISTRATION PAY
Directory of Open Access Journals (Sweden)
Elena I. Dobrolyubova
2014-01-01
Full Text Available This article reviews the progress achieved inimproving the pay system in public administration and outlines the key issues to be resolved.The cross-country comparisons presented inthe article suggest high differentiation in pay levels depending on position held. In fact,this differentiation in Russia exceeds one in OECD almost twofold The analysis of theinternal pay structure demonstrates that thelow share of the base pay leads to perversenature of ‘stimulation elements’ of the paysystem which in fact appear to be used mostlyfor compensation purposes. The analysis of regional statistical data demonstrates thatdespite high differentiation among regionsin terms of their revenue potential, averagepublic ofﬁcial pay is strongly correlated withthe average regional pay.
Some Bayesian statistical techniques useful in estimating frequency and density
Johnson, D.H.
1977-01-01
This paper presents some elementary applications of Bayesian statistics to problems faced by wildlife biologists. Bayesian confidence limits for frequency of occurrence are shown to be generally superior to classical confidence limits. Population density can be estimated from frequency data if the species is sparsely distributed relative to the size of the sample plot. For other situations, limits are developed based on the normal distribution and prior knowledge that the density is non-negative, which insures that the lower confidence limit is non-negative. Conditions are described under which Bayesian confidence limits are superior to those calculated with classical methods; examples are also given on how prior knowledge of the density can be used to sharpen inferences drawn from a new sample.
Wavelet and statistical analysis for melanoma classification
Nimunkar, Amit; Dhawan, Atam P.; Relue, Patricia A.; Patwardhan, Sachin V.
2002-05-01
The present work focuses on spatial/frequency analysis of epiluminesence images of dysplastic nevus and melanoma. A three-level wavelet decomposition was performed on skin-lesion images to obtain coefficients in the wavelet domain. A total of 34 features were obtained by computing ratios of the mean, variance, energy and entropy of the wavelet coefficients along with the mean and standard deviation of image intensity. An unpaired t-test for a normal distribution based features and the Wilcoxon rank-sum test for non-normal distribution based features were performed for selecting statistically correlated features. For our data set, the statistical analysis of features reduced the feature set from 34 to 5 features. For classification, the discriminant functions were computed in the feature space using the Mahanalobis distance. ROC curves were generated and evaluated for false positive fraction from 0.1 to 0.4. Most of the discrimination functions provided a true positive rate for melanoma of 93% with a false positive rate up to 21%.
The NIRS Analysis Package: noise reduction and statistical inference.
Fekete, Tomer; Rubin, Denis; Carlson, Joshua M; Mujica-Parodi, Lilianne R
2011-01-01
Near infrared spectroscopy (NIRS) is a non-invasive optical imaging technique that can be used to measure cortical hemodynamic responses to specific stimuli or tasks. While analyses of NIRS data are normally adapted from established fMRI techniques, there are nevertheless substantial differences between the two modalities. Here, we investigate the impact of NIRS-specific noise; e.g., systemic (physiological), motion-related artifacts, and serial autocorrelations, upon the validity of statistical inference within the framework of the general linear model. We present a comprehensive framework for noise reduction and statistical inference, which is custom-tailored to the noise characteristics of NIRS. These methods have been implemented in a public domain Matlab toolbox, the NIRS Analysis Package (NAP). Finally, we validate NAP using both simulated and actual data, showing marked improvement in the detection power and reliability of NIRS.
Digital Fourier analysis advanced techniques
Kido, Ken'iti
2015-01-01
This textbook is a thorough, accessible introduction to advanced digital Fourier analysis for advanced undergraduate and graduate students. Assuming knowledge of the Fast Fourier Transform, this book covers advanced topics including the Hilbert transform, cepstrum analysis, and the two-dimensional Fourier transform. Saturated with clear, coherent illustrations, "Digital Fourier Analysis - Advanced Techniques" includes practice problems and thorough Appendices. As a central feature, the book includes interactive applets (available online) that mirror the illustrations. These user-friendly applets animate concepts interactively, allowing the user to experiment with the underlying mathematics. The applet source code in Visual Basic is provided online, enabling advanced students to tweak and change the programs for more sophisticated results. A complete, intuitive guide, "Digital Fourier Analysis - Advanced Techniques" is an essential reference for students in science and engineering.
Statistical methods for the detection and analysis of radioactive sources
Klumpp, John
We consider four topics from areas of radioactive statistical analysis in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
Triangulation of Data Analysis Techniques
Directory of Open Access Journals (Sweden)
Lauri, M
2011-10-01
Full Text Available In psychology, as in other disciplines, the concepts of validity and reliability are considered essential to give an accurate interpretation of results. While in quantitative research the idea is well established, in qualitative research, validity and reliability take on a different dimension. Researchers like Miles and Huberman (1994 and Silverman (2000, 2001, have shown how these issues are addressed in qualitative research. In this paper I am proposing that the same corpus of data, in this case the transcripts of focus group discussions, can be analysed using more than one data analysis technique. I refer to this idea as ‘triangulation of data analysis techniques’ and argue that such triangulation increases the reliability of the results. If the results obtained through a particular data analysis technique, for example thematic analysis, are congruent with the results obtained by analysing the same transcripts using a different technique, for example correspondence analysis, it is reasonable to argue that the analysis and interpretation of the data is valid.
A textbook of computer based numerical and statistical techniques
Jaiswal, AK
2009-01-01
About the Book: Application of Numerical Analysis has become an integral part of the life of all the modern engineers and scientists. The contents of this book covers both the introductory topics and the more advanced topics such as partial differential equations. This book is different from many other books in a number of ways. Salient Features: Mathematical derivation of each method is given to build the students understanding of numerical analysis. A variety of solved examples are given. Computer programs for almost all numerical methods discussed have been presented in `C` langu
Statistical analysis of tourism destination competitiveness
Directory of Open Access Journals (Sweden)
Attilio Gardini
2013-05-01
Full Text Available The growing relevance of tourism industry for modern advanced economies has increased the interest among researchers and policy makers in the statistical analysis of destination competitiveness. In this paper we outline a new model of destination competitiveness based on sound theoretical grounds and we develop a statistical test of the model on sample data based on Italian tourist destination decisions and choices. Our model focuses on the tourism decision process which starts from the demand schedule for holidays and ends with the choice of a specific holiday destination. The demand schedule is a function of individual preferences and of destination positioning, while the final decision is a function of the initial demand schedule and the information concerning services for accommodation and recreation in the selected destinations. Moreover, we extend previous studies that focused on image or attributes (such as climate and scenery by paying more attention to the services for accommodation and recreation in the holiday destinations. We test the proposed model using empirical data collected from a sample of 1.200 Italian tourists interviewed in 2007 (October - December. Data analysis shows that the selection probability for the destination included in the consideration set is not proportional to the share of inclusion because the share of inclusion is determined by the brand image, while the selection of the effective holiday destination is influenced by the real supply conditions. The analysis of Italian tourists preferences underline the existence of a latent demand for foreign holidays which points out a risk of market share reduction for Italian tourism system in the global market. We also find a snow ball effect which helps the most popular destinations, mainly in the northern Italian regions.
Managing Performance Analysis with Dynamic Statistical Projection Pursuit
Energy Technology Data Exchange (ETDEWEB)
Vetter, J.S.; Reed, D.A.
2000-05-22
Computer systems and applications are growing more complex. Consequently, performance analysis has become more difficult due to the complex, transient interrelationships among runtime components. To diagnose these types of performance issues, developers must use detailed instrumentation to capture a large number of performance metrics. Unfortunately, this instrumentation may actually influence the performance analysis, leading the developer to an ambiguous conclusion. In this paper, we introduce a technique for focusing a performance analysis on interesting performance metrics. This technique, called dynamic statistical projection pursuit, identifies interesting performance metrics that the monitoring system should capture across some number of processors. By reducing the number of performance metrics, projection pursuit can limit the impact of instrumentation on the performance of the target system and can reduce the volume of performance data.
Kassomenos, P.; Vardoulakis, S.; Borge, R.; Lumbreras, J.; Papaloukas, C.; Karakitsios, S.
2010-10-01
In this study, we used and compared three different statistical clustering methods: an hierarchical, a non-hierarchical (K-means) and an artificial neural network technique (self-organizing maps (SOM)). These classification methods were applied to a 4-year dataset of 5 days kinematic back trajectories of air masses arriving in Athens, Greece at 12.00 UTC, in three different heights, above the ground. The atmospheric back trajectories were simulated with the HYSPLIT Vesion 4.7 model of National Oceanic and Atmospheric Administration (NOAA). The meteorological data used for the computation of trajectories were obtained from NOAA reanalysis database. A comparison of the three statistical clustering methods through statistical indices was attempted. It was found that all three statistical methods seem to depend to the arrival height of the trajectories, but the degree of dependence differs substantially. Hierarchical clustering showed the highest level of dependence for fast-moving trajectories to the arrival height, followed by SOM. K-means was found to be the least depended clustering technique on the arrival height. The air quality management applications of these results in relation to PM10 concentrations recorded in Athens, Greece, were also discussed. Differences of PM10 concentrations, during certain clusters, were found statistically different (at 95% confidence level) indicating that these clusters appear to be associated with long-range transportation of particulates. This study can improve the interpretation of modelled atmospheric trajectories, leading to a more reliable analysis of synoptic weather circulation patterns and their impacts on urban air quality.
Statistical analysis of sleep spindle occurrences.
Directory of Open Access Journals (Sweden)
Dagmara Panas
Full Text Available Spindles - a hallmark of stage II sleep - are a transient oscillatory phenomenon in the EEG believed to reflect thalamocortical activity contributing to unresponsiveness during sleep. Currently spindles are often classified into two classes: fast spindles, with a frequency of around 14 Hz, occurring in the centro-parietal region; and slow spindles, with a frequency of around 12 Hz, prevalent in the frontal region. Here we aim to establish whether the spindle generation process also exhibits spatial heterogeneity. Electroencephalographic recordings from 20 subjects were automatically scanned to detect spindles and the time occurrences of spindles were used for statistical analysis. Gamma distribution parameters were fit to each inter-spindle interval distribution, and a modified Wald-Wolfowitz lag-1 correlation test was applied. Results indicate that not all spindles are generated by the same statistical process, but this dissociation is not spindle-type specific. Although this dissociation is not topographically specific, a single generator for all spindle types appears unlikely.
Design of UWB pulse radio transceiver using statistical correlation technique in frequency domain
Directory of Open Access Journals (Sweden)
M. Anis
2007-06-01
Full Text Available In this paper, we propose a new technique to extract low power UWB pulse radio signals, near to noise level, using statistical correlation technique in frequency domain. The receiver consists of many narrow bandpass filters which extract energy either from transmitted UWB signal, interfering channels or noise. Transmitted UWB data can be eliminated by statistical correlation of multiple bandpass filter outputs. Super-regenerative oscillators, tuned within UWB spectrum, are designed as bandpass filters. Summers and comparators perform statistical correlation.
Radar Derived Spatial Statistics of Summer Rain. Volume 2; Data Reduction and Analysis
Konrad, T. G.; Kropfli, R. A.
1975-01-01
Data reduction and analysis procedures are discussed along with the physical and statistical descriptors used. The statistical modeling techniques are outlined and examples of the derived statistical characterization of rain cells in terms of the several physical descriptors are presented. Recommendations concerning analyses which can be pursued using the data base collected during the experiment are included.
Statistical and Managerial Techniques for Six Sigma Methodology Theory and Application
Barone, Stefano
2012-01-01
Statistical and Managerial Techniques for Six Sigma Methodology examines the methodology through illustrating the most widespread tool and techniques involved in Six Sigma application. Both managerial and statistical aspects of Six Sigma will be analyzed, allowing the reader to apply these tools in the field. This book offers an insight on variation and risk management, and focuses on the structure and organizational aspects of the Six Sigma projects. It covers six sigma methodology, basic managerial techniques, basic statistical techniques, methods for variation and risk management and advanc
Statistical Analysis of Bus Networks in India
Chatterjee, Atanu; Ramadurai, Gitakrishnan
2015-01-01
Through the past decade the field of network science has established itself as a common ground for the cross-fertilization of exciting inter-disciplinary studies which has motivated researchers to model almost every physical system as an interacting network consisting of nodes and links. Although public transport networks such as airline and railway networks have been extensively studied, the status of bus networks still remains in obscurity. In developing countries like India, where bus networks play an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer some of the basic questions on its evolution, growth, robustness and resiliency. In this paper, we model the bus networks of major Indian cities as graphs in \\textit{L}-space, and evaluate their various statistical properties using concepts from network science. Our analysis reveals a wide spectrum of network topology with the common underlying feature of small-world property. We observe tha...
On two methods of statistical image analysis
Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, KL
1999-01-01
The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition, smooth
Application of Integration of Spatial Statistical Analysis with GIS to Regional Economic Analysis
Institute of Scientific and Technical Information of China (English)
CHEN Fei; DU Daosheng
2004-01-01
This paper summarizes a few spatial statistical analysis methods for to measuring spatial autocorrelation and spatial association, discusses the criteria for the identification of spatial association by the use of global Moran Coefficient, Local Moran and Local Geary. Furthermore, a user-friendly statistical module, combining spatial statistical analysis methods with GIS visual techniques, is developed in Arcview using Avenue. An example is also given to show the usefulness of this module in identifying and quantifying the underlying spatial association patterns between economic units.
Statistics Analysis Measures Painting of Cooling Tower
Directory of Open Access Journals (Sweden)
A. Zacharopoulou
2013-01-01
Full Text Available This study refers to the cooling tower of Megalopolis (construction 1975 and protection from corrosive environment. The maintenance of the cooling tower took place in 2008. The cooling tower was badly damaged from corrosion of reinforcement. The parabolic cooling towers (factory of electrical power are a typical example of construction, which has a special aggressive environment. The protection of cooling towers is usually achieved through organic coatings. Because of the different environmental impacts on the internal and external side of the cooling tower, a different system of paint application is required. The present study refers to the damages caused by corrosion process. The corrosive environments, the application of this painting, the quality control process, the measures and statistics analysis, and the results were discussed in this study. In the process of quality control the following measurements were taken into consideration: (1 examination of the adhesion with the cross-cut test, (2 examination of the film thickness, and (3 controlling of the pull-off resistance for concrete substrates and paintings. Finally, this study refers to the correlations of measurements, analysis of failures in relation to the quality of repair, and rehabilitation of the cooling tower. Also this study made a first attempt to apply the specific corrosion inhibitors in such a large structure.
An R package for statistical provenance analysis
Vermeesch, Pieter; Resentini, Alberto; Garzanti, Eduardo
2016-05-01
This paper introduces provenance, a software package within the statistical programming environment R, which aims to facilitate the visualisation and interpretation of large amounts of sedimentary provenance data, including mineralogical, petrographic, chemical and isotopic provenance proxies, or any combination of these. provenance comprises functions to: (a) calculate the sample size required to achieve a given detection limit; (b) plot distributional data such as detrital zircon U-Pb age spectra as Cumulative Age Distributions (CADs) or adaptive Kernel Density Estimates (KDEs); (c) plot compositional data as pie charts or ternary diagrams; (d) correct the effects of hydraulic sorting on sandstone petrography and heavy mineral composition; (e) assess the settling equivalence of detrital minerals and grain-size dependence of sediment composition; (f) quantify the dissimilarity between distributional data using the Kolmogorov-Smirnov and Sircombe-Hazelton distances, or between compositional data using the Aitchison and Bray-Curtis distances; (e) interpret multi-sample datasets by means of (classical and nonmetric) Multidimensional Scaling (MDS) and Principal Component Analysis (PCA); and (f) simplify the interpretation of multi-method datasets by means of Generalised Procrustes Analysis (GPA) and 3-way MDS. All these tools can be accessed through an intuitive query-based user interface, which does not require knowledge of the R programming language. provenance is free software released under the GPL-2 licence and will be further expanded based on user feedback.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs, mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Statistical Analysis of Designed Experiments Theory and Applications
Tamhane, Ajit C
2012-01-01
A indispensable guide to understanding and designing modern experiments The tools and techniques of Design of Experiments (DOE) allow researchers to successfully collect, analyze, and interpret data across a wide array of disciplines. Statistical Analysis of Designed Experiments provides a modern and balanced treatment of DOE methodology with thorough coverage of the underlying theory and standard designs of experiments, guiding the reader through applications to research in various fields such as engineering, medicine, business, and the social sciences. The book supplies a foundation for the
Statistical energy analysis of complex structures, phase 2
Trudell, R. W.; Yano, L. I.
1980-01-01
A method for estimating the structural vibration properties of complex systems in high frequency environments was investigated. The structure analyzed was the Materials Experiment Assembly, (MEA), which is a portion of the OST-2A payload for the space transportation system. Statistical energy analysis (SEA) techniques were used to model the structure and predict the structural element response to acoustic excitation. A comparison of the intial response predictions and measured acoustic test data is presented. The conclusions indicate that: the SEA predicted the response of primary structure to acoustic excitation over a wide range of frequencies; and the contribution of mechanically induced random vibration to the total MEA is not significant.
Multi-scale statistical analysis of coronal solar activity
Gamborino, Diana; del-Castillo-Negrete, Diego; Martinell, Julio J.
2016-07-01
Multi-filter images from the solar corona are used to obtain temperature maps that are analyzed using techniques based on proper orthogonal decomposition (POD) in order to extract dynamical and structural information at various scales. Exploring active regions before and after a solar flare and comparing them with quiet regions, we show that the multi-scale behavior presents distinct statistical properties for each case that can be used to characterize the level of activity in a region. Information about the nature of heat transport is also to be extracted from the analysis.
Abd-El-Fattah, Sabry M.
2005-01-01
A Partial Least Squares Path Analysis technique was used to test the effect of students' prior experience with computers, statistical self-efficacy, and computer anxiety on their achievement in an introductory statistics course. Computer Anxiety Rating Scale and Current Statistics Self-Efficacy Scale were administered to a sample of 64 first-year…
Research on the integrative strategy of spatial statistical analysis of GIS
Xie, Zhong; Han, Qi Juan; Wu, Liang
2008-12-01
Presently, the spacial social and natural phenomenon is studied by both the GIS technique and statistics methods. However, plenty of complex practical applications restrict these research methods. The data models and technologies exploited are full of special localization. This paper firstly sums up the requirement of spacial statistical analysis. On the base of the requirement, the universal spatial statistical models are transformed into the function tools in statistical GIS system. A pyramidal structure of three layers is brought forward. Therefore, it is feasible to combine the techniques of spacial dada management, searches and visualization in GIS with the methods of processing data in the statistic analysis. It will form an integrative statistical GIS environment with the management, analysis, application and assistant decision-making of spacial statistical information.
Statistical Analysis of Bus Networks in India
2016-01-01
In this paper, we model the bus networks of six major Indian cities as graphs in L-space, and evaluate their various statistical properties. While airline and railway networks have been extensively studied, a comprehensive study on the structure and growth of bus networks is lacking. In India, where bus transport plays an important role in day-to-day commutation, it is of significant interest to analyze its topological structure and answer basic questions on its evolution, growth, robustness and resiliency. Although the common feature of small-world property is observed, our analysis reveals a wide spectrum of network topologies arising due to significant variation in the degree-distribution patterns in the networks. We also observe that these networks although, robust and resilient to random attacks are particularly degree-sensitive. Unlike real-world networks, such as Internet, WWW and airline, that are virtual, bus networks are physically constrained. Our findings therefore, throw light on the evolution of such geographically and constrained networks that will help us in designing more efficient bus networks in the future. PMID:27992590
A statistical trend analysis of ozonesonde data
Tiao, G. C.; Pedrick, J. H.; Allenby, G. M.; Reinsel, G. C.; Mateer, C. L.
1986-01-01
A detailed statistical analysis of monthly averages of ozonesond readings is performed to assess trends in ozone in the troposphere and the lower to midstratosphere. Regression time series models, which include seasonal and trend factors, are estimated for 13 stations located mainly in the midlatitudes of the Northern Hemisphere. At each station, trend estimates are calculated for 14 'fractional' Umkehr layers covering the altitude range from 0 to 33 km. For the 1970-1982 period, the main findings indicate an overall negative trend in ozonesonde data in the lower stratosphere (15-21 km) of about -0.5 percent per year, and some evidence of a positive trend in the troposphere (0-5 km) of about 0.8 percent per year. An in-depth sensitivity study of the trend estimates is performed with respect to various correction procedures used to normalize ozonesonde readings to Dobson total ozone measurements. The main results indicate that the negative trend findings in the 15- to 21-km altitude region are robust to the normalization procedures considered.
Energy Technology Data Exchange (ETDEWEB)
Almazan T, M. G.; Jimenez R, M.; Monroy G, F.; Tenorio, D. [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico); Rodriguez G, N. L. [Instituto Mexiquense de Cultura, Subdireccion de Restauracion y Conservacion, Hidalgo poniente No. 1013, 50080 Toluca, Estado de Mexico (Mexico)
2009-07-01
The elementary composition of archaeological ceramic fragments obtained during the explorations in San Miguel Ixtapan, Mexico State, was determined by the neutron activation analysis technique. The samples irradiation was realized in the research reactor TRIGA Mark III with a neutrons flow of 1centre dot10{sup 13}ncentre dotcm{sup -2}centre dots{sup -1}. The irradiation time was of 2 hours. Previous to the acquisition of the gamma rays spectrum the samples were allowed to decay from 12 to 14 days. The analyzed elements were: Nd, Ce, Lu, Eu, Yb, Pa(Th), Tb, La, Cr, Hf, Sc, Co, Fe, Cs, Rb. The statistical treatment of the data, consistent in the group analysis and the main components analysis allowed to identify three different origins of the archaeological ceramic, designated as: local, foreign and regional. (Author)
Web-Based Statistical Sampling and Analysis
Quinn, Anne; Larson, Karen
2016-01-01
Consistent with the Common Core State Standards for Mathematics (CCSSI 2010), the authors write that they have asked students to do statistics projects with real data. To obtain real data, their students use the free Web-based app, Census at School, created by the American Statistical Association (ASA) to help promote civic awareness among school…
Developments in statistical analysis in quantitative genetics
DEFF Research Database (Denmark)
Sorensen, Daniel
2009-01-01
A remarkable research impetus has taken place in statistical genetics since the last World Conference. This has been stimulated by breakthroughs in molecular genetics, automated data-recording devices and computer-intensive statistical methods. The latter were revolutionized by the bootstrap and ...
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs...... has much to offer in analyzing the policy process....
Advanced Techniques of Stress Analysis
Directory of Open Access Journals (Sweden)
Simion TATARU
2013-12-01
Full Text Available This article aims to check the stress analysis technique based on 3D models also making a comparison with the traditional technique which utilizes a model built directly into the stress analysis program. This comparison of the two methods will be made with reference to the rear fuselage of IAR-99 aircraft, structure with a high degree of complexity which allows a meaningful evaluation of both approaches. Three updated databases are envisaged: the database having the idealized model obtained using ANSYS and working directly on documentation, without automatic generation of nodes and elements (with few exceptions, the rear fuselage database (performed at this stage obtained with Pro/ ENGINEER and the one obtained by using ANSYS with the second database. Then, each of the three databases will be used according to arising necessities.The main objective is to develop the parameterized model of the rear fuselage using the computer aided design software Pro/ ENGINEER. A review of research regarding the use of virtual reality with the interactive analysis performed by the finite element method is made to show the state- of- the-art achieved in this field.
Model building techniques for analysis.
Energy Technology Data Exchange (ETDEWEB)
Walther, Howard P.; McDaniel, Karen Lynn; Keener, Donald; Cordova, Theresa Elena; Henry, Ronald C.; Brooks, Sean; Martin, Wilbur D.
2009-09-01
The practice of mechanical engineering for product development has evolved into a complex activity that requires a team of specialists for success. Sandia National Laboratories (SNL) has product engineers, mechanical designers, design engineers, manufacturing engineers, mechanical analysts and experimentalists, qualification engineers, and others that contribute through product realization teams to develop new mechanical hardware. The goal of SNL's Design Group is to change product development by enabling design teams to collaborate within a virtual model-based environment whereby analysis is used to guide design decisions. Computer-aided design (CAD) models using PTC's Pro/ENGINEER software tools are heavily relied upon in the product definition stage of parts and assemblies at SNL. The three-dimensional CAD solid model acts as the design solid model that is filled with all of the detailed design definition needed to manufacture the parts. Analysis is an important part of the product development process. The CAD design solid model (DSM) is the foundation for the creation of the analysis solid model (ASM). Creating an ASM from the DSM currently is a time-consuming effort; the turnaround time for results of a design needs to be decreased to have an impact on the overall product development. This effort can be decreased immensely through simple Pro/ENGINEER modeling techniques that summarize to the method features are created in a part model. This document contains recommended modeling techniques that increase the efficiency of the creation of the ASM from the DSM.
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
Techniques for Automated Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Marcus, Ryan C. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2014-09-02
The performance of a particular HPC code depends on a multitude of variables, including compiler selection, optimization flags, OpenMP pool size, file system load, memory usage, MPI configuration, etc. As a result of this complexity, current predictive models have limited applicability, especially at scale. We present a formulation of scientific codes, nodes, and clusters that reduces complex performance analysis to well-known mathematical techniques. Building accurate predictive models and enhancing our understanding of scientific codes at scale is an important step towards exascale computing.
Techniques for Automated Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Marcus, Ryan C. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2014-09-02
The performance of a particular HPC code depends on a multitude of variables, including compiler selection, optimization flags, OpenMP pool size, file system load, memory usage, MPI configuration, etc. As a result of this complexity, current predictive models have limited applicability, especially at scale. We present a formulation of scientific codes, nodes, and clusters that reduces complex performance analysis to well-known mathematical techniques. Building accurate predictive models and enhancing our understanding of scientific codes at scale is an important step towards exascale computing.
Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.
Barrett, Julie A; Siegel, Jay A; Goodpaster, John V
2011-01-01
This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.
Statistical Scalability Analysis of Communication Operations in Distributed Applications
Energy Technology Data Exchange (ETDEWEB)
Vetter, J S; McCracken, M O
2001-02-27
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for individual communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
Towards a Statistical Methodology to Evaluate Program Speedups and their Optimisation Techniques
Touati, Sid
2009-01-01
The community of program optimisation and analysis, code performance evaluation, parallelisation and optimising compilation has published since many decades hundreds of research and engineering articles in major conferences and journals. These articles study efficient algorithms, strategies and techniques to accelerate programs execution times, or optimise other performance metrics (MIPS, code size, energy/power, MFLOPS, etc.). Many speedups are published, but nobody is able to reproduce them exactly. The non-reproducibility of our research results is a dark point of the art, and we cannot be qualified as {\\it computer scientists} if we do not provide rigorous experimental methodology. This article provides a first effort towards a correct statistical protocol for analysing and measuring speedups. As we will see, some common mistakes are done by the community inside published articles, explaining part of the non-reproducibility of the results. Our current article is not sufficient by its own to deliver a comp...
Marzahn, Philip; Rieke-Zapp, Dirk; Ludwig, Ralf
2010-05-01
Micro scale soil surface roughness is a crucial parameter in many environmental applications. Recent soil erosion studies have shown the impact of micro topography on soil erosion rates as well as overland flow generation due to soil crusting effects. Besides the above mentioned, it is widely recognized that the backscattered signal in SAR remote sensing is strongly influenced by soil surface roughness and by regular higher order tillage patterns. However, there is an ambiguity in the appropriate measurement technique and scale for roughness studies and SAR backscatter model parametrization. While different roughness indices depend on their measurement length, no satisfying roughness parametrization and measurement technique has been found yet, introducing large uncertainty in the interpretation of the radar backscatter. In the presented study, we computed high resolution digital elevation models (DEM) using a consumer grade digital camera in the frame of photogrammetric imaging techniques to represent soil micro topography from different soil surfaces (ploughed, harrowed, seedbed and crusted) . The retrieved DEMs showed sufficient accuracy, with an RMSE of a 1.64 mm compared to high accurate reference points,. For roughness characterization, we calculated different roughness indices (RMS height (s), autocorrelation length (l), tortuosity index (TB)). In an extensive statistical investigation we show the behaviour of the roughness indices for different acquisition sizes. Compared to results from profile measurements taken from literature and profiles generated out of the dataset, results indicate,that by using a three dimensional measuring device, the calculated roughness indices are more robust against outliers and even saturate faster with increasing acquisition size. Dependent on the roughness condition, the calculated values for the RMS-height saturate for ploughed fields at 2.3 m, for harrowed fields at 2.0 m and for crusted fields at 1.2 m. Results also
McCray, Wilmon Wil L., Jr.
The research was prompted by a need to conduct a study that assesses process improvement, quality management and analytical techniques taught to students in U.S. colleges and universities undergraduate and graduate systems engineering and the computing science discipline (e.g., software engineering, computer science, and information technology) degree programs during their academic training that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle processes needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process-performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI RTM) Models as process improvement frameworks to improve business processes performance. High maturity process areas in the CMMI model imply the use of analytical, statistical, quantitative management techniques, and process performance modeling to identify and eliminate sources of variation, continually improve process-performance; reduce cost and predict future outcomes. The research study identifies and provides a detail discussion of the gap analysis findings of process improvement and quantitative analysis techniques taught in U.S. universities systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis which identifies the gaps that exist between the SEI's "healthy ingredients " of a process performance model and courses taught in U.S. universities degree program. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization
Institute of Scientific and Technical Information of China (English)
Pan Jiayi; Chin-Pang Jack Cheng; Gloria T. Lau; Kincho H. Law
2008-01-01
The objective of this paper is to introduce three semi-automated approaches for ontology mapping using relatedness analysis techniques. In the architecture, engineering, and construction (AEC) industry, there exist a number of ontological standards to describe the semantics of building models. Although the standards share similar scopes of interest, the task of comparing and mapping concepts among standards is challenging due to their differences in terminologies and perspectives. Ontology mapping is therefore necessary to achieve information interoperability, which allows two or more information sources to exchange data and to re-use the data for further purposes. The attribute-based approach, corpus-based approach, and name-based approach presented in this paper adopt the statistical relatedness analysis techniques to discover related concepts from heterogeneous ontologies. A pilot study is conducted on IFC and CIS/2 ontologies to evaluate the approaches. Preliminary results show that the attribute-based approach outperforms the other two approaches in terms of precision and F-measure.
Processes and subdivisions in diogenites, a multivariate statistical analysis
Harriott, T. A.; Hewins, R. H.
1984-01-01
Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.
Statistical learning analysis in neuroscience: aiming for transparency
Directory of Open Access Journals (Sweden)
Michael Hanke
2010-05-01
Full Text Available Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires ``neuroscience-aware'' technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here we review its features and applicability to various neural data modalities.
Statistical learning analysis in neuroscience: aiming for transparency.
Hanke, Michael; Halchenko, Yaroslav O; Haxby, James V; Pollmann, Stefan
2010-01-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires "neuroscience-aware" technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities.
Statistical Analysis Of Data Sets Legislative Type
Directory of Open Access Journals (Sweden)
Gheorghe Săvoiu
2013-06-01
Full Text Available This paper identifies some characteristic statistical aspects of the annual legislation’s dynamics and structure in the socio-economic system that had defined Romania, over the last two decades. After a brief introduction devoted to the concepts of social and economic system (SES and societal computerized management (SCM in Romania, first section describes the indicators, the specific database and the investigative method and a second section presents some descriptive statistics on the suggestive abnormality of the data series on the legislation of the last 20 years. A final remark underlines the difficult context of Romania’s legislative adjustment to EU requirements.
Design, data analysis and sampling techniques for clinical research.
Suresh, Karthik; Thomas, Sanjeev V; Suresh, Geetha
2011-10-01
Statistical analysis is an essential technique that enables a medical research practitioner to draw meaningful inference from their data analysis. Improper application of study design and data analysis may render insufficient and improper results and conclusion. Converting a medical problem into a statistical hypothesis with appropriate methodological and logical design and then back-translating the statistical results into relevant medical knowledge is a real challenge. This article explains various sampling methods that can be appropriately used in medical research with different scenarios and challenges.
How well do test case prioritization techniques support statistical fault localization
Tse, TH; Jiang, B.; Zhang, Z; Chen, TY
2009-01-01
In continuous integration, a tight integration of test case prioritization techniques and fault-localization techniques may both expose failures faster and locate faults more effectively. Statistical fault-localization techniques use the execution information collected during testing to locate faults. Executing a small fraction of a prioritized test suite reduces the cost of testing, and yet the subsequent fault localization may suffer. This paper presents the first empirical study to examine...
Visualization methods for statistical analysis of microarray clusters
Directory of Open Access Journals (Sweden)
Li Kai
2005-05-01
Full Text Available Abstract Background The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gold-standard. Appropriate data visualization tools can aid this analysis process, but existing visualization methods do not specifically address this issue. Results We present several visualization techniques that incorporate meaningful statistics that are noise-robust for the purpose of analyzing the results of clustering algorithms on microarray data. This includes a rank-based visualization method that is more robust to noise, a difference display method to aid assessments of cluster quality and detection of outliers, and a projection of high dimensional data into a three dimensional space in order to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices. This methodology is implemented in GeneVAnD (Genomic Visual ANalysis of Datasets and is available at http://function.princeton.edu/GeneVAnD. Conclusion Incorporating relevant statistical information into data visualizations is key for analysis of large biological datasets, particularly because of high levels of noise and the lack of a gold-standard for comparisons. We developed several new visualization techniques and demonstrated their effectiveness for evaluating cluster quality and relationships between clusters.
Pestman, Wiebe R
2009-01-01
This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.
Statistical analysis of protein kinase specificity determinants
DEFF Research Database (Denmark)
Kreegipuu, Andres; Blom, Nikolaj; Brunak, Søren;
1998-01-01
The site and sequence specificity of protein kinase, as well as the role of the secondary structure and surface accessibility of the phosphorylation sites on substrate proteins, was statistically analyzed. The experimental data were collected from the literature and are available on the World Wide...
Statistical Analysis in Dental Research Papers.
1983-08-08
Clinical Trials of Agents used in the Prevention and Treatment of Periodontal Diseases (5). Standards sumary data must be included statistical methods...situations under clinical conditions. Research: qualitative research, as in joint tomography , where the results and conclusions are not amenable to
Statistical analysis of medical data using SAS
Der, Geoff
2005-01-01
An Introduction to SASDescribing and Summarizing DataBasic InferenceScatterplots Correlation: Simple Regression and SmoothingAnalysis of Variance and CovarianceMultiple RegressionLogistic RegressionThe Generalized Linear ModelGeneralized Additive ModelsNonlinear Regression ModelsThe Analysis of Longitudinal Data IThe Analysis of Longitudinal Data II: Models for Normal Response VariablesThe Analysis of Longitudinal Data III: Non-Normal ResponseSurvival AnalysisAnalysis Multivariate Date: Principal Components and Cluster AnalysisReferences
Directory of Open Access Journals (Sweden)
Priya Ranganathan
2015-01-01
Full Text Available In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ′P′ value, explain the importance of ′confidence intervals′ and clarify the importance of including both values in a paper
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper PMID:25878958
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
Fundamentals of statistical experimental design and analysis
Easterling, Robert G
2015-01-01
Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and t...
Common misconceptions about data analysis and statistics.
Motulsky, Harvey J
2015-02-01
Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In fact, the reproducibility of a large percentage of published findings has been questioned. Undoubtedly, there are many reasons for this, but one reason may be that investigators fool themselves due to a poor understanding of statistical concepts. In particular, investigators often make these mistakes: (1) P-Hacking. This is when you reanalyze a data set in many different ways, or perhaps reanalyze with additional replicates, until you get the result you want. (2) Overemphasis on P values rather than on the actual size of the observed effect. (3) Overuse of statistical hypothesis testing, and being seduced by the word "significant". (4) Overreliance on standard errors, which are often misunderstood.
Critical analysis of adsorption data statistically
Kaushal, Achla; Singh, S. K.
2016-09-01
Experimental data can be presented, computed, and critically analysed in a different way using statistics. A variety of statistical tests are used to make decisions about the significance and validity of the experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data was analysed statistically by hypothesis testing applying t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of experiment and (c) study the effect of adsorbent dose in zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ 2 showed the results in favour of the data collected from the experiment and this has been shown on probability charts. K value for Langmuir isotherm was 0.8582 and m value for Freundlich adsorption isotherm obtained was 0.725, both are Pearson's correlation coefficient values for Langmuir and Freundlich adsorption isotherms were obtained as 0.99 and 0.95 respectively, which show higher degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
Shiavi, Richard
2007-01-01
Introduction to Applied Statistical Signal Analysis is designed for the experienced individual with a basic background in mathematics, science, and computer. With this predisposed knowledge, the reader will coast through the practical introduction and move on to signal analysis techniques, commonly used in a broad range of engineering areas such as biomedical engineering, communications, geophysics, and speech.Introduction to Applied Statistical Signal Analysis intertwines theory and implementation with practical examples and exercises. Topics presented in detail include: mathematical
Bliefernicht, Jan; Laux, Patrick; Siegmund, Jonatan; Kunstmann, Harald
2013-04-01
The development and application of statistical techniques with a special focus on a recalibration of meteorological or hydrological forecasts to eliminate the bias between forecasts and observations has received a great deal of attention in recent years. One reason is that retrospective forecasts are nowadays available which allows for a proper training and validation of this kind of techniques. The objective of this presentation is to propose several statistical techniques with different degree of complexity and to evaluate and compare their performance for a recalibration of seasonal ensemble forecasts of monthly precipitation. The techniques selected in this study range from straightforward normal score and quantile-quantile transformation, local scaling, to more sophisticated and novel statistical techniques such as Copula-based methodology recently proposed by Laux et al. (2011). The seasonal forecasts are derived from the Climate Forecast System Version 2. This version is the current coupled ocean-atmosphere general circulation model of the U.S. National Centers for Environmental Prediction used to provide forecasts up to nine months. The CFS precipitation forecasts are compared to monthly precipitation observations from the Global Precipitation Climatology Centre. The statistical techniques are tested for semi-arid regions in West Africa and the Indian subcontinent focusing on large-scale river basins such as the Ganges and the Volta basin. In both regions seasonal precipitation forecasts are a crucial source of information for the prediction of hydro-meteorological extremes, in particular for droughts. The evaluation is done using retrospective CFS ensemble forecast from 1982 to 2009. The training of the statistical techniques is done in a cross-validation mode. The outcome of this investigation illustrates large systematic differences between forecasts and observations, in particular for the Volta basin in West Africa. The selection of straightforward
Goetz, J. N.; Brenning, A.; Petschko, H.; Leopold, P.
2015-08-01
Statistical and now machine learning prediction methods have been gaining popularity in the field of landslide susceptibility modeling. Particularly, these data driven approaches show promise when tackling the challenge of mapping landslide prone areas for large regions, which may not have sufficient geotechnical data to conduct physically-based methods. Currently, there is no best method for empirical susceptibility modeling. Therefore, this study presents a comparison of traditional statistical and novel machine learning models applied for regional scale landslide susceptibility modeling. These methods were evaluated by spatial k-fold cross-validation estimation of the predictive performance, assessment of variable importance for gaining insights into model behavior and by the appearance of the prediction (i.e. susceptibility) map. The modeling techniques applied were logistic regression (GLM), generalized additive models (GAM), weights of evidence (WOE), the support vector machine (SVM), random forest classification (RF), and bootstrap aggregated classification trees (bundling) with penalized discriminant analysis (BPLDA). These modeling methods were tested for three areas in the province of Lower Austria, Austria. The areas are characterized by different geological and morphological settings. Random forest and bundling classification techniques had the overall best predictive performances. However, the performances of all modeling techniques were for the majority not significantly different from each other; depending on the areas of interest, the overall median estimated area under the receiver operating characteristic curve (AUROC) differences ranged from 2.9 to 8.9 percentage points. The overall median estimated true positive rate (TPR) measured at a 10% false positive rate (FPR) differences ranged from 11 to 15pp. The relative importance of each predictor was generally different between the modeling methods. However, slope angle, surface roughness and plan
Koltick, David; Wang, Haoyu; Liu, Shih-Chieh; Heim, Jordan; Nistor, Jonathan
2016-03-01
Typical nuclear decay constants are measured at the accuracy level of 10-2. There are numerous reasons: tests of unconventional theories, dating of materials, and long term inventory evolution which require decay constants accuracy at a level of 10-4 to 10-5. The statistical and systematic errors associated with precision measurements of decays using the counting technique are presented. Precision requires high count rates, which introduces time dependent dead time and pile-up corrections. An approach to overcome these issues is presented by continuous recording of the detector current. Other systematic corrections include, the time dependent dead time due to background radiation, control of target motion and radiation flight path variation due to environmental conditions, and the time dependent effects caused by scattered events are presented. The incorporation of blind experimental techniques can help make measurement independent of past results. A spectrometer design and data analysis is reviewed that can accomplish these goals. The author would like to thank TechSource, Inc. and Advanced Physics Technologies, LLC. for their support in this work.
Multivariate Statistical Analysis Applied in Wine Quality Evaluation
Directory of Open Access Journals (Sweden)
Jieling Zou
2015-08-01
Full Text Available This study applies multivariate statistical approaches to wine quality evaluation. With 27 red wine samples, four factors were identified out of 12 parameters by principal component analysis, explaining 89.06% of the total variance of data. As iterative weights calculated by the BP neural network revealed little difference from weights determined by information entropy method, the latter was chosen to measure the importance of indicators. Weighted cluster analysis performs well in classifying the sample group further into two sub-clusters. The second cluster of red wine samples, compared with its first, was lighter in color, tasted thinner and had fainter bouquet. Weighted TOPSIS method was used to evaluate the quality of wine in each sub-cluster. With scores obtained, each sub-cluster was divided into three grades. On the whole, the quality of lighter red wine was slightly better than the darker category. This study shows the necessity and usefulness of multivariate statistical techniques in both wine quality evaluation and parameter selection.
Hayslett, H T
1991-01-01
Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the
Prefractionation techniques in proteome analysis.
Righetti, Pier Giorgio; Castagna, Annalisa; Herbert, Ben; Reymond, Frederic; Rossier, Joël S
2003-08-01
The present review deals with a number of prefractionation protocols in preparation for two-dimensional map analysis, both in the fields of chromatography and in the field of electrophoresis. In the first case, Fountoulaki's groups has reported just about any chromatographic procedure useful as a prefractionation step, including affinity, ion-exchange, and reversed-phase resins. As a result of the various enrichment steps, several hundred new species, previously undetected in unfractionated samples, could be revealed for the first time. Electrophoretic prefractionation protocols include all those electrokinetic methodologies which are performed in free solution, essentially all relying on isoelectric focusing steps. The devices here reviewed include multichamber apparatus, such as the multicompartment electrolyzer with Immobiline membranes, Off-Gel electrophoresis in a multicup device and the Rotofor, an instrument also based on a multichamber system but exploiting the conventional technique of carrier-ampholyte-focusing. Other instruments of interest are the Octopus, a continuous-flow device for isoelectric focusing in a upward flowing liquid curtain, and the Gradiflow, where different pI cuts are obtained by a multistep passage through two compartments buffered at different pH values. It is felt that this panoply of methods could offer a strong step forward in "mining below the tip of the iceberg" for detecting the "unseen proteome".
Statistical analysis of life history calendar data.
Eerola, Mervi; Helske, Satu
2016-04-01
The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare the model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, we estimate instead of transition hazards the cumulative prediction probabilities of life events in the entire trajectory. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis; sequence analysis can effectively find typical and atypical life patterns while event history analysis is needed for causal inquiries.
Statistical analysis of Contact Angle Hysteresis
Janardan, Nachiketa; Panchagnula, Mahesh
2015-11-01
We present the results of a new statistical approach to determining Contact Angle Hysteresis (CAH) by studying the nature of the triple line. A statistical distribution of local contact angles on a random three-dimensional drop is used as the basis for this approach. Drops with randomly shaped triple lines but of fixed volumes were deposited on a substrate and their triple line shapes were extracted by imaging. Using a solution developed by Prabhala et al. (Langmuir, 2010), the complete three dimensional shape of the sessile drop was generated. A distribution of the local contact angles for several such drops but of the same liquid-substrate pairs is generated. This distribution is a result of several microscopic advancing and receding processes along the triple line. This distribution is used to yield an approximation of the CAH associated with the substrate. This is then compared with measurements of CAH by means of a liquid infusion-withdrawal experiment. Static measurements are shown to be sufficient to measure quasistatic contact angle hysteresis of a substrate. The approach also points towards the relationship between microscopic triple line contortions and CAH.
A survey of image processing techniques and statistics for ballistic specimens in forensic science.
Gerules, George; Bhatia, Sanjiv K; Jackson, Daniel E
2013-06-01
This paper provides a review of recent investigations on the image processing techniques used to match spent bullets and cartridge cases. It is also, to a lesser extent, a review of the statistical methods that are used to judge the uniqueness of fired bullets and spent cartridge cases. We review 2D and 3D imaging techniques as well as many of the algorithms used to match these images. We also provide a discussion of the strengths and weaknesses of these methods for both image matching and statistical uniqueness. The goal of this paper is to be a reference for investigators and scientists working in this field.
Supermarket Analysis Based On Product Discount and Statistics
Directory of Open Access Journals (Sweden)
Komal Kumawat
2014-03-01
Full Text Available E-commerce has been growing rapidly. Its domain can provide all the right ingredients for successful data mining and it is a significant domain of data mining. E commerce refers to buying and selling of products or services over electronic systems such as internet. Various e commerce systems give discount on product and allow user to buy product online. The basic idea used here is to predict the product sale based on discount applied to the product. Our analysis concentrates on how customer behaves when discount is allotted to him. We have developed a model which finds the customer behaviour when discount is applied to the product. This paper elaborates upon how a different technique like session, click stream is used to collect user data online based on discount applied to the product and how statistics is applied to data set to see the variation in the data.
Statistical analysis of $k$-nearest neighbor collaborative recommendation
Biau, Gérard; Rouvière, Laurent; 10.1214/09-AOS759
2010-01-01
Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of user...
Statistical simulation and counterfactual analysis in social sciences
Directory of Open Access Journals (Sweden)
François Gélineau
2012-06-01
Full Text Available In this paper, we present statistical simulation techniques of interest in substantial interpretation of regression results. Taking stock of recent literature on causality, we argue that such techniques can operate within a counterfactual framework. To illustrate, we report findings using post-electoral data on voter turnout.
Book review: Statistical Analysis and Modelling of Spatial Point Patterns
DEFF Research Database (Denmark)
Møller, Jesper
2009-01-01
Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912......Statistical Analysis and Modelling of Spatial Point Patterns by J. Illian, A. Penttinen, H. Stoyan and D. Stoyan. Wiley (2008), ISBN 9780470014912...
Statistical Modelling of Wind Proles - Data Analysis and Modelling
DEFF Research Database (Denmark)
Jónsson, Tryggvi; Pinson, Pierre
The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Directory of Open Access Journals (Sweden)
Vujović Svetlana R.
2013-01-01
Full Text Available This paper illustrates the utility of multivariate statistical techniques for analysis and interpretation of water quality data sets and identification of pollution sources/factors with a view to get better information about the water quality and design of monitoring network for effective management of water resources. Multivariate statistical techniques, such as factor analysis (FA/principal component analysis (PCA and cluster analysis (CA, were applied for the evaluation of variations and for the interpretation of a water quality data set of the natural water bodies obtained during 2010 year of monitoring of 13 parameters at 33 different sites. FA/PCA attempts to explain the correlations between the observations in terms of the underlying factors, which are not directly observable. Factor analysis is applied to physico-chemical parameters of natural water bodies with the aim classification and data summation as well as segmentation of heterogeneous data sets into smaller homogeneous subsets. Factor loadings were categorized as strong and moderate corresponding to the absolute loading values of >0.75, 0.75-0.50, respectively. Four principal factors were obtained with Eigenvalues >1 summing more than 78 % of the total variance in the water data sets, which is adequate to give good prior information regarding data structure. Each factor that is significantly related to specific variables represents a different dimension of water quality. The first factor F1 accounting for 28 % of the total variance and represents the hydrochemical dimension of water quality. The second factor F2 accounting for 18% of the total variance and may be taken factor of water eutrophication. The third factor F3 accounting 17 % of the total variance and represents the influence of point sources of pollution on water quality. The fourth factor F4 accounting 13 % of the total variance and may be taken as an ecological dimension of water quality. Cluster analysis (CA is an
Classification of human colonic tissues using FTIR spectra and advanced statistical techniques
Zwielly, A.; Argov, S.; Salman, A.; Bogomolny, E.; Mordechai, S.
2010-04-01
One of the major public health hazards is colon cancer. There is a great necessity to develop new methods for early detection of cancer. If colon cancer is detected and treated early, cure rate of more than 90% can be achieved. In this study we used FTIR microscopy (MSP), which has shown a good potential in the last 20 years in the fields of medical diagnostic and early detection of abnormal tissues. Large database of FTIR microscopic spectra was acquired from 230 human colonic biopsies. Five different subgroups were included in our database, normal and cancer tissues as well as three stages of benign colonic polyps, namely, mild, moderate and severe polyps which are precursors of carcinoma. In this study we applied advanced mathematical and statistical techniques including principal component analysis (PCA) and linear discriminant analysis (LDA), on human colonic FTIR spectra in order to differentiate among the mentioned subgroups' tissues. Good classification accuracy between normal, polyps and cancer groups was achieved with approximately 85% success rate. Our results showed that there is a great potential of developing FTIR-micro spectroscopy as a simple, reagent-free viable tool for early detection of colon cancer in particular the early stages of premalignancy among the benign colonic polyps.
Statistical methods for categorical data analysis
Powers, Daniel
2008-01-01
This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/
Statistical analysis: the need, the concept, and the usage
Directory of Open Access Journals (Sweden)
Naduvilath Thomas
1998-01-01
Full Text Available In general, better understanding of the need and usage of statistics would benefit the medical community in India. This paper explains why statistical analysis is needed, and what is the conceptual basis for it. Ophthalmic data are used as examples. The concept of sampling variation is explained to further corroborate the need for statistical analysis in medical research. Statistical estimation and testing of hypothesis which form the major components of statistical inference are construed. Commonly reported univariate and multivariate statistical tests are explained in order to equip the ophthalmologist with basic knowledge of statistics for better understanding of research data. It is felt that this understanding would facilitate well designed investigations ultimately leading to higher quality practice of ophthalmology in our country.
Neiroukh, Osama
2011-01-01
A new approach for enhancing the process-variation tolerance of digital circuits is described. We extend recent advances in statistical timing analysis into an optimization framework. Our objective is to reduce the performance variance of a technology-mapped circuit where delays across elements are represented by random variables which capture the manufacturing variations. We introduce the notion of statistical critical paths, which account for both means and variances of performance variation. An optimization engine is used to size gates with a goal of reducing the timing variance along the statistical critical paths. We apply a pair of nested statistical analysis methods deploying a slower more accurate approach for tracking statistical critical paths and a fast engine for evaluation of gate size assignments. We derive a new approximation for the max operation on random variables which is deployed for the faster inner engine. Circuit optimization is carried out using a gain-based algorithm that terminates w...
Statistical analysis of concrete quality testing results
Directory of Open Access Journals (Sweden)
Jevtić Dragica
2014-01-01
Full Text Available This paper statistically investigates the testing results of compressive strength and density of control concrete specimens tested in the Laboratory for materials, Faculty of Civil Engineering, University of Belgrade, during 2012. The total number of 4420 concrete specimens were tested, which were sampled on different locations - either on concrete production site (concrete plant, or concrete placement location (construction site. To be exact, these samples were made of concrete which was produced on 15 concrete plants, i.e. placed in at 50 different reinforced concrete structures, built during 2012 by 22 different contractors. It is a known fact that the achieved values of concrete compressive strength are very important, both for quality and durability assessment of concrete inside the structural elements, as well as for calculation of their load-bearing capacity limit. Together with the compressive strength testing results, the data concerning requested (designed concrete class, matching between the designed and the achieved concrete quality, concrete density values and frequency of execution of concrete works during 2012 were analyzed.
Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.
A STATISTICAL CORRELATION TECHNIQUE AND A NEURAL-NETWORK FOR THE MOTION CORRESPONDENCE PROBLEM
VANDEEMTER, JH; MASTEBROEK, HAK
1994-01-01
A statistical correlation technique (SCT) and two variants of a neural network are presented to solve the motion correspondence problem. Solutions of the motion correspondence problem aim to maintain the identities of individuated elements as they move. In a preprocessing stage, two snapshots of a m
Statistical Techniques Used in Published Articles: A Historical Review of Reviews
Skidmore, Susan Troncoso; Thompson, Bruce
2010-01-01
The purpose of the present study is to provide a historical account and metasynthesis of which statistical techniques are most frequently used in the fields of education and psychology. Six articles reviewing the "American Educational Research Journal" from 1969 to 1997 and five articles reviewing the psychological literature from 1948 to 2001…
Nuclear Technology. Course 26: Metrology. Module 27-7, Statistical Techniques in Metrology.
Espy, John; Selleck, Ben
This seventh in a series of eight modules for a course titled Metrology focuses on descriptive and inferential statistical techniques in metrology. The module follows a typical format that includes the following sections: (1) introduction, (2) module prerequisites, (3) objectives, (4) notes to instructor/student, (5) subject matter, (6) materials…
Novick, Melvin R.
This project is concerned with the development and implementation of some new statistical techniques that will facilitate a continuing input of information about the student to the instructional manager so that individualization of instruction can be managed effectively. The source of this informational input is typically a short…
Statistical Smoothing Methods and Image Analysis
1988-12-01
83 - 111. Rosenfeld, A. and Kak, A.C. (1982). Digital Picture Processing. Academic Press,Qrlando. Serra, J. (1982). Image Analysis and Mat hematical ...hypothesis testing. IEEE Trans. Med. Imaging, MI-6, 313-319. Wicksell, S.D. (1925) The corpuscle problem. A mathematical study of a biometric problem
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
2002-01-01
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
Statistical inference of Minimum Rank Factor Analysis
Shapiro, A; Ten Berge, JMF
For any given number of factors, Minimum Rank Factor Analysis yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the
The Statistical Analysis of Failure Time Data
Kalbfleisch, John D
2011-01-01
Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns.Introduces the martingale and counting process formulation swil lbe in a new chapter.Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi Markov formulations.Presents new examples and applications of data analysis.
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprehends 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology
Energy Technology Data Exchange (ETDEWEB)
de Supinski, B R; Miller, B P; Liblit, B
2011-09-13
Petascale platforms with O(10{sup 5}) and O(10{sup 6}) processing cores are driving advancements in a wide range of scientific disciplines. These large systems create unprecedented application development challenges. Scalable correctness tools are critical to shorten the time-to-solution on these systems. Currently, many DOE application developers use primitive manual debugging based on printf or traditional debuggers such as TotalView or DDT. This paradigm breaks down beyond a few thousand cores, yet bugs often arise above that scale. Programmers must reproduce problems in smaller runs to analyze them with traditional tools, or else perform repeated runs at scale using only primitive techniques. Even when traditional tools run at scale, the approach wastes substantial effort and computation cycles. Continued scientific progress demands new paradigms for debugging large-scale applications. The Correctness on Petascale Systems (CoPS) project is developing a revolutionary debugging scheme that will reduce the debugging problem to a scale that human developers can comprehend. The scheme can provide precise diagnoses of the root causes of failure, including suggestions of the location and the type of errors down to the level of code regions or even a single execution point. Our fundamentally new strategy combines and expands three relatively new complementary debugging approaches. The Stack Trace Analysis Tool (STAT), a 2011 R&D 100 Award Winner, identifies behavior equivalence classes in MPI jobs and highlights behavior when elements of the class demonstrate divergent behavior, often the first indicator of an error. The Cooperative Bug Isolation (CBI) project has developed statistical techniques for isolating programming errors in widely deployed code that we will adapt to large-scale parallel applications. Finally, we are developing a new approach to parallelizing expensive correctness analyses, such as analysis of memory usage in the Memgrind tool. In the first two
Statistic analysis of millions of digital photos
Wueller, Dietmar; Fageth, Reiner
2008-02-01
The analysis of images has always been an important aspect in the quality enhancement of photographs and photographic equipment. Due to the lack of meta data it was mostly limited to images taken by experts under predefined conditions and the analysis was also done by experts or required psychophysical tests. With digital photography and the EXIF1 meta data stored in the images, a lot of information can be gained from a semiautomatic or automatic image analysis if one has access to a large number of images. Although home printing is becoming more and more popular, the European market still has a few photofinishing companies who have access to a large number of images. All printed images are stored for a certain period of time adding up to several million images on servers every day. We have utilized the images to answer numerous questions and think that these answers are useful for increasing image quality by optimizing the image processing algorithms. Test methods can be modified to fit typical user conditions and future developments can be pointed towards ideal directions.
Kotula, Paul G; Keenan, Michael R
2006-12-01
Multivariate statistical analysis methods have been applied to scanning transmission electron microscopy (STEM) energy-dispersive X-ray spectral images. The particular application of the multivariate curve resolution (MCR) technique provides a high spectral contrast view of the raw spectral image. The power of this approach is demonstrated with a microelectronics failure analysis. Specifically, an unexpected component describing a chemical contaminant was found, as well as a component consistent with a foil thickness change associated with the focused ion beam specimen preparation process. The MCR solution is compared with a conventional analysis of the same spectral image data set.
EXTREME PROGRAMMING PROJECT PERFORMANCE MANAGEMENT BY STATISTICAL EARNED VALUE ANALYSIS
Wei Lu; Li Lu
2013-01-01
As an important project type of Agile Software Development, the performance evaluation and prediction for eXtreme Programming project has significant meanings. Targeting on the short release life cycle and concurrent multitask features, a statistical earned value analysis model is proposed. Based on the traditional concept of earned value analysis, the statistical earned value analysis model introduced Elastic Net regression function and Laplacian hierarchical model to construct a Bayesian El...
An analysis of radio pulsar nulling statistics
Biggs, James D.
1992-01-01
Survival analysis methods are used to seek correlations between the fraction of null pulsars and other pulsar characteristics for an ensemble of 72 radio pulsars. The strongest correlation is found between the null fraction and the pulse period, suggesting that nulling is a manifestation of a faltering emission mechanism. Correlations are also found between the fraction of null pulses and other parameters that have a strong dependence on the pulse period. The results presented here suggest that nulling is broad-band and may ultimately be explained in terms of polar cap models of pulsar emission.
CORSSA: The Community Online Resource for Statistical Seismicity Analysis
Michael, Andrew J.; Wiemer, Stefan
2010-01-01
Statistical seismology is the application of rigorous statistical methods to earthquake science with the goal of improving our knowledge of how the earth works. Within statistical seismology there is a strong emphasis on the analysis of seismicity data in order to improve our scientific understanding of earthquakes and to improve the evaluation and testing of earthquake forecasts, earthquake early warning, and seismic hazards assessments. Given the societal importance of these applications, statistical seismology must be done well. Unfortunately, a lack of educational resources and available software tools make it difficult for students and new practitioners to learn about this discipline. The goal of the Community Online Resource for Statistical Seismicity Analysis (CORSSA) is to promote excellence in statistical seismology by providing the knowledge and resources necessary to understand and implement the best practices, so that the reader can apply these methods to their own research. This introduction describes the motivation for and vision of CORRSA. It also describes its structure and contents.
Regression Commonality Analysis: A Technique for Quantitative Theory Building
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Evaluation of Meterorite Amono Acid Analysis Data Using Multivariate Techniques
McDonald, G.; Storrie-Lombardi, M.; Nealson, K.
1999-01-01
The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid compostion of 101 terrestrial protein superfamilies.
Directory of Open Access Journals (Sweden)
Farooq Ahmad
2011-12-01
Full Text Available Multivariate statistical techniques such as factor analysis (FA, cluster analysis (CA and discriminant analysis (DA, were applied for the evaluation of spatial variations and the interpretation of a large complex water quality data set of three cities (Lahore, Gujranwala and Sialkot in Punjab, Pakistan. 16 parameters of water samples collected from nine different sampling stations of each city were determined. Factor analysis indicates five factors, which explained 74% of the total variance in water quality data set. Five factors are salinization, alkalinity, temperature, domestic waste and chloride, which explained 31.1%, 14.3%, 10.6%, 10.0% and 8.0% of the total variance respectively. Hierarchical cluster analysis grouped nine sampling stations of each city into three clusters, i.e., relatively less polluted (LP, and moderately polluted (MP and highly polluted (HP sites, based on the similarity of water quality characteristics. Discriminant analysis (DA identified ten significant parameters (Calcium (Ca, Ammonia, Sulphate, Sodium (Na, electrical conductivity (EC, chloride, temperature (Temp, total hardness(TH, Turbidity, which discriminate the groundwater quality of three cities, with close to 100.0% correct assignment for spatial variations. This study illustrates the benefit of multivariate statistical techniques for interpreting complex data sets in the analysis of spatial variations in water quality, and to plan for future studies.
Marinović Ruždjak, Andrea; Ruždjak, Domagoj
2015-04-01
For the evaluation of seasonal and spatial variations and the interpretation of a large and complex water quality dataset obtained during a 7-year monitoring program of the Sava River in Croatia, different multivariate statistical techniques were applied in this study. Basic statistical properties and correlations of 18 water quality parameters (variables) measured at 18 sampling sites (a total of 56,952 values) were examined. Correlations between air temperature and some water quality parameters were found in agreement with the previous studies of relationship between climatic and hydrological parameters. Principal component analysis (PCA) was used to explore the most important factors determining the spatiotemporal dynamics of the Sava River. PCA has determined a reduced number of seven principal components that explain over 75 % of the data set variance. The results revealed that parameters related to temperature and organic pollutants (CODMn and TSS) were the most important parameters contributing to water quality variation. PCA analysis of seasonal subsets confirmed this result and showed that the importance of parameters is changing from season to season. PCA of the four seasonal data subsets yielded six PCs with eigenvalues greater than one explaining 73.6 % (spring), 71.4 % (summer), 70.3 % (autumn), and 71.3 % (winter) of the total variance. To check the influence of the outliers in the data set whose distribution strongly deviates from the normal one, in addition to standard principal component analysis algorithm, two robust estimates of covariance matrix were calculated and subjected to PCA. PCA in both cases yielded seven principal components explaining 75 % of the total variance, and the results do not differ significantly from the results obtained by the standard PCA algorithm. With the implementation of robust PCA algorithm, it is demonstrated that the usage of standard algorithm is justified for data sets with small numbers of missing data
Improved statistics for genome-wide interaction analysis.
Ueki, Masao; Cordell, Heather J
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al
Jain, S; Srinath, Ms; Narendra, C; Reddy, Sn; Sindhu, A
2010-10-01
The objective of this study was to evaluate the effect of formulation variables on the release properties, floating lag time, and hardness, when developing floating tablets of Ranitidine hydrochloride, by the statistical optimization technique. The formulations were prepared based on 3(2) factorial design, with polymer ratio (HPMC 100 KM: Xanthan gum) and the amount of aerosil, as two independent formulation variables. The four dependent (response) variables considered were: percentage of drug release at the first hour, T(50%) (time taken to release 50% of the drug), floating lag time, and hardness of the tablet. The release profile data was subjected to a curve fitting analysis, to describe the release mechanism of the drug from the floating tablet. An increase in drug release was observed with an increase in the polymer ratio, and as the amount of aerosil increased, the hardness of the tablet also increased, without causing any change in the floating lag time. The desirability function was used to optimize the response variables, each having a different target, and the observed responses were in accordance with the experimental values. The results demonstrate the feasibility of the model in the development of floating tablets containing Ranitidine hydrochloride.
Ofoghi, Bahadorreza; Zeleznikow, John; Dwyer, Dan; Macmahon, Clare
2013-01-01
This article describes the utilisation of an unsupervised machine learning technique and statistical approaches (e.g., the Kolmogorov-Smirnov test) that assist cycling experts in the crucial decision-making processes for athlete selection, training, and strategic planning in the track cycling Omnium. The Omnium is a multi-event competition that will be included in the summer Olympic Games for the first time in 2012. Presently, selectors and cycling coaches make decisions based on experience and intuition. They rarely have access to objective data. We analysed both the old five-event (first raced internationally in 2007) and new six-event (first raced internationally in 2011) Omniums and found that the addition of the elimination race component to the Omnium has, contrary to expectations, not favoured track endurance riders. We analysed the Omnium data and also determined the inter-relationships between different individual events as well as between those events and the final standings of riders. In further analysis, we found that there is no maximum ranking (poorest performance) in each individual event that riders can afford whilst still winning a medal. We also found the required times for riders to finish the timed components that are necessary for medal winning. The results of this study consider the scoring system of the Omnium and inform decision-making toward successful participation in future major Omnium competitions.
A statistical analysis of UK financial networks
Chu, J.; Nadarajah, S.
2017-04-01
In recent years, with a growing interest in big or large datasets, there has been a rise in the application of large graphs and networks to financial big data. Much of this research has focused on the construction and analysis of the network structure of stock markets, based on the relationships between stock prices. Motivated by Boginski et al. (2005), who studied the characteristics of a network structure of the US stock market, we construct network graphs of the UK stock market using same method. We fit four distributions to the degree density of the vertices from these graphs, the Pareto I, Fréchet, lognormal, and generalised Pareto distributions, and assess the goodness of fit. Our results show that the degree density of the complements of the market graphs, constructed using a negative threshold value close to zero, can be fitted well with the Fréchet and lognormal distributions.
STATISTICAL BAYESIAN ANALYSIS OF EXPERIMENTAL DATA.
Directory of Open Access Journals (Sweden)
AHLAM LABDAOUI
2012-12-01
Full Text Available The Bayesian researcher should know the basic ideas underlying Bayesian methodology and the computational tools used in modern Bayesian econometrics. Some of the most important methods of posterior simulation are Monte Carlo integration, importance sampling, Gibbs sampling and the Metropolis- Hastings algorithm. The Bayesian should also be able to put the theory and computational tools together in the context of substantive empirical problems. We focus primarily on recent developments in Bayesian computation. Then we focus on particular models. Inevitably, we combine theory and computation in the context of particular models. Although we have tried to be reasonably complete in terms of covering the basic ideas of Bayesian theory and the computational tools most commonly used by the Bayesian, there is no way we can cover all the classes of models used in econometrics. We propose to the user of analysis of variance and linear regression model.
A statistical technique for defining rainfall forecast probabilities in southern Africa
Husak, G. J.; Magadzire, T.
2010-12-01
Probabilistic forecasts are currently produced by many climate centers and by just as many processes. These models use a number of inputs to generate the probability of rainfall falling in the lower/middle/upper tercile (frequently termed “below-normal”/”normal”/”above-normal”) of the historical rainfall distribution. Generation of forecasts for a season may be the result of a dynamic climate model, a statistical model, a consensus of a panel of experts, or a combination of some of the afore-mentioned techniques, among others. This last method is one most commonly accepted in Southern Africa, resulting from the Southern Africa Regional Climate Outlook Forum (SARCOF). While it has been noted that there is a reasonable chance of polygons having a dominant tercile of 60% probability or more, this has seldom been the case. Indeed, over the last four years, the SARCOF process has not produced any polygons with such a likelihood. In fact, the terciles in all of the SARCOFs since 2007 have been some combination of 40%, 35% and 25%. Discussions with SARCOF scientists suggests that the SARCOF process is currently using the probabilistic format to define relatively qualitative, rank-ordered outcomes in the format “most-likely”, “second-most likely” and “least likely” terciles. While this rank-ordered classification has its merits, it limits the sort of downstream quantitative statistical analysis that could potentially be of assistance to various decision makers. In this study we build a simple statistical model to analyze the probabilistic outcomes for the coming rainfall season, and analyze their resulting probabilities. The prediction model takes a similar approach to that already used in the SARCOF process: namely, using available historic rainfall data and SST information to create a linear regression between rainfall and SSTs, define a likely rainfall outcome, and analyze the cross-validation errors over the most recent 30 years. The cross
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
QRS DETECTION OF ECG - A STATISTICAL ANALYSIS
Directory of Open Access Journals (Sweden)
I.S. Siva Rao
2015-03-01
Full Text Available Electrocardiogram (ECG is a graphical representation generated by heart muscle. ECG plays an important role in diagnosis and monitoring of heart’s condition. The real time analyzer based on filtering, beat recognition, clustering, classification of signal with maximum few seconds delay can be done to recognize the life threatening arrhythmia. ECG signal examines and study of anatomic and physiologic facets of the entire cardiac muscle. The inceptive task for proficient scrutiny is the expulsion of noise. It is attained by the use of wavelet transform analysis. Wavelets yield temporal and spectral information concurrently and offer stretchability with a possibility of wavelet functions of different properties. This paper is concerned with the extraction of QRS complexes of ECG signals using Discrete Wavelet Transform based algorithms aided with MATLAB. By removing the inconsistent wavelet transform coefficient, denoising is done in ECG signal. In continuation, QRS complexes are identified and in which each peak can be utilized to discover the peak of separate waves like P and T with their derivatives. Here we put forth a new combinatory algorithm builded on using Pan-Tompkins' method and multi-wavelet transform.
Analysis of linear weighted order statistics CFAR algorithm
Institute of Scientific and Technical Information of China (English)
孟祥伟; 关键; 何友
2004-01-01
CFAR technique is widely used in radar targets detection fields. Traditional algorithm is cell averaging (CA),which can give a good detection performance in a relatively ideal environment. Recently, censoring technique is adopted to make the detector perform robustly. Ordered statistic (OS) and trimmed mean (TM) methods are proposed. TM methods treat the reference samples which participate in clutter power estimates equally, but this processing will not realize the effective estimates of clutter power. Therefore, in this paper a quasi best weighted (Q BW) order statistics algorithm is presented. In special cases, QBW reduces to CA and the censored mean level detector (CMLD).
Guidelines for Statistical Analysis of Percentage of Syllables Stuttered Data
Jones, Mark; Onslow, Mark; Packman, Ann; Gebski, Val
2006-01-01
Purpose: The purpose of this study was to develop guidelines for the statistical analysis of percentage of syllables stuttered (%SS) data in stuttering research. Method; Data on %SS from various independent sources were used to develop a statistical model to describe this type of data. On the basis of this model, %SS data were simulated with…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
Explorations in Statistics: The Analysis of Ratios and Normalized Data
Curran-Everett, Douglas
2013-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This ninth installment of "Explorations in Statistics" explores the analysis of ratios and normalized--or standardized--data. As researchers, we compute a ratio--a numerator divided by a denominator--to compute a…
Attitudes and Achievement in Statistics: A Meta-Analysis Study
Emmioglu, Esma; Capa-Aydin, Yesim
2012-01-01
This study examined the relationships among statistics achievement and four components of attitudes toward statistics (Cognitive Competence, Affect, Value, and Difficulty) as assessed by the SATS. Meta-analysis results revealed that the size of relationships differed by the geographical region in which the studies were conducted as well as by the…
The Importance of Statistical Modeling in Data Analysis and Inference
Rollins, Derrick, Sr.
2017-01-01
Statistical inference simply means to draw a conclusion based on information that comes from data. Error bars are the most commonly used tool for data analysis and inference in chemical engineering data studies. This work demonstrates, using common types of data collection studies, the importance of specifying the statistical model for sound…
Spectral signature verification using statistical analysis and text mining
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment; the textual meta-data and numerical spectral data. Results associated with the spectral data stored in the Signature Database1 (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes to understand the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security2 (text encryption/decryption), biomedical3 , and marketing4 applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is
STATISTICAL INFERENCES FOR VARYING-COEFFICINT MODELS BASED ON LOCALLY WEIGHTED REGRESSION TECHNIQUE
Institute of Scientific and Technical Information of China (English)
梅长林; 张文修; 梁怡
2001-01-01
Some fundamental issues on statistical inferences relating to varying-coefficient regression models are addressed and studied. An exact testing procedure is proposed for checking the goodness of fit of a varying-coefficient model fired by the locally weighted regression technique versus an ordinary linear regression model. Also, an appropriate statistic for testing variation of model parameters over the locations where the observations are collected is constructed and a formal testing approach which is essential to exploring spatial non-stationarity in geography science is suggested.
Classification of Malaysia aromatic rice using multivariate statistical analysis
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC-MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Classification of Malaysia aromatic rice using multivariate statistical analysis
Energy Technology Data Exchange (ETDEWEB)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)
2015-05-15
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Development of statistical models for data analysis
Energy Technology Data Exchange (ETDEWEB)
Downham, D.Y.
2000-07-01
Incidents that cause, or could cause, injury to personnel, and that satisfy specific criteria, are reported to the Offshore Safety Division (OSD) of the Health and Safety Executive (HSE). The underlying purpose of this report is to improve ways of quantifying risk, a recommendation in Lord Cullen's report into the Piper Alpha disaster. Records of injuries and hydrocarbon releases from 1 January, 1991, to 31 March 1996, are analysed, because the reporting of incidents was standardised after 1990. Models are identified for risk assessment and some are applied. The appropriate analyses of one or two factors (or variables) are tests of uniformity or of independence. Radar graphs are used to represent some temporal variables. Cusums are applied for the analysis of incident frequencies over time, and could be applied for regular monitoring. Log-linear models for Poisson-distributed data are identified as being suitable for identifying 'non-random' combinations of more than two factors. Some questions cannot be addressed with the available data: for example, more data are needed to assess the risk of injury per employee in a time interval. If the questions are considered sufficiently important, resources could be assigned to obtain the data. Some of the main results from the analyses are as follows: the cusum analyses identified a change-point at the end of July 1993, when the reported number of injuries reduced by 40%. Injuries were more likely to occur between 8am and 12am or between 2pm and 5pm than at other times: between 2pm and 3pm the number of injuries was almost twice the average and was more than three fold the smallest. No seasonal effects in the numbers of injuries were identified. Three-day injuries occurred more frequently on the 5th, 6th and 7th days into a tour of duty than on other days. Three-day injuries occurred less frequently on the 13th and 14th days of a tour of duty. An injury classified as 'lifting or craning' was
Abbas Alkarkhi, F M; Ismail, Norli; Easa, Azhar Mat
2008-02-11
Cockles (Anadara granosa) sample obtained from two rivers in the Penang State of Malaysia were analyzed for the content of arsenic (As) and heavy metals (Cr, Cd, Zn, Cu, Pb, and Hg) using a graphite flame atomic absorption spectrometer (GF-AAS) for Cr, Cd, Zn, Cu, Pb, As and cold vapor atomic absorption spectrometer (CV-AAS) for Hg. The two locations of interest with 20 sampling points of each location were Kuala Juru (Juru River) and Bukit Tambun (Jejawi River). Multivariate statistical techniques such as multivariate analysis of variance (MANOVA) and discriminant analysis (DA) were applied for analyzing the data. MANOVA showed a strong significant difference between the two rivers in term of As and heavy metals contents in cockles. DA gave the best result to identify the relative contribution for all parameters in discriminating (distinguishing) the two rivers. It provided an important data reduction as it used only two parameters (Zn and Cd) affording more than 72% correct assignations. Results indicated that the two rivers were different in terms of As and heavy metal contents in cockle, and the major difference was due to the contribution of Zn and Cd. A positive correlation was found between discriminate functions (DF) and Zn, Cd and Cr, whereas negative correlation was exhibited with other heavy metals. Therefore, DA allowed a reduction in the dimensionality of the data set, delineating a few indicator parameters responsible for large variations in heavy metals and arsenic content. Taking into account of these results, it can be suggested that a continuous monitoring of As and heavy metals in cockles be performed in these two rivers.
Data Analysis & Statistical Methods for Command File Errors
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis, and maximum likelihood estimation to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on the what these statistics bore out as critical drivers to the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
Statistical analysis of the operating parameters which affect cupola emissions
Energy Technology Data Exchange (ETDEWEB)
Davis, J.W.; Draper, A.B.
1977-12-01
A sampling program was undertaken to determine the operating parameters which affected air pollution emission from gray iron foundry cupolas. The experimental design utilized the analysis of variance routine. Four independent variables were selected for examination on the basis of previous work reported in the literature. These were: (1) blast rate; (2) iron-coke ratio; (3) blast temperature; and (4) cupola size. The last variable was chosen since it most directly affects melt rate. Emissions from cupolas for which concern has been expressed are particle matter and carbon monoxide. The dependent variables were, therefore, particle loading, particle size distribution, and carbon monoxide concentration. Seven production foundries were visited and samples taken under conditions prescribed by the experimental plan. The data obtained from these tests were analyzed using the analysis of variance and other statistical techniques where applicable. The results indicated that blast rate, blast temperature, and cupola size affected particle emissions and the latter two also affected the particle size distribution. The particle size information was also unique in that it showed a consistent particle size distribution at all seven foundaries with a sizable fraction of the particles less than 1.0 micrometers in diameter.
Practical application and statistical analysis of titrimetric monitoring ...
African Journals Online (AJOL)
Practical application and statistical analysis of titrimetric monitoring of water and ... The resulting raw data were further processed with an Excel-based program. ... As such the type of component and the concentration can be determined.
Statistical Analysis of the Exchange Rate of Bitcoin: e0133678
National Research Council Canada - National Science Library
Jeffrey Chu; Saralees Nadarajah; Stephen Chan
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar...
Propensity Score Analysis: An Alternative Statistical Approach for HRD Researchers
Keiffer, Greggory L.; Lane, Forrest C.
2016-01-01
Purpose: This paper aims to introduce matching in propensity score analysis (PSA) as an alternative statistical approach for researchers looking to make causal inferences using intact groups. Design/methodology/approach: An illustrative example demonstrated the varying results of analysis of variance, analysis of covariance and PSA on a heuristic…
Meta analysis a guide to calibrating and combining statistical evidence
Kulinskaya, Elena; Staudte, Robert G
2008-01-01
Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence acts as a source of basic methods for scientists wanting to combine evidence from different experiments. The authors aim to promote a deeper understanding of the notion of statistical evidence.The book is comprised of two parts - The Handbook, and The Theory. The Handbook is a guide for combining and interpreting experimental evidence to solve standard statistical problems. This section allows someone with a rudimentary knowledge in general statistics to apply the methods. The Theory provides the motivation, theory and results of simulation experiments to justify the methodology.This is a coherent introduction to the statistical concepts required to understand the authors' thesis that evidence in a test statistic can often be calibrated when transformed to the right scale.
Advanced data analysis in neuroscience integrating statistical and computational models
Durstewitz, Daniel
2017-01-01
This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Evaluation of drug-polymer solubility curves through formal statistical analysis
DEFF Research Database (Denmark)
Knopp, Matthias Manne; Olesen, Niels Erik; Holm, Per
2015-01-01
In this study, the influence of the preparation technique (ball milling, spray drying, and film casting) of a supersaturated amorphous dispersion on the quality of solubility determinations of indomethacin in polyvinylpyrrolidone was investigated by means of statistical analysis. After annealing......-milled mixtures was not consistent with those from spray drying and film casting, indicating fundamental differences between the preparation techniques. Through formal statistical analysis, the best combination of fit to the Flory-Huggins model and reproducibility of the measurements was analyzed. Ball milling...
Alvarez, Odalys Quevedo; Tagle, Margarita Edelia Villanueva; Pascual, Jorge L Gómez; Marín, Ma Teresa Larrea; Clemente, Ana Catalina Nuñez; Medina, Miriam Odette Cora; Palau, Raiza Rey; Alfonso, Mario Simeón Pomares
2014-10-01
Spatial and temporal variations of sediment quality in Matanzas Bay (Cuba) were studied by determining a total of 12 variables (Zn, Cu, Pb, As, Ni, Co, Al, Fe, Mn, V, CO₃²⁻, and total hydrocarbons (THC). Surface sediments were collected, annually, at eight stations during 2005-2008. Multivariate statistical techniques, such as principal component (PCA), cluster (CA), and lineal discriminant (LDA) analyses were applied for identification of the most significant variables influencing the environmental quality of sediments. Heavy metals (Zn, Cu, Pb, V, and As) and THC were the most significant species contributing to sediment quality variations during the sampling period. Concentrations of V and As were determined in sediments of this ecosystem for the first time. The variation of sediment environmental quality with the sampling period and the differentiation of samples in three groups along the bay were obtained. The usefulness of the multivariate statistical techniques employed for the environmental interpretation of a limited dataset was confirmed.
Shaikh, Muhammad Mujtaba; Memon, Abdul Jabbar; Hussain, Manzoor
2016-09-01
In this article, we describe details of the data used in the research paper "Confidence bounds for energy conservation in electric motors: An economical solution using statistical techniques" [1]. The data presented in this paper is intended to show benefits of high efficiency electric motors over the standard efficiency motors of similar rating in the industrial sector of Pakistan. We explain how the data was collected and then processed by means of formulas to show cost effectiveness of energy efficient motors in terms of three important parameters: annual energy saving, cost saving and payback periods. This data can be further used to construct confidence bounds for the parameters using statistical techniques as described in [1].
Statistical multiresolution analysis in amplitude-frequency domain
Institute of Scientific and Technical Information of China (English)
SUN Hong; GUAN Bao; Henri Maitre
2004-01-01
A concept of statistical multiresolution analysis in amplitude-frequency domain is proposed, which is to employ the wavelet transform on the statistical character of a signal in amplitude domain. In terms of the theorem of generalized ergodicity, an algorithm to estimate the transform coefficients based on the amplitude statistical multiresolution analysis (AMA) is presented. The principle of applying the AMA to Synthetic Aperture Radar (SAR) image processing is described, and the good experimental results imply that the AMA is an efficient tool for processing of speckled signals modeled by the multiplicative noise.
Energy Technology Data Exchange (ETDEWEB)
Park, Jinyong [Univ. of Arizona, Tucson, AZ (United States); Balasingham, P [Univ. of Arizona, Tucson, AZ (United States); McKenna, Sean Andrew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kulatilake, Pinnaduwa H.S.W. [Univ. of Arizona, Tucson, AZ (United States)
2004-09-01
Sandia National Laboratories, under contract to Nuclear Waste Management Organization of Japan (NUMO), is performing research on regional classification of given sites in Japan with respect to potential volcanic disruption using multivariate statistics and geo-statistical interpolation techniques. This report provides results obtained for hierarchical probabilistic regionalization of volcanism for the Sengan region in Japan by applying multivariate statistical techniques and geostatistical interpolation techniques on the geologic data provided by NUMO. A workshop report produced in September 2003 by Sandia National Laboratories (Arnold et al., 2003) on volcanism lists a set of most important geologic variables as well as some secondary information related to volcanism. Geologic data extracted for the Sengan region in Japan from the data provided by NUMO revealed that data are not available at the same locations for all the important geologic variables. In other words, the geologic variable vectors were found to be incomplete spatially. However, it is necessary to have complete geologic variable vectors to perform multivariate statistical analyses. As a first step towards constructing complete geologic variable vectors, the Universal Transverse Mercator (UTM) zone 54 projected coordinate system and a 1 km square regular grid system were selected. The data available for each geologic variable on a geographic coordinate system were transferred to the aforementioned grid system. Also the recorded data on volcanic activity for Sengan region were produced on the same grid system. Each geologic variable map was compared with the recorded volcanic activity map to determine the geologic variables that are most important for volcanism. In the regionalized classification procedure, this step is known as the variable selection step. The following variables were determined as most important for volcanism: geothermal gradient, groundwater temperature, heat discharge, groundwater
Scargle, J. D.
1982-12-01
Detection of a periodic signal hidden in noise is frequently a goal in astronomical data analysis. This paper does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced. This choice was made because, of the methods in current use, it appears to have the simplest statistical behavior. A modification of the classical definition of the periodogram is necessary in order to retain the simple statistical behavior of the evenly spaced case. With this modification, periodogram analysis and least-squares fitting of sine waves to the data are exactly equivalent. Certain difficulties with the use of the periodogram are less important than commonly believed in the case of detection of strictly periodic signals. In addition, the standard method for mitigating these difficulties (tapering) can be used just as well if the sampling is uneven. An analysis of the statistical significance of signal detections is presented, with examples
Basic statistical tools in research and data analysis
Ali, Zulfiqar; Bhaskar, S Bala
2016-01-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Basic statistical tools in research and data analysis.
Ali, Zulfiqar; Bhaskar, S Bala
2016-09-01
Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Basic statistical tools in research and data analysis
Directory of Open Access Journals (Sweden)
Zulfiqar Ali
2016-01-01
Full Text Available Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into a lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.
Numeric computation and statistical data analysis on the Java platform
Chekanov, Sergei V
2016-01-01
Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that a totally different criterion, labile trace element contents - hence thermal histories - or 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Attitude Exploration Using Factor Analysis Technique
Directory of Open Access Journals (Sweden)
Monika Raghuvanshi
2016-12-01
Full Text Available Attitude is a psychological variable that contains positive or negative evaluation about people or an environment. The growing generation possesses learning skills, so if positive attitude is inculcated at the right age, it might therefore become habitual. Students in the age group 14-20 years from the city of Bikaner, India, are the target population for this study. An inventory of 30Likert-type scale statements was prepared in order to measure attitude towards the environment and matters related to conservation. The primary data is collected though a structured questionnaire, using cluster sampling technique and analyzed using the IBM SPSS 23 statistical tool. Factor analysis is used to reduce 30 variables to a smaller number of more identifiable groups of variables. Results show that students “need more regulation and voluntary participation to protect the environment”, “need conservation of water and electricity”, “are concerned for undue wastage of water”, “need visible actions to protect the environment”, “need strengthening of the public transport system”, “are a little bit ignorant about the consequences of global warming”, “want prevention of water pollution by industries”, “need changing of personal habits to protect the environment”, and “don’t have firsthand experience of global warming”. Analysis revealed that nine factors obtained could explain about 58.5% variance in the attitude of secondary school students towards the environment in the city of Bikaner, India. The remaining 39.6% variance is attributed to other elements not explained by this analysis. A global campaign for improvement in attitude about environmental issues and its utility in daily lives may boost positive youth attitudes, potentially impacting worldwide. A cross-disciplinary approach may be developed by teaching along with other related disciplines such as science, economics, and social studies etc.
Quantile regression for the statistical analysis of immunological data with many non-detects
Eilers Paul HC; Röder Esther; Savelkoul Huub FJ; van Wijk Roy
2012-01-01
Abstract Background Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results Quantile regression, a genera...
Novel application of a statistical technique, Random Forests, in a bacterial source tracking study.
Smith, Amanda; Sterba-Boatwright, Blair; Mott, Joanna
2010-07-01
In this study, data from bacterial source tracking (BST) analysis using antibiotic resistance profiles were examined using two statistical techniques, Random Forests (RF) and discriminant analysis (DA) to determine sources of fecal contamination of a Texas water body. Cow Trap and Cedar Lakes are potential oyster harvesting waters located in Brazoria County, Texas, that have been listed as impaired for bacteria on the 2004 Texas 303(d) list. Unknown source Escherichia coli were isolated from water samples collected in the study area during two sampling events. Isolates were confirmed as E. coli using carbon source utilization profiles and then analyzed via ARA, following the Kirby-Bauer disk diffusion method. Zone diameters from ARA profiles were analyzed with both DA and RF. Using a two-way classification (human vs nonhuman), both DA and RF categorized over 90% of the 299 unknown source isolates as a nonhuman source. The average rates of correct classification (ARCCs) for the library of 1172 isolates using DA and RF were 74.6% and 82.3%, respectively. ARCCs from RF ranged from 7.7 to 12.0% higher than those from DA. Rates of correct classification (RCCs) for individual sources classified with RF ranged from 23.2 to 0.2% higher than those of DA, with a mean difference of 9.0%. Additional evidence for the outperformance of DA by RF was found in the comparison of training and test set ARCCs and examination of specific disputed isolates; RF produced higher ARCCs (ranging from 8 to 13% higher) than DA for all 1000 trials (excluding the two-way classification, in which RF outperformed DA 999 out of 1000 times). This is of practical significance for analysis of bacterial source tracking data. Overall, based on both DA and RF results, migratory birds were found to be the source of the largest portion of the unknown E. coli isolates. This study is the first known published application of Random Forests in the field of BST. Copyright 2010 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
M.A. Delavar
2016-02-01
Full Text Available Introduction: The accumulation of heavy metals (HMs in the soil is of increasing concern due to food safety issues, potential health risks, and the detrimental effects on soil ecosystems. HMs may be considered as the most important soil pollutants, because they are not biodegradable and their physical movement through the soil profile is relatively limited. Therefore, root uptake process may provide a big chance for these pollutants to transfer from the surface soil to natural and cultivated plants, which may eventually steer them to human bodies. The general behavior of HMs in the environment, especially their bioavailability in the soil, is influenced by their origin. Hence, source apportionment of HMs may provide some essential information for better management of polluted soils to restrict the HMs entrance to the human food chain. This paper explores the applicability of multivariate statistical techniques in the identification of probable sources that can control the concentration and distribution of selected HMs in the soils surrounding the Zanjan Zinc Specialized Industrial Town (briefly Zinc Town. Materials and Methods: The area under investigation has a size of approximately 4000 ha.It is located around the Zinc Town, Zanjan province. A regular grid sampling pattern with an interval of 500 meters was applied to identify the sample location, and 184 topsoil samples (0-10 cm were collected. The soil samples were air-dried and sieved through a 2 mm polyethylene sieve and then, were digested using HNO3. The total concentrations of zinc (Zn, lead (Pb, cadmium (Cd, Nickel (Ni and copper (Cu in the soil solutions were determined via Atomic Absorption Spectroscopy (AAS. Data were statistically analyzed using the SPSS software version 17.0 for Windows. Correlation Matrix (CM, Principal Component Analyses (PCA and Factor Analyses (FA techniques were performed in order to identify the probable sources of HMs in the studied soils. Results and
Emerging Trends and Statistical Analysis in Computational Modeling in Agriculture
Directory of Open Access Journals (Sweden)
Sunil Kumar
2015-03-01
Full Text Available In this paper the authors have tried to describe emerging trend in computational modelling used in the sphere of agriculture. Agricultural computational modelling with the use of intelligence techniques for computing the agricultural output by providing minimum input data to lessen the time through cutting down the multi locational field trials and also the labours and other inputs is getting momentum. Development of locally suitable integrated farming systems (IFS is the utmost need of the day, particularly in India where about 95% farms are under small and marginal holding size. Optimization of the size and number of the various enterprises to the desired IFS model for a particular set of agro-climate is essential components of the research to sustain the agricultural productivity for not only filling the stomach of the bourgeoning population of the country, but also to enhance the nutritional security and farms return for quality life. Review of literature pertaining to emerging trends in computational modelling applied in field of agriculture is done and described below for the purpose of understanding its trends mechanism behavior and its applications. Computational modelling is increasingly effective for designing and analysis of the system. Computa-tional modelling is an important tool to analyses the effect of different scenarios of climate and management options on the farming systems and its interaction among themselves. Further, authors have also highlighted the applications of computational modeling in integrated farming system, crops, weather, soil, climate, horticulture and statistical used in agriculture which can show the path to the agriculture researcher and rural farming community to replace some of the traditional techniques.
Statistical Analysis of Resistivity Anomalies Caused by Underground Caves
Frid, V.; Averbach, A.; Frid, M.; Dudkinski, D.; Liskevich, G.
2017-03-01
Geophysical prospecting of underground caves being performed on a construction site is often still a challenging procedure. Estimation of a likelihood level of an anomaly found is frequently a mandatory requirement of a project principal due to necessity of risk/safety assessment. However, the methodology of such estimation is not hitherto developed. Aiming to put forward such a methodology the present study (being performed as a part of an underground caves mapping prior to the land development on the site area) consisted of application of electrical resistivity tomography (ERT) together with statistical analysis utilized for the likelihood assessment of underground anomalies located. The methodology was first verified via a synthetic modeling technique and applied to the in situ collected ERT data and then crossed referenced with intrusive investigations (excavation and drilling) for the data verification. The drilling/excavation results showed that the proper discovering of underground caves can be done if anomaly probability level is not lower than 90 %. Such a probability value was shown to be consistent with the modeling results. More than 30 underground cavities were discovered on the site utilizing the methodology.
Directory of Open Access Journals (Sweden)
VIMALA C.
2015-05-01
Full Text Available In recent years, speech technology has become a vital part of our daily lives. Various techniques have been proposed for developing Automatic Speech Recognition (ASR system and have achieved great success in many applications. Among them, Template Matching techniques like Dynamic Time Warping (DTW, Statistical Pattern Matching techniques such as Hidden Markov Model (HMM and Gaussian Mixture Models (GMM, Machine Learning techniques such as Neural Networks (NN, Support Vector Machine (SVM, and Decision Trees (DT are most popular. The main objective of this paper is to design and develop a speaker-independent isolated speech recognition system for Tamil language using the above speech recognition techniques. The background of ASR system, the steps involved in ASR, merits and demerits of the conventional and machine learning algorithms and the observations made based on the experiments are presented in this paper. For the above developed system, highest word recognition accuracy is achieved with HMM technique. It offered 100% accuracy during training process and 97.92% for testing process.
Innovative Techniques Simplify Vibration Analysis
2010-01-01
In the early years of development, Marshall Space Flight Center engineers encountered challenges related to components in the space shuttle main engine. To assess the problems, they evaluated the effects of vibration and oscillation. To enhance the method of vibration signal analysis, Marshall awarded Small Business Innovation Research (SBIR) contracts to AI Signal Research, Inc. (ASRI), in Huntsville, Alabama. ASRI developed a software package called PC-SIGNAL that NASA now employs on a daily basis, and in 2009, the PKP-Module won Marshall s Software of the Year award. The technology is also used in many industries: aircraft and helicopter, rocket engine manufacturing, transportation, and nuclear power."
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obta
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic t
Statistical Analysis of Processes of Bankruptcy is in Ukraine
Berest Marina Nikolaevna
2012-01-01
The statistical analysis of processes of bankruptcy in Ukraine is conducted. Quantitative and high-quality indexes, characterizing efficiency of functioning of institute of bankruptcy of enterprises, are analyzed; the analysis of processes, related to bankruptcy of enterprises, being in state administration is conducted.
HistFitter software framework for statistical data analysis
Baak, M.; Côte, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-01-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with mu...
A Divergence Statistics Extension to VTK for Performance Analysis.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre; Bennett, Janine Camille
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
A Divergence Statistics Extension to VTK for Performance Analysis
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2015-02-01
This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical, "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.
Longitudinal data analysis a handbook of modern statistical methods
Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert
2008-01-01
Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint
BaTMAn: Bayesian Technique for Multi-image Analysis
Casado, J; García-Benito, R; Guidi, G; Choudhury, O S; Bellocchi, E; Sánchez, S; Díaz, A I
2016-01-01
This paper describes the Bayesian Technique for Multi-image Analysis (BaTMAn), a novel image segmentation technique based on Bayesian statistics, whose main purpose is to characterize an astronomical dataset containing spatial information and perform a tessellation based on the measurements and errors provided as input. The algorithm will iteratively merge spatial elements as long as they are statistically consistent with carrying the same information (i.e. signal compatible with being identical within the errors). We illustrate its operation and performance with a set of test cases that comprises both synthetic and real Integral-Field Spectroscopic (IFS) data. Our results show that the segmentations obtained by BaTMAn adapt to the underlying structure of the data, regardless of the precise details of their morphology and the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in those regions where the signal is actually con...
A novel statistic for genome-wide interaction analysis.
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
2010-09-23
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDRanalysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.
Comparing Techniques for Certified Static Analysis
Cachera, David; Pichardie, David
2009-01-01
A certified static analysis is an analysis whose semantic validity has been formally proved correct with a proof assistant. The recent increasing interest in using proof assistants for mechanizing programming language metatheory has given rise to several approaches for certification of static analysis. We propose a panorama of these techniques and compare their respective strengths and weaknesses.
Classification Techniques for Multivariate Data Analysis.
1980-03-28
analysis among biologists, botanists, and ecologists, while some social scientists may refer "typology". Other frequently encountered terms are pattern...the determinantal equation: lB -XW 0 (42) 49 The solutions X. are the eigenvalues of the matrix W-1 B 1 as in discriminant analysis. There are t non...Statistical Package for Social Sciences (SPSS) (14) subprogram FACTOR was used for the principal components analysis. It is designed both for the factor
Gaitán Fernández, E.; García Moreno, R.; Pino Otín, M. R.; Ribalaygua Batalla, J.
2012-04-01
Climate and soil are two of the most important limiting factors for agricultural production. Nowadays climate change has been documented in many geographical locations affecting different cropping systems. The General Circulation Models (GCM) has become important tools to simulate the more relevant aspects of the climate expected for the XXI century in the frame of climatic change. These models are able to reproduce the general features of the atmospheric dynamic but their low resolution (about 200 Km) avoids a proper simulation of lower scale meteorological effects. Downscaling techniques allow overcoming this problem by adapting the model outcomes to local scale. In this context, FIC (Fundación para la Investigación del Clima) has developed a statistical downscaling technique based on a two step analogue methods. This methodology has been broadly tested on national and international environments leading to excellent results on future climate models. In a collaboration project, this statistical downscaling technique was applied to predict future scenarios for the grape growing systems in Spain. The application of such model is very important to predict expected climate for the different growing crops, mainly for grape, where the success of different varieties are highly related to climate and soil. The model allowed the implementation of agricultural conservation practices in the crop production, detecting highly sensible areas to negative impacts produced by any modification of climate in the different regions, mainly those protected with protected designation of origin, and the definition of new production areas with optimal edaphoclimatic conditions for the different varieties.
Statistical Performance Analysis of an Ant-Colony Optimisation Application in S-Net
MacKenzie, K.; Hölzenspies, P.K.F.; Hammond, K.; Kirner, R.; Nguyen, V.T.N.; te Boekhorst, R.; Grelck, C.; Poss, R.; Verstraaten, M.; Grelck, C.; Hammond, K.; Scholz, S.B.
2013-01-01
We consider an ant-colony optimsation problem implemented on a multicore system as a collection of asynchronous streamprocessing components under the control of the S-NET coordination language. Statistical analysis and visualisation techniques are used to study the behaviour of the application, and
Statistical analysis and optimization of igbt manufacturing flow
Directory of Open Access Journals (Sweden)
Baranov V. V.
2015-02-01
Full Text Available The use of computer simulation, design and optimization of power electronic devices formation technological processes can significantly reduce development time, improve the accuracy of calculations, choose the best options for implementation based on strict mathematical analysis. One of the most common power electronic devices is isolated gate bipolar transistor (IGBT, which combines the advantages of MOSFET and bipolar transistor. The achievement of high requirements for these devices is only possible by optimizing device design and manufacturing process parameters. Therefore important and necessary step in the modern cycle of IC design and manufacturing is to carry out the statistical analysis. Procedure of the IGBT threshold voltage optimization was realized. Through screening experiments according to the Plackett-Burman design the most important input parameters (factors that have the greatest impact on the output characteristic was detected. The coefficients of the approximation polynomial adequately describing the relationship between the input parameters and investigated output characteristics ware determined. Using the calculated approximation polynomial, a series of multiple, in a cycle of Monte Carlo, calculations to determine the spread of threshold voltage values at selected ranges of input parameters deviation were carried out. Combinations of input process parameters values were determined randomly by a normal distribution within a given range of changes. The procedure of IGBT process parameters optimization consist a mathematical problem of determining the value range of the input significant structural and technological parameters providing the change of the IGBT threshold voltage in a given interval. The presented results demonstrate the effectiveness of the proposed optimization techniques.
Directory of Open Access Journals (Sweden)
Land Walker H
2011-01-01
Full Text Available Abstract Background When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear information content. Statistical learning (SL techniques with kernels are capable of addressing nonlinear problems without making parametric assumptions. However, these techniques do not produce findings relevant for epidemiologic interpretations. A simulated case-control study was used to contrast the information embedding characteristics and separation boundaries produced by a specific SL technique with logistic regression (LR modeling representing a parametric approach. The SL technique was comprised of a kernel mapping in combination with a perceptron neural network. Because the LR model has an important epidemiologic interpretation, the SL method was modified to produce the analogous interpretation and generate odds ratios for comparison. Results The SL approach is capable of generating odds ratios for main effects and risk factor interactions that better capture nonlinear relationships between exposure variables and outcome in comparison with LR. Conclusions The integration of SL methods in epidemiology may improve both the understanding and interpretation of complex exposure/disease relationships.
Complexity of software trustworthiness and its dynamical statistical analysis methods
Institute of Scientific and Technical Information of China (English)
ZHENG ZhiMing; MA ShiLong; LI Wei; JIANG Xin; WEI Wei; MA LiLi; TANG ShaoTing
2009-01-01
Developing trusted softwares has become an important trend and a natural choice in the development of software technology and applications.At present,the method of measurement and assessment of software trustworthiness cannot guarantee safe and reliable operations of software systems completely and effectively.Based on the dynamical system study,this paper interprets the characteristics of behaviors of software systems and the basic scientific problems of software trustworthiness complexity,analyzes the characteristics of complexity of software trustworthiness,and proposes to study the software trustworthiness measurement in terms of the complexity of software trustworthiness.Using the dynamical statistical analysis methods,the paper advances an invariant-measure based assessment method of software trustworthiness by statistical indices,and hereby provides a dynamical criterion for the untrustworthiness of software systems.By an example,the feasibility of the proposed dynamical statistical analysis method in software trustworthiness measurement is demonstrated using numerical simulations and theoretical analysis.
TV content analysis techniques and applications
Kompatsiaris, Yiannis
2012-01-01
The rapid advancement of digital multimedia technologies has not only revolutionized the production and distribution of audiovisual content, but also created the need to efficiently analyze TV programs to enable applications for content managers and consumers. Leaving no stone unturned, TV Content Analysis: Techniques and Applications provides a detailed exploration of TV program analysis techniques. Leading researchers and academics from around the world supply scientifically sound treatment of recent developments across the related subject areas--including systems, architectures, algorithms,
A statistical framework for differential network analysis from microarray data
Directory of Open Access Journals (Sweden)
Datta Somnath
2010-02-01
Full Text Available Abstract Background It has been long well known that genes do not act alone; rather groups of genes act in consort during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of following queries: (i whether the overall modular structures of the two networks are different, (ii whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score is provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the
Towards proper sampling and statistical analysis of defects
Directory of Open Access Journals (Sweden)
Cetin Ali
2014-06-01
Full Text Available Advancements in applied statistics with great relevance to defect sampling and analysis are presented. Three main issues are considered; (i proper handling of multiple defect types, (ii relating sample data originating from polished inspection surfaces (2D to finite material volumes (3D, and (iii application of advanced extreme value theory in statistical analysis of block maximum data. Original and rigorous, but practical mathematical solutions are presented. Finally, these methods are applied to make prediction regarding defect sizes in a steel alloy containing multiple defect types.
Adaptive strategy for the statistical analysis of connectomes.
Directory of Open Access Journals (Sweden)
Djalel Eddine Meskaldji
Full Text Available We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes. Our approach is at a middle level between a global analysis and single connections analysis by considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or by the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, that characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with the one based on the individual measures in terms of power. We show that this strategy has a great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with a 22q11.2 deletion syndrome, distinguished by their IQ scores.
Statistical analysis of lightning electric field measured under Malaysian condition
Salimi, Behnam; Mehranzamir, Kamyar; Abdul-Malek, Zulkurnain
2014-02-01
Lightning is an electrical discharge during thunderstorms that can be either within clouds (Inter-Cloud), or between clouds and ground (Cloud-Ground). The Lightning characteristics and their statistical information are the foundation for the design of lightning protection system as well as for the calculation of lightning radiated fields. Nowadays, there are various techniques to detect lightning signals and to determine various parameters produced by a lightning flash. Each technique provides its own claimed performances. In this paper, the characteristics of captured broadband electric fields generated by cloud-to-ground lightning discharges in South of Malaysia are analyzed. A total of 130 cloud-to-ground lightning flashes from 3 separate thunderstorm events (each event lasts for about 4-5 hours) were examined. Statistical analyses of the following signal parameters were presented: preliminary breakdown pulse train time duration, time interval between preliminary breakdowns and return stroke, multiplicity of stroke, and percentages of single stroke only. The BIL model is also introduced to characterize the lightning signature patterns. Observations on the statistical analyses show that about 79% of lightning signals fit well with the BIL model. The maximum and minimum of preliminary breakdown time duration of the observed lightning signals are 84 ms and 560 us, respectively. The findings of the statistical results show that 7.6% of the flashes were single stroke flashes, and the maximum number of strokes recorded was 14 multiple strokes per flash. A preliminary breakdown signature in more than 95% of the flashes can be identified.
Controlling intrinsic alignments in weak lensing statistics: The nulling and boosting techniques
Joachimi, B
2010-01-01
The intrinsic alignment of galaxies constitutes the major astrophysical source of systematic errors in surveys of weak gravitational lensing by the large-scale structure. We discuss the principles, summarise the implementation, and highlight the performance of two model-independent methods that control intrinsic alignment signals in weak lensing data: the nulling technique which eliminates intrinsic alignments to ensure unbiased constraints on cosmology, and the boosting technique which extracts intrinsic alignments and hence allows one to further study this contribution. Making only use of the characteristic dependence on redshift of the signals, both approaches are robust, but reduce the statistical power due to the similar redshift scaling of intrinsic alignment and lensing signals.
Performance of Statistical Temporal Downscaling Techniques of Wind Speed Data Over Aegean Sea
Gokhan Guler, Hasan; Baykal, Cuneyt; Ozyurt, Gulizar; Kisacik, Dogan
2016-04-01
Wind speed data is a key input for many meteorological and engineering applications. Many institutions provide wind speed data with temporal resolutions ranging from one hour to twenty four hours. Higher temporal resolution is generally required for some applications such as reliable wave hindcasting studies. One solution to generate wind data at high sampling frequencies is to use statistical downscaling techniques to interpolate values of the finer sampling intervals from the available data. In this study, the major aim is to assess temporal downscaling performance of nine statistical interpolation techniques by quantifying the inherent uncertainty due to selection of different techniques. For this purpose, hourly 10-m wind speed data taken from 227 data points over Aegean Sea between 1979 and 2010 having a spatial resolution of approximately 0.3 degrees are analyzed from the National Centers for Environmental Prediction (NCEP) The Climate Forecast System Reanalysis database. Additionally, hourly 10-m wind speed data of two in-situ measurement stations between June, 2014 and June, 2015 are considered to understand effect of dataset properties on the uncertainty generated by interpolation technique. In this study, nine statistical interpolation techniques are selected as w0 (left constant) interpolation, w6 (right constant) interpolation, averaging step function interpolation, linear interpolation, 1D Fast Fourier Transform interpolation, 2nd and 3rd degree Lagrange polynomial interpolation, cubic spline interpolation, piecewise cubic Hermite interpolating polynomials. Original data is down sampled to 6 hours (i.e. wind speeds at 0th, 6th, 12th and 18th hours of each day are selected), then 6 hourly data is temporally downscaled to hourly data (i.e. the wind speeds at each hour between the intervals are computed) using nine interpolation technique, and finally original data is compared with the temporally downscaled data. A penalty point system based on
Statistical Error analysis of Nucleon-Nucleon phenomenological potentials
Perez, R Navarro; Arriola, E Ruiz
2014-01-01
Nucleon-Nucleon potentials are commonplace in nuclear physics and are determined from a finite number of experimental data with limited precision sampling the scattering process. We study the statistical assumptions implicit in the standard least squares fitting procedure and apply, along with more conventional tests, a tail sensitive quantile-quantile test as a simple and confident tool to verify the normality of residuals. We show that the fulfilment of normality tests is linked to a judicious and consistent selection of a nucleon-nucleon database. These considerations prove crucial to a proper statistical error analysis and uncertainty propagation. We illustrate these issues by analyzing about 8000 proton-proton and neutron-proton scattering published data. This enables the construction of potentials meeting all statistical requirements necessary for statistical uncertainty estimates in nuclear structure calculations.
BaTMAn: Bayesian Technique for Multi-image Analysis
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
Bayesian Technique for Multi-image Analysis (BaTMAn) characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). The output segmentations successfully adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. BaTMAn identifies (and keeps) all the statistically-significant information contained in the input multi-image (e.g. an IFS datacube). The main aim of the algorithm is to characterize spatially-resolved data prior to their analysis.
Statistical techniques using NURE airborne geophysical data and NURE geochemical data
Campbell, Katherine
Some standard techniques in multivariate analysis are used to describe the relationships among remotely sensed observations (Landsat and airborne geophysical data) and between these variables and hydrogeochemical and stream sediment analyses. Gray-level pictures of such factors make the analytic results more accessible and easier to interpret.
Statistical Techniques Complement UML When Developing Domain Models of Complex Dynamical Biosystems
Timmis, Jon; Qwarnstrom, Eva E.
2016-01-01
Computational modelling and simulation is increasingly being used to complement traditional wet-lab techniques when investigating the mechanistic behaviours of complex biological systems. In order to ensure computational models are fit for purpose, it is essential that the abstracted view of biology captured in the computational model, is clearly and unambiguously defined within a conceptual model of the biological domain (a domain model), that acts to accurately represent the biological system and to document the functional requirements for the resultant computational model. We present a domain model of the IL-1 stimulated NF-κB signalling pathway, which unambiguously defines the spatial, temporal and stochastic requirements for our future computational model. Through the development of this model, we observe that, in isolation, UML is not sufficient for the purpose of creating a domain model, and that a number of descriptive and multivariate statistical techniques provide complementary perspectives, in particular when modelling the heterogeneity of dynamics at the single-cell level. We believe this approach of using UML to define the structure and interactions within a complex system, along with statistics to define the stochastic and dynamic nature of complex systems, is crucial for ensuring that conceptual models of complex dynamical biosystems, which are developed using UML, are fit for purpose, and unambiguously define the functional requirements for the resultant computational model. PMID:27571414
Statistical Techniques Complement UML When Developing Domain Models of Complex Dynamical Biosystems.
Williams, Richard A; Timmis, Jon; Qwarnstrom, Eva E
2016-01-01
Computational modelling and simulation is increasingly being used to complement traditional wet-lab techniques when investigating the mechanistic behaviours of complex biological systems. In order to ensure computational models are fit for purpose, it is essential that the abstracted view of biology captured in the computational model, is clearly and unambiguously defined within a conceptual model of the biological domain (a domain model), that acts to accurately represent the biological system and to document the functional requirements for the resultant computational model. We present a domain model of the IL-1 stimulated NF-κB signalling pathway, which unambiguously defines the spatial, temporal and stochastic requirements for our future computational model. Through the development of this model, we observe that, in isolation, UML is not sufficient for the purpose of creating a domain model, and that a number of descriptive and multivariate statistical techniques provide complementary perspectives, in particular when modelling the heterogeneity of dynamics at the single-cell level. We believe this approach of using UML to define the structure and interactions within a complex system, along with statistics to define the stochastic and dynamic nature of complex systems, is crucial for ensuring that conceptual models of complex dynamical biosystems, which are developed using UML, are fit for purpose, and unambiguously define the functional requirements for the resultant computational model.
INVERSE FILTERING TECHNIQUES IN SPEECH ANALYSIS
African Journals Online (AJOL)
Dr Obe
most convenient example, have been devised for obtaining waveforms related ... computer to speech analysis led to important elaborations of ... techniques of fast Fourier transformer (FFT) and. Analysis by ... the first three formants F1, F2, F3 to be made. Using the ... introduced and demonstrated to be a powerful tool for the ...
A novel statistic for genome-wide interaction analysis.
Directory of Open Access Journals (Sweden)
Xuesen Wu
2010-09-01
Full Text Available Although great progress in genome-wide association studies (GWAS has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked. The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
be conducted using the ¿ sup2/sup distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...
Advanced Statistical and Data Analysis Tools for Astrophysics
Kashyap, V.; Scargle, Jeffrey D. (Technical Monitor)
2001-01-01
The goal of the project is to obtain, derive, and develop statistical and data analysis tools that would be of use in the analyses of high-resolution, high-sensitivity data that are becoming available with new instruments. This is envisioned as a cross-disciplinary effort with a number of collaborators.
A New Statistic for Variable Selection in Questionnaire Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Jun-hua; FANG Wei-wu
2001-01-01
In this paper, a new statistic is proposed for variable selection which is one of the important problems in analysis of questionnaire data. Contrasting to other methods, the approach introduced here can be used not only for two groups of samples but can also be easily generalized to the multi-group case.
Measures of radioactivity: a tool for understanding statistical data analysis
Montalbano, Vera
2012-01-01
A learning path on radioactivity in the last class of high school is presented. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources are monitored for days or weeks. The data are analyzed in order to understand the importance of statistical analysis in modern physics.
Statistical Analysis of Hypercalcaemia Data related to Transferability
DEFF Research Database (Denmark)
Frølich, Anne; Nielsen, Bo Friis
2005-01-01
In this report we describe statistical analysis related to a study of hypercalcaemia carried out in the Copenhagen area in the ten year period from 1984 to 1994. Results from the study have previously been publised in a number of papers [3, 4, 5, 6, 7, 8, 9] and in various abstracts and posters...
AstroStat - A VO Tool for Statistical Analysis
Kembhavi, Ajit K; Kale, Tejas; Jagade, Santosh; Vibhute, Ajay; Garg, Prerak; Vaghmare, Kaustubh; Navelkar, Sharmad; Agrawal, Tushar; Nandrekar, Deoyani; Shaikh, Mohasin
2015-01-01
AstroStat is an easy-to-use tool for performing statistical analysis on data. It has been designed to be compatible with Virtual Observatory (VO) standards thus enabling it to become an integral part of the currently available collection of VO tools. A user can load data in a variety of formats into AstroStat and perform various statistical tests using a menu driven interface. Behind the scenes, all analysis is done using the public domain statistical software - R and the output returned is presented in a neatly formatted form to the user. The analyses performable include exploratory tests, visualizations, distribution fitting, correlation & causation, hypothesis testing, multivariate analysis and clustering. The tool is available in two versions with identical interface and features - as a web service that can be run using any standard browser and as an offline application. AstroStat will provide an easy-to-use interface which can allow for both fetching data and performing power statistical analysis on ...
Introduction to Statistics and Data Analysis With Computer Applications I.
Morris, Carl; Rolph, John
This document consists of unrevised lecture notes for the first half of a 20-week in-house graduate course at Rand Corporation. The chapter headings are: (1) Histograms and descriptive statistics; (2) Measures of dispersion, distance and goodness of fit; (3) Using JOSS for data analysis; (4) Binomial distribution and normal approximation; (5)…
The Statistical Package for the Social Sciences (SPSS) as an adjunct to pharmacokinetic analysis.
Mather, L E; Austin, K L
1983-01-01
Computer techniques for numerical analysis are well known to pharmacokineticists. Powerful techniques for data file management have been developed by social scientists but have, in general, been ignored by pharmacokineticists because of their apparent lack of ability to interface with pharmacokinetic programs. Extensive use has been made of the Statistical Package for the Social Sciences (SPSS) for its data handling capabilities, but at the same time, techniques have been developed within SPSS to interface with pharmacokinetic programs of the users' choice and to carry out a variety of user-defined pharmacokinetic tasks within SPSS commands, apart from the expected variety of statistical tasks. Because it is based on a ubiquitous package, this methodology has all of the benefits of excellent documentation, interchangeability between different types and sizes of machines and true portability of techniques and data files. An example is given of the total management of a pharmacokinetic study previously reported in the literature by the authors.
Investigation of Weibull statistics in fracture analysis of cast aluminum
Holland, F. A., Jr.; Zaretsky, E. V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodolgy based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
Investigation of Weibull statistics in fracture analysis of cast aluminum
Holland, F. A., Jr.; Zaretsky, E. V.
1989-01-01
The fracture strengths of two large batches of A357-T6 cast aluminum coupon specimens were compared by using two-parameter Weibull analysis. The minimum number of these specimens necessary to find the fracture strength of the material was determined. The applicability of three-parameter Weibull analysis was also investigated. A design methodolgy based on the combination of elementary stress analysis and Weibull statistical analysis is advanced and applied to the design of a spherical pressure vessel shell. The results from this design methodology are compared with results from the applicable ASME pressure vessel code.
Muhammad, Said; Tahir Shah, M; Khan, Sardar
2010-10-01
The present study was conducted in Kohistan region, where mafic and ultramafic rocks (Kohistan island arc and Indus suture zone) and metasedimentary rocks (Indian plate) are exposed. Water samples were collected from the springs, streams and Indus river and analyzed for physical parameters, anions, cations and arsenic (As(3+), As(5+) and arsenic total). The water quality in Kohistan region was evaluated by comparing the physio-chemical parameters with permissible limits set by Pakistan environmental protection agency and world health organization. Most of the studied parameters were found within their respective permissible limits. However in some samples, the iron and arsenic concentrations exceeded their permissible limits. For health risk assessment of arsenic, the average daily dose, hazards quotient (HQ) and cancer risk were calculated by using statistical formulas. The values of HQ were found >1 in the samples collected from Jabba, Dubair, while HQ values were samples. This level of contamination should have low chronic risk and medium cancer risk when compared with US EPA guidelines. Furthermore, the inter-dependence of physio-chemical parameters and pollution load was also calculated by using multivariate statistical techniques like one-way ANOVA, correlation analysis, regression analysis, cluster analysis and principle component analysis.
McLoughlin, M. Padraig M. M.
2008-01-01
The author of this paper submits the thesis that learning requires doing; only through inquiry is learning achieved, and hence this paper proposes a programme of use of a modified Moore method in a Probability and Mathematical Statistics (PAMS) course sequence to teach students PAMS. Furthermore, the author of this paper opines that set theory…
HistFitter software framework for statistical data analysis
Baak, M.; Besjes, G. J.; Côté, D.; Koutsman, A.; Lorenz, J.; Short, D.
2015-04-01
We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fit to data and interpreted with statistical tests. Internally HistFitter uses the statistics packages RooStats and HistFactory. A key innovation of HistFitter is its design, which is rooted in analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple models at once that describe the data, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication quality style through a simple command-line interface.
Institute of Scientific and Technical Information of China (English)
郑力燕; 于宏兵; 王启山
2016-01-01
Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA) and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012−2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most of physicochemical parameters varied significantly among the sampling sites. Three significant groups, highly polluted (HP), moderately polluted (MP) and less polluted (LP), of sampling sites were obtained through Hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD and VPhs were the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40%reduction). PCA/FA resulted in three, three and four latent factors explaining 70%, 62%and 71%of the total variance in water quality data sets of HP, MP and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources:domestic sewage, livestock operations and agricultural activities) and natural processes (seasonal effect, and natural inputs). PCA/FA in the whole basin showed the best results for data reduction because it used only two parameters (about 80%reduction) as the most important parameters to explain 72%of the data variation. Thus, this work illustrated the utility of multivariate statistical techniques for analysis and interpretation of datasets and, in water quality assessment, identification of pollution sources/factors and understanding spatial variations in water quality for effective stream water quality management.
Common pitfalls in statistical analysis: Linear regression analysis.
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
Common pitfalls in statistical analysis: Linear regression analysis
Directory of Open Access Journals (Sweden)
Rakesh Aggarwal
2017-01-01
Full Text Available In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.
Statistical analysis of absorptive laser damage in dielectric thin films
Energy Technology Data Exchange (ETDEWEB)
Budgor, A.B.; Luria-Budgor, K.F.
1978-09-11
The Weibull distribution arises as an example of the theory of extreme events. It is commonly used to fit statistical data arising in the failure analysis of electrical components and in DC breakdown of materials. This distribution is employed to analyze time-to-damage and intensity-to-damage statistics obtained when irradiating thin film coated samples of SiO/sub 2/, ZrO/sub 2/, and Al/sub 2/O/sub 3/ with tightly focused laser beams. The data used is furnished by Milam. The fit to the data is excellent; and least squared correlation coefficients greater than 0.9 are often obtained.
Comparison of Statistical Models for Regional Crop Trial Analysis
Institute of Scientific and Technical Information of China (English)
ZHANG Qun-yuan; KONG Fan-ling
2002-01-01
Based on the review and comparison of main statistical analysis models for estimating varietyenvironment cell means in regional crop trials, a new statistical model, LR-PCA composite model was proposed, and the predictive precision of these models were compared by cross validation of an example data. Results showed that the order of model precision was LR-PCA model ＞ AMMI model ＞ PCA model ＞ Treatment Means (TM) model ＞ Linear Regression (LR) model ＞ Additive Main Effects ANOVA model. The precision gain factor of LR-PCA model was 1.55, increasing by 8.4% compared with AMMI.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Banerjee, Anirban
2016-01-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
[Some basic aspects in statistical analysis of visual acuity data].
Ren, Ze-Qin
2007-06-01
All visual acuity charts used currently have their own shortcomings. Therefore, it is difficult for ophthalmologists to evaluate visual acuity data. Many problems present in the use of statistical methods for handling visual acuity data in clinical research. The quantitative relationship between visual acuity and visual angle varied in different visual acuity charts. The type of visual acuity and visual angle are different from each other. Therefore, different statistical methods should be used for different data sources. A correct understanding and analysis of visual acuity data could be obtained only after the elucidation of these aspects.
Network similarity and statistical analysis of earthquake seismic data
Deyasi, Krishanu; Chakraborty, Abhijit; Banerjee, Anirban
2017-09-01
We study the structural similarity of earthquake networks constructed from seismic catalogs of different geographical regions. A hierarchical clustering of underlying undirected earthquake networks is shown using Jensen-Shannon divergence in graph spectra. The directed nature of links indicates that each earthquake network is strongly connected, which motivates us to study the directed version statistically. Our statistical analysis of each earthquake region identifies the hub regions. We calculate the conditional probability of the forthcoming occurrences of earthquakes in each region. The conditional probability of each event has been compared with their stationary distribution.
Statistical analysis and interpolation of compositional data in materials science.
Pesenson, Misha Z; Suram, Santosh K; Gregoire, John M
2015-02-01
Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
Feature-Based Statistical Analysis of Combustion Simulation Data
Energy Technology Data Exchange (ETDEWEB)
Bennett, J; Krishnamoorthy, V; Liu, S; Grout, R; Hawkes, E; Chen, J; Pascucci, V; Bremer, P T
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information, and hence, fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion
A COMPARATIVE STATISTICAL ANALYSIS OF RICE CULTIVARS DATA
Directory of Open Access Journals (Sweden)
Mugemangango Cyprien
2012-12-01
Full Text Available In this paper, rice cultivars data have been analysed by three different statisticaltechniques viz. Split-plot analysis in RBD, two-factor factorial analysis in RBD and analysis oftwo-way classified data with several observations per cell. The powers of the tests under differentmethods of analysis have been calculated. The method of two-way classified data with severalobservations per cell is found better followed by two-factor factorial technique in RBD and splitplot analysis for analyzing the given data.
Visual and statistical analysis of {sup 18}F-FDG PET in primary progressive aphasia
Energy Technology Data Exchange (ETDEWEB)
Matias-Guiu, Jordi A.; Moreno-Ramos, Teresa; Garcia-Ramos, Rocio; Fernandez-Matarrubia, Marta; Oreja-Guevara, Celia; Matias-Guiu, Jorge [Hospital Clinico San Carlos, Department of Neurology, Madrid (Spain); Cabrera-Martin, Maria Nieves; Perez-Castejon, Maria Jesus; Rodriguez-Rey, Cristina; Ortega-Candil, Aida; Carreras, Jose Luis [San Carlos Health Research Institute (IdISSC) Complutense University of Madrid, Department of Nuclear Medicine, Hospital Clinico San Carlos, Madrid (Spain)
2015-05-01
Diagnosing progressive primary aphasia (PPA) and its variants is of great clinical importance, and fluorodeoxyglucose (FDG) positron emission tomography (PET) may be a useful diagnostic technique. The purpose of this study was to evaluate interobserver variability in the interpretation of FDG PET images in PPA as well as the diagnostic sensitivity and specificity of the technique. We also aimed to compare visual and statistical analyses of these images. There were 10 raters who analysed 44 FDG PET scans from 33 PPA patients and 11 controls. Five raters analysed the images visually, while the other five used maps created using Statistical Parametric Mapping software. Two spatial normalization procedures were performed: global mean normalization and cerebellar normalization. Clinical diagnosis was considered the gold standard. Inter-rater concordance was moderate for visual analysis (Fleiss' kappa 0.568) and substantial for statistical analysis (kappa 0.756-0.881). Agreement was good for all three variants of PPA except for the nonfluent/agrammatic variant studied with visual analysis. The sensitivity and specificity of each rater's diagnosis of PPA was high, averaging 87.8 and 89.9 % for visual analysis and 96.9 and 90.9 % for statistical analysis using global mean normalization, respectively. In cerebellar normalization, sensitivity was 88.9 % and specificity 100 %. FDG PET demonstrated high diagnostic accuracy for the diagnosis of PPA and its variants. Inter-rater concordance was higher for statistical analysis, especially for the nonfluent/agrammatic variant. These data support the use of FDG PET to evaluate patients with PPA and show that statistical analysis methods are particularly useful for identifying the nonfluent/agrammatic variant of PPA. (orig.)
Directory of Open Access Journals (Sweden)
Petrus H.A.J.M van Gelder
2013-05-01
Full Text Available The dam-break induced loads and their effects on buildings are of vital importance for assessing the vulnerability of buildings in flood-prone areas. A comprehensive methodology, for risk assessment of buildings subject to flooding, is nevertheless still missing. This research aims to take a step forward by following previous research. To this aim, (1 five statistical procedures including: simple correlation analysis, multiple linear regression model, stepwise multiple linear regression model, principal component analysis and cluster analysis are used to study relationship between mean normalized force on structure and other related variables; (2 a new and efficient variable that can take into account both the shape of the structure and flow conditions is proposed; (3 a new and practical formula for predicting the mean normalized force is suggested for different types of obstacles, which is missing in the previous research.
STATISTICAL ANALYSYS OF THE SCFE OF A BRAZILAN MINERAL COAL
Directory of Open Access Journals (Sweden)
DARIVA Cláudio
1997-01-01
Full Text Available The influence of some process variables on the productivity of the fractions (liquid yield times fraction percent obtained from SCFE of a Brazilian mineral coal using isopropanol and ethanol as primary solvents is analyzed using statistical techniques. A full factorial 23 experimental design was adopted to investigate the effects of process variables (temperature, pressure and cosolvent concentration on the extraction products. The extracts were analyzed by the Preparative Liquid Chromatography-8 fractions method (PLC-8, a reliable, non destructive solvent fractionation method, especially developed for coal-derived liquids. Empirical statistical modeling was carried out in order to reproduce the experimental data. Correlations obtained were always greater than 0.98. Four specific process criteria were used to allow process optimization. Results obtained show that it is not possible to maximize both extract productivity and purity (through the minimization of heavy fraction content simultaneously by manipulating the mentioned process variables.
Applications of electrochemical techniques in mineral analysis.
Niu, Yusheng; Sun, Fengyue; Xu, Yuanhong; Cong, Zhichao; Wang, Erkang
2014-09-01
This review, covering reports published in recent decade from 2004 to 2013, shows how electrochemical (EC) techniques such as voltammetry, electrochemical impedance spectroscopy, potentiometry, coulometry, etc., have made significant contributions in the analysis of minerals such as clay, sulfide, oxide, and oxysalt. It was discussed based on the classifications of both the types of the used EC techniques and kinds of the analyzed minerals. Furthermore, minerals as electrode modification materials for EC analysis have also been summarized. Accordingly, research vacancies and future development trends in these areas are discussed.
Wang, Yi; Ma, Xiang; Wen, Ya-Dong; Zou, Quan; Wang, Jun; Tu, Jia-Run; Cai, Wen-Sheng; Shao, Xue-Guang
2013-05-01
Near infrared diffusive reflectance spectroscopy has been applied in on-site or on-line analysis due to its characteristics of fastness, non-destruction and the feasibility for real complex sample analysis. The present work reported a real-time monitoring method for industrial production by using near infrared spectroscopic technique and multivariate statistical process analysis. In the method, the real-time near infrared spectra of the materials are collected on the production line, and then the evaluation of the production process can be achieved by a statistic Hotelling T2 calculated with the established model. In this work, principal component analysis (PCA) is adopted for building the model, and the statistic is calculated by projecting the real-time spectra onto the PCA model. With an application of the method in a practical production, it was demonstrated that a real-time evaluation of the variations in the production can be realized by investigating the changes in the statistic, and the comparison of the products in different batches can be achieved by further statistics of the statistic. Therefore, the proposed method may provide a practical way for quality insurance of production processes.
Statistical Analysis of SAR Sea Clutter for Classification Purposes
Directory of Open Access Journals (Sweden)
Jaime Martín-de-Nicolás
2014-09-01
Full Text Available Statistical analysis of radar clutter has always been one of the topics, where more effort has been put in the last few decades. These studies were usually focused on finding the statistical models that better fitted the clutter distribution; however, the goal of this work is not the modeling of the clutter, but the study of the suitability of the statistical parameters to carry out a sea state classification. In order to achieve this objective and provide some relevance to this study, an important set of maritime and coastal Synthetic Aperture Radar data is considered. Due to the nature of the acquisition of data by SAR sensors, speckle noise is inherent to these data, and a specific study of how this noise affects the clutter distribution is also performed in this work. In pursuit of a sense of wholeness, a thorough study of the most suitable statistical parameters, as well as the most adequate classifier is carried out, achieving excellent results in terms of classification success rates. These concluding results confirm that a sea state classification is not only viable, but also successful using statistical parameters different from those of the best modeling distribution and applying a speckle filter, which allows a better characterization of the parameters used to distinguish between different sea states.
Statistical analysis of the precision of the Match method
Directory of Open Access Journals (Sweden)
R. Lehmann
2005-05-01
Full Text Available The Match method quantifies chemical ozone loss in the polar stratosphere. The basic idea consists in calculating the forward trajectory of an air parcel that has been probed by an ozone measurement (e.g., by an ozone sonde or satellite and finding a second ozone measurement close to this trajectory. Such an event is called a ''match''. A rate of chemical ozone destruction can be obtained by a statistical analysis of several tens of such match events. Information on the uncertainty of the calculated rate can be inferred from the scatter of the ozone mixing ratio difference (second measurement minus first measurement associated with individual matches. A standard analysis would assume that the errors of these differences are statistically independent. However, this assumption may be violated because different matches can share a common ozone measurement, so that the errors associated with these match events become statistically dependent. Taking this effect into account, we present an analysis of the uncertainty of the final Match result. It has been applied to Match data from the Arctic winters 1995, 1996, 2000, and 2003. For these ozone-sonde Match studies the effect of the error correlation on the uncertainty estimates is rather small: compared to a standard error analysis, the uncertainty estimates increase by 15% on average. However, the effect is more pronounced for typical satellite Match analyses: for an Antarctic satellite Match study (2003, the uncertainty estimates increase by 60% on average.
Kwon, Heejin; Cho, Jinhan; Oh, Jongyeong; Kim, Dongwon; Cho, Junghyun; Kim, Sanghyun; Lee, Sangyun; Lee, Jihyun
2015-10-01
To investigate whether reduced radiation dose abdominal CT images reconstructed with adaptive statistical iterative reconstruction V (ASIR-V) compromise the depiction of clinically competent features when compared with the currently used routine radiation dose CT images reconstructed with ASIR. 27 consecutive patients (mean body mass index: 23.55 kg m(-2) underwent CT of the abdomen at two time points. At the first time point, abdominal CT was scanned at 21.45 noise index levels of automatic current modulation at 120 kV. Images were reconstructed with 40% ASIR, the routine protocol of Dong-A University Hospital. At the second time point, follow-up scans were performed at 30 noise index levels. Images were reconstructed with filtered back projection (FBP), 40% ASIR, 30% ASIR-V, 50% ASIR-V and 70% ASIR-V for the reduced radiation dose. Both quantitative and qualitative analyses of image quality were conducted. The CT dose index was also recorded. At the follow-up study, the mean dose reduction relative to the currently used common radiation dose was 35.37% (range: 19-49%). The overall subjective image quality and diagnostic acceptability of the 50% ASIR-V scores at the reduced radiation dose were nearly identical to those recorded when using the initial routine-dose CT with 40% ASIR. Subjective ratings of the qualitative analysis revealed that of all reduced radiation dose CT series reconstructed, 30% ASIR-V and 50% ASIR-V were associated with higher image quality with lower noise and artefacts as well as good sharpness when compared with 40% ASIR and FBP. However, the sharpness score at 70% ASIR-V was considered to be worse than that at 40% ASIR. Objective image noise for 50% ASIR-V was 34.24% and 46.34% which was lower than 40% ASIR and FBP. Abdominal CT images reconstructed with ASIR-V facilitate radiation dose reductions of to 35% when compared with the ASIR. This study represents the first clinical research experiment to use ASIR-V, the newest version of
JAWS data collection, analysis highlights, and microburst statistics
Mccarthy, J.; Roberts, R.; Schreiber, W.
1983-01-01
Organization, equipment, and the current status of the Joint Airport Weather Studies project initiated in relation to the microburst phenomenon are summarized. Some data collection techniques and preliminary statistics on microburst events recorded by Doppler radar are discussed as well. Radar studies show that microbursts occur much more often than expected, with majority of the events being potentially dangerous to landing or departing aircraft. Seventy events were registered, with the differential velocities ranging from 10 to 48 m/s; headwind/tailwind velocity differentials over 20 m/s are considered seriously hazardous. It is noted that a correlation is yet to be established between the velocity differential and incoherent radar reflectivity.
Baek, Seung Yeb; Bae, Dong Ho
2011-02-01
Gas welding is a very important and useful technology in the fabrication of railroad cars and commercial vehicle structures. However, since the fatigue strength of gas-welded joints is considerably lower than that of the base of material due to stress concentration at the weld, the fatigue strength assessment of gas-welded joints is very important for the reliability and durability of railroad cars and establishment of criteria for long-life fatigue design. In this study, after evaluating the fatigue strength using a simulated specimen that satisfies not only the structural characteristics but also the mechanical condition of the actual structure, the fatigue design criteria are determined and applied to the fatigue design of the gas welded body structure. To save time and cost for the fatigue design, we investigated an accelerated life-prediction using a probabilistic statistics technique based on the theory of statistical reliability. The (Δσ a )R-Nf relationship was obtained from actual fatigue test data, including welding residual stress. On the basis of these results, the (Δσa)R-(Nf)ALP relationship that was derived from statistical probability analysis was compared with the actual fatigue test data. Therefore, it is expected that the accelerated life prediction will provide a useful method of determining the criteria for fatigue design and predicting a specific target life.
New Statistical Techniques in the Measurement of the inclusive Top Pair Production Cross Section
Franc, Jiří; Štěpánek, Michal; Kůs, Václav
2014-01-01
We present several different types of multivariate statistical techniques used in the measurement of the inclusive top pair production cross section in $p \\bar{p}$-collisions at $\\sqrt{s} = 1.96 \\text{TeV}$ employing the full RunII data ($9.7\\textrm{fb}^{-1}$) collected with the D0 detector at the Fermilab Tevatron Collider. We consider the final state of the top quark pair decays containing one electron or muon and at least two jets. We proceed various statistical homogeneity tests such as Anderson - Darling, Kolmogorov - Smirnov, and $\\varphi$-divergences tests to determine, which variables have good data-MC agreement, as well as a good separation power. We adjusted all tests for using weighted empirical distribution functions. Further we separate $t\\bar{t}$ signal from the background by the application of Generalized Linear Models, Gaussian Mixture Models), Neural Networks with Switching Units and confront them with familiar methods from ROOT TMVA package such as Boosted Decision Trees, and Multi-layer Per...
Publishing nutrition research: a review of multivariate techniques--part 2: analysis of variance.
Harris, Jeffrey E; Sheean, Patricia M; Gleason, Philip M; Bruemmer, Barbara; Boushey, Carol
2012-01-01
This article is the eighth in a series exploring the importance of research design, statistical analysis, and epidemiology in nutrition and dietetics research, and the second in a series focused on multivariate statistical analytical techniques. The purpose of this review is to examine the statistical technique, analysis of variance (ANOVA), from its simplest to multivariate applications. Many dietetics practitioners are familiar with basic ANOVA, but less informed of the multivariate applications such as multiway ANOVA, repeated-measures ANOVA, analysis of covariance, multiple ANOVA, and multiple analysis of covariance. The article addresses all these applications and includes hypothetical and real examples from the field of dietetics.
Review and classification of variability analysis techniques with clinical applications.
Bravi, Andrea; Longtin, André; Seely, Andrew J E
2011-10-10
Analysis of patterns of variation of time-series, termed variability analysis, represents a rapidly evolving discipline with increasing applications in different fields of science. In medicine and in particular critical care, efforts have focussed on evaluating the clinical utility of variability. However, the growth and complexity of techniques applicable to this field have made interpretation and understanding of variability more challenging. Our objective is to provide an updated review of variability analysis techniques suitable for clinical applications. We review more than 70 variability techniques, providing for each technique a brief description of the underlying theory and assumptions, together with a summary of clinical applications. We propose a revised classification for the domains of variability techniques, which include statistical, geometric, energetic, informational, and invariant. We discuss the process of calculation, often necessitating a mathematical transform of the time-series. Our aims are to summarize a broad literature, promote a shared vocabulary that would improve the exchange of ideas, and the analyses of the results between different studies. We conclude with challenges for the evolving science of variability analysis.
Statistical analysis of highly correlated systems in biology and physics
Martin, Hector Garcia
In this dissertation, I present my work on the statistical study of highly correlated systems in three fields of science: ecology, microbial ecology and physics. I propose an explanation for how the highly correlated distribution of species individuals, and an abundance distribution commonly observed in ecological systems, give rise to a power law dependence between a given area and the number of unique species it harbors. This is one of the oldest known ecological patterns: the power-law Species Area Rule. As a natural extension of my studies in ecology, I have undertaken both theoretical research and field work in the developing field of microbial ecology. In particular, I participated in a multidisciplinary study of the impact of microbes on the formation of macroscopic calcium carbonate terraces at Yellowstone National Park Hot Springs. I have used ecological techniques to characterize the biodiversity of our study site and developed a new bootstrap method for extracting abundance information from clone libraries. This has singled out the most abundant microorganisms and paved the way for future studies of the possible non-passive role of microorganisms in carbonate precipitation. The third part of my thesis uses statistical techniques to explore the correlations in rotating Bose-Einstein condensates. I have used finite difference techniques to solve the Gross-Pitaevskii equation in order to obtain the structure of a vortex in a lattice. Surprisingly, I have found that, in order to understand this structure, it is necessary to add a correction to the Gross-Pitaevskii equation which introduces a dependence on the particle scattering length. I have also used Path Integral Monte Carlo techniques to explore the limit of rapid rotations, where the Gross-Pitaevskii equation is no longer valid. Interestingly, the Gross-Pitaevskii equation seems to be valid for much higher densities than expected if properly renormalized. I show that, in accord with the prediction of
Towards Advanced Data Analysis by Combining Soft Computing and Statistics
Gil, María; Sousa, João; Verleysen, Michel
2013-01-01
Soft computing, as an engineering science, and statistics, as a classical branch of mathematics, emphasize different aspects of data analysis. Soft computing focuses on obtaining working solutions quickly, accepting approximations and unconventional approaches. Its strength lies in its flexibility to create models that suit the needs arising in applications. In addition, it emphasizes the need for intuitive and interpretable models, which are tolerant to imprecision and uncertainty. Statistics is more rigorous and focuses on establishing objective conclusions based on experimental data by analyzing the possible situations and their (relative) likelihood. It emphasizes the need for mathematical methods and tools to assess solutions and guarantee performance. Combining the two fields enhances the robustness and generalizability of data analysis methods, while preserving the flexibility to solve real-world problems efficiently and intuitively.
Statistical Analysis of Ship Collisions with Bridges in China Waterway
Institute of Scientific and Technical Information of China (English)
DAI Tong-yu; NIE Wu; LIU Ying-jie; WANG Li-ping
2002-01-01
Having carried out investigations on ship collision accidents with bridges in waterway in China, a database of ship collision with bridge (SCB) is developed in this paper. It includes detailed information about more than 200 accidents near ship's waterways in the last four decades, in which ships collided with the bridges. Based on the information a statistical analysis is presented tentatively. The increase in frequency of ship collision with bridges appears, and the accident quantity of the barge system is more than that of single ship. The main reason of all the factors for ship collision with bridge is the human errors, which takes up 70%. The quantity of the accidents happened during flooding period shows over 3～6 times compared with the period from March to June in a year. The probability follows the normal distribution according to statistical analysis. Visibility, span between piers also have an effect on the frequency of the accidents.
Collagen morphology and texture analysis: from statistics to classification
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-07-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage.
Statistics in experimental design, preprocessing, and analysis of proteomics data.
Jung, Klaus
2011-01-01
High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), yield usually high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to receive tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
Directory of Open Access Journals (Sweden)
Z. Pasandidehfard
2014-03-01
Full Text Available Nonpoint source (NPS pollution is a major surface water contaminant commonly caused by agricultural runoff. The purpose of this study was to assess seasonal variation in water quality parameters in Gorganrood watershed (Golestan Province, Iran. It also tried to clarify the effects of agricultural practices and NPS pollution on them. Water quality parameters including potassium, sodium, pH, water flow rate, total dissolved solids (TDS, electrical conductivity (EC, hardness, sulfate, bicarbonate, chlorine, magnesium, and calcium ions during 1966-2010 were evaluated using multivariate statistical techniques. Multivariate analysis of variance (MANOVA was implemented to determine the significance of differences between mean seasonal values. Discriminant analysis (DA was also carried out to identify correlations between seasons and the water quality parameters. Parameters of water quality index were measured through principal component analysis (PCA and factor analysis (FA. Based on the results of statistical tests, climate (freezing, weathering and rainfall and human activities such as agriculture had crucial effects on water quality. The most important parameters in differentiation between seasons in descending order were potassium, pH, carbonic acid, calcium, and magnesium. According to load factor analysis, chlorine, calcium, and potassium were the most important parameters in spring and summer, indicating the application of fertilizers (especially potassium chloride fertilizer and existence of NPS pollution during these seasons. In the next stage, the months during which crops had excessive water requirements were detected using CROPWAT software. Almost all water requirements of the area’s major crops, i.e. cotton, rice, soya, wheat, and oat, happen in the late spring until mid/late summer. According to our findings, agricultural practices had a great impact on water pollution. Results of analysis with CROPWAT software also confirmed this
Multivariate Analysis Techniques for Optimal Vision System Design
DEFF Research Database (Denmark)
Sharifzadeh, Sara
used in this thesis are described. The methodological strategies are outlined including sparse regression and pre-processing based on feature selection and extraction methods, supervised versus unsupervised analysis and linear versus non-linear approaches. One supervised feature selection algorithm......The present thesis considers optimization of the spectral vision systems used for quality inspection of food items. The relationship between food quality, vision based techniques and spectral signature are described. The vision instruments for food analysis as well as datasets of the food items...... (SSPCA) and DCT based characterization of the spectral diffused reflectance images for wavelength selection and discrimination. These methods together with some other state-of-the-art statistical and mathematical analysis techniques are applied on datasets of different food items; meat, diaries, fruits...
Statistical Analysis of the Exchange Rate of Bitcoin.
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Lifetime statistics of quantum chaos studied by a multiscale analysis
Di Falco, A.
2012-04-30
In a series of pump and probe experiments, we study the lifetime statistics of a quantum chaotic resonator when the number of open channels is greater than one. Our design embeds a stadium billiard into a two dimensional photonic crystal realized on a silicon-on-insulator substrate. We calculate resonances through a multiscale procedure that combines energy landscape analysis and wavelet transforms. Experimental data is found to follow the universal predictions arising from random matrix theory with an excellent level of agreement.
Statistical analysis on reliability and serviceability of caterpillar tractor
Institute of Scientific and Technical Information of China (English)
WANG Jinwu; LIU Jiafu; XU Zhongxiang
2007-01-01
For further understanding reliability and serviceability of tractor and to furnish scientific and technical theories, based on the promotion and application of it, the following experiments and statistical analysis on reliability (reliability and MTBF) serviceability (service and MTTR) of Donfanghong-1002 and Dongfanghong-802 were conducted. The result showed that the intervals of average troubles of these two tractors were 182.62 h and 160.2 h, respectively, and the weakest assembly of them was engine part.
Common pitfalls in statistical analysis: Odds versus risk
Ranganathan, Priya; Aggarwal, Rakesh; Pramesh, C. S.
2015-01-01
In biomedical research, we are often interested in quantifying the relationship between an exposure and an outcome. “Odds” and “Risk” are the most common terms which are used as measures of association between variables. In this article, which is the fourth in the series of common pitfalls in statistical analysis, we explain the meaning of risk and odds and the difference between the two. PMID:26623395
Statistical Analysis of the Exchange Rate of Bitcoin
Chu, Jeffrey; Nadarajah, Saralees; Chan, Stephen
2015-01-01
Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate. PMID:26222702
Statistical Analysis of the Exchange Rate of Bitcoin.
Directory of Open Access Journals (Sweden)
Jeffrey Chu
Full Text Available Bitcoin, the first electronic payment system, is becoming a popular currency. We provide a statistical analysis of the log-returns of the exchange rate of Bitcoin versus the United States Dollar. Fifteen of the most popular parametric distributions in finance are fitted to the log-returns. The generalized hyperbolic distribution is shown to give the best fit. Predictions are given for future values of the exchange rate.
Statistically-constrained shallow text marking: techniques, evaluation paradigm and results
Murphy, Brian; Vogel, Carl
2007-02-01
We present three natural language marking strategies based on fast and reliable shallow parsing techniques, and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources, and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus based felicity measures and perceived quality, and makes qualified predictions. Grammatical acceptability correlates with our automatic measure strongly (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of variability in human judgements. A moderate but statistically insignificant (Pearson's r = 0.422, p = 0.356) correlation is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended.
Al-Kindi, Khalifa M; Kwan, Paul; R Andrew, Nigel; Welch, Mitchell
2017-01-01
In order to understand the distribution and prevalence of Ommatissus lybicus (Hemiptera: Tropiduchidae) as well as analyse their current biographical patterns and predict their future spread, comprehensive and detailed information on the environmental, climatic, and agricultural practices are essential. The spatial analytical techniques such as Remote Sensing and Spatial Statistics Tools, can help detect and model spatial links and correlations between the presence, absence and density of O. lybicus in response to climatic, environmental, and human factors. The main objective of this paper is to review remote sensing and relevant analytical techniques that can be applied in mapping and modelling the habitat and population density of O. lybicus. An exhaustive search of related literature revealed that there are very limited studies linking location-based infestation levels of pests like the O. lybicus with climatic, environmental, and human practice related variables. This review also highlights the accumulated knowledge and addresses the gaps in this area of research. Furthermore, it makes recommendations for future studies, and gives suggestions on monitoring and surveillance methods in designing both local and regional level integrated pest management strategies of palm tree and other affected cultivated crops.
Alternative Analysis Techniques for Needs and Needs Documentation Techniques,
1980-06-20
Have you previously participated in a brainwriting session! a. Yes b. No 9. Have you previously participated in the Nominal Group Technique process...brainstorming technique for future sessions. Strongly I Strongly disagree !I agree 8. It was easy to present my views using the brainwriting technique...Strongly!i Strongly disagree I , agree * 9. I was satisfied with the brainwriting technique. Strongly i Strongly disagree __ agree . 10. I recommend using
Sensory cut-off point obtained from survival analysis statistics
Garitta, Lorena; Langohr, Klaus; Gómez Melis, Guadalupe; Hough, Guillermo; Beeren, Cindy
2015-01-01
In the present work we applied interval-censored survival analysis techniques to estimate sensory cut-off points based on consumer’s decision to accept or reject food products taking into account the inherent variability in sensory measurements. We compared the values obtained using this survival analysis methodology with those obtained by applying a previous regression based method. Cut-off point (COP) estimations were made for acid flavor in yogurt, strawberry flavor in a strawberry flav...
Benson, Nsikak U; Asuquo, Francis E; Williams, Akan B; Essien, Joseph P; Ekong, Cyril I; Akpabio, Otobong; Olajire, Abaas A
2016-01-01
Trace metals (Cd, Cr, Cu, Ni and Pb) concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria). The degree of contamination was assessed using the individual contamination factors (ICF) and global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources.
Directory of Open Access Journals (Sweden)
Nsikak U Benson
Full Text Available Trace metals (Cd, Cr, Cu, Ni and Pb concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria. The degree of contamination was assessed using the individual contamination factors (ICF and global contamination factor (GCF. Multivariate statistical approaches including principal component analysis (PCA, cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources.
Benson, Nsikak U.; Asuquo, Francis E.; Williams, Akan B.; Essien, Joseph P.; Ekong, Cyril I.; Akpabio, Otobong; Olajire, Abaas A.
2016-01-01
Trace metals (Cd, Cr, Cu, Ni and Pb) concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria). The degree of contamination was assessed using the individual contamination factors (ICF) and global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources. PMID:27257934
Enlow, Elizabeth M; Kennedy, Jennifer L; Nieuwland, Alexander A; Hendrix, James E; Morgan, Stephen L
2005-08-01
Nylons are an important class of synthetic polymers, from an industrial, as well as forensic, perspective. A spectroscopic method, such as Fourier transform infrared (FT-IR) spectroscopy, is necessary to determine the nylon subclasses (e. g., nylon 6 or nylon 6,6). Library searching using absolute difference and absolute derivative difference algorithms gives inconsistent results for identifying nylon subclasses. The objective of this study was to evaluate the usefulness of peak ratio analysis and multivariate statistics for the identification of nylon subclasses using attenuated total reflection (ATR) spectral data. Many nylon subclasses could not be distinguished by the peak ratio of the N-H vibrational stretch to the sp(3) C-H(2) vibrational stretch intensities. Linear discriminant analysis, however, provided a graphical visualization of differences between nylon subclasses and was able to correctly classify a set of 270 spectra from eight different subclasses with 98.5% cross-validated accuracy.
Statistical methods of SNP data analysis with applications
Bulinski, Alexander; Shashkin, Alexey; Yaskov, Pavel
2011-01-01
Various statistical methods important for genetic analysis are considered and developed. Namely, we concentrate on the multifactor dimensionality reduction, logic regression, random forests and stochastic gradient boosting. These methods and their new modifications, e.g., the MDR method with "independent rule", are used to study the risk of complex diseases such as cardiovascular ones. The roles of certain combinations of single nucleotide polymorphisms and external risk factors are examined. To perform the data analysis concerning the ischemic heart disease and myocardial infarction the supercomputer SKIF "Chebyshev" of the Lomonosov Moscow State University was employed.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2009-01-01
An All-in-One Resource for Using SAS and R to Carry out Common TasksProvides a path between languages that is easier than reading complete documentationSAS and R: Data Management, Statistical Analysis, and Graphics presents an easy way to learn how to perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferential procedures, regression analysis, and the creation of graphics, along with more complex applicat
NEW TECHNIQUES USED IN AUTOMATED TEXT ANALYSIS
Directory of Open Access Journals (Sweden)
M. I strate
2010-12-01
Full Text Available Automated analysis of natural language texts is one of the most important knowledge discovery tasks for any organization. According to Gartner Group, almost 90% of knowledge available at an organization today is dispersed throughout piles of documents buried within unstructured text. Analyzing huge volumes of textual information is often involved in making informed and correct business decisions. Traditional analysis methods based on statistics fail to help processing unstructured texts and the society is in search of new technologies for text analysis. There exist a variety of approaches to the analysis of natural language texts, but most of them do not provide results that could be successfully applied in practice. This article concentrates on recent ideas and practical implementations in this area.
Statistical Analysis of 30 Years Rainfall Data: A Case Study
Arvind, G.; Ashok Kumar, P.; Girish Karthi, S.; Suribabu, C. R.
2017-07-01
Rainfall is a prime input for various engineering design such as hydraulic structures, bridges and culverts, canals, storm water sewer and road drainage system. The detailed statistical analysis of each region is essential to estimate the relevant input value for design and analysis of engineering structures and also for crop planning. A rain gauge station located closely in Trichy district is selected for statistical analysis where agriculture is the prime occupation. The daily rainfall data for a period of 30 years is used to understand normal rainfall, deficit rainfall, Excess rainfall and Seasonal rainfall of the selected circle headquarters. Further various plotting position formulae available is used to evaluate return period of monthly, seasonally and annual rainfall. This analysis will provide useful information for water resources planner, farmers and urban engineers to assess the availability of water and create the storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall was calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best fit probability distribution was identified based on the minimum deviation between actual and estimated values. The scientific results and the analysis paved the way to determine the proper onset and withdrawal of monsoon results which were used for land preparation and sowing.
HistFitter: a flexible framework for statistical data analysis
Besjes, G J; Côté, D; Koutsman, A; Lorenz, J M; Short, D
2015-01-01
HistFitter is a software framework for statistical data analysis that has been used extensively in the ATLAS Collaboration to analyze data of proton-proton collisions produced by the Large Hadron Collider at CERN. Most notably, HistFitter has become a de-facto standard in searches for supersymmetric particles since 2012, with some usage for Exotic and Higgs boson physics. HistFitter coherently combines several statistics tools in a programmable and flexible framework that is capable of bookkeeping hundreds of data models under study using thousands of generated input histograms.HistFitter interfaces with the statistics tools HistFactory and RooStats to construct parametric models and to perform statistical tests of the data, and extends these tools in four key areas. The key innovations are to weave the concepts of control, validation and signal regions into the very fabric of HistFitter, and to treat these with rigorous methods. Multiple tools to visualize and interpret the results through a simple configura...
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismiciry is studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes in different magnitude levels. It is known so far that the interoccurrence time statistics is well approximated by the Weibull distribution, but the more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. Firstly, we propose the Embedding Equation Theory (EET), where the conditional probability is described by two kinds of correlation coefficients; one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from the numerical data-analysis carrying out with the Preliminary Determination of Epicenter (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Secondly, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results well reproduce all numerical data in our analysis, where several common features or the invariant aspects are clearly observed. Especially in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve, furthermore in the case of non-stationary (moving time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving times. It is emphasized that the multi-fractal relation plays an important role to unify the statistical laws of seismicity: actually the Gutenberg-Richter law and the Weibull distribution are unified in the multi-fractal relation, and some universality conjectures regarding the seismicity are briefly discussed.
Validation of statistical models for creep rupture by parametric analysis
Energy Technology Data Exchange (ETDEWEB)
Bolton, J., E-mail: john.bolton@uwclub.net [65, Fisher Ave., Rugby, Warks CV22 5HW (United Kingdom)
2012-01-15
Statistical analysis is an efficient method for the optimisation of any candidate mathematical model of creep rupture data, and for the comparative ranking of competing models. However, when a series of candidate models has been examined and the best of the series has been identified, there is no statistical criterion to determine whether a yet more accurate model might be devised. Hence there remains some uncertainty that the best of any series examined is sufficiently accurate to be considered reliable as a basis for extrapolation. This paper proposes that models should be validated primarily by parametric graphical comparison to rupture data and rupture gradient data. It proposes that no mathematical model should be considered reliable for extrapolation unless the visible divergence between model and data is so small as to leave no apparent scope for further reduction. This study is based on the data for a 12% Cr alloy steel used in BS PD6605:1998 to exemplify its recommended statistical analysis procedure. The models considered in this paper include a) a relatively simple model, b) the PD6605 recommended model and c) a more accurate model of somewhat greater complexity. - Highlights: Black-Right-Pointing-Pointer The paper discusses the validation of creep rupture models derived from statistical analysis. Black-Right-Pointing-Pointer It demonstrates that models can be satisfactorily validated by a visual-graphic comparison of models to data. Black-Right-Pointing-Pointer The method proposed utilises test data both as conventional rupture stress and as rupture stress gradient. Black-Right-Pointing-Pointer The approach is shown to be more reliable than a well-established and widely used method (BS PD6605).
PHOTOGRAMMETRIC TECHNIQUES FOR ROAD SURFACE ANALYSIS
Directory of Open Access Journals (Sweden)
V. A. Knyaz
2016-06-01
Full Text Available The quality and condition of a road surface is of great importance for convenience and safety of driving. So the investigations of the behaviour of road materials in laboratory conditions and monitoring of existing roads are widely fulfilled for controlling a geometric parameters and detecting defects in the road surface. Photogrammetry as accurate non-contact measuring method provides powerful means for solving different tasks in road surface reconstruction and analysis. The range of dimensions concerned in road surface analysis can have great variation from tenths of millimetre to hundreds meters and more. So a set of techniques is needed to meet all requirements of road parameters estimation. Two photogrammetric techniques for road surface analysis are presented: for accurate measuring of road pavement and for road surface reconstruction based on imagery obtained from unmanned aerial vehicle. The first technique uses photogrammetric system based on structured light for fast and accurate surface 3D reconstruction and it allows analysing the characteristics of road texture and monitoring the pavement behaviour. The second technique provides dense 3D model road suitable for road macro parameters estimation.
Photogrammetric Techniques for Road Surface Analysis
Knyaz, V. A.; Chibunichev, A. G.
2016-06-01
The quality and condition of a road surface is of great importance for convenience and safety of driving. So the investigations of the behaviour of road materials in laboratory conditions and monitoring of existing roads are widely fulfilled for controlling a geometric parameters and detecting defects in the road surface. Photogrammetry as accurate non-contact measuring method provides powerful means for solving different tasks in road surface reconstruction and analysis. The range of dimensions concerned in road surface analysis can have great variation from tenths of millimetre to hundreds meters and more. So a set of techniques is needed to meet all requirements of road parameters estimation. Two photogrammetric techniques for road surface analysis are presented: for accurate measuring of road pavement and for road surface reconstruction based on imagery obtained from unmanned aerial vehicle. The first technique uses photogrammetric system based on structured light for fast and accurate surface 3D reconstruction and it allows analysing the characteristics of road texture and monitoring the pavement behaviour. The second technique provides dense 3D model road suitable for road macro parameters estimation.
The Effects of Statistical Analysis Software and Calculators on Statistics Achievement
Christmann, Edwin P.
2009-01-01
This study compared the effects of microcomputer-based statistical software and hand-held calculators on the statistics achievement of university males and females. The subjects, 73 graduate students enrolled in univariate statistics classes at a public comprehensive university, were randomly assigned to groups that used either microcomputer-based…
Directory of Open Access Journals (Sweden)
Qing Gu
2016-03-01
Full Text Available Qiandao Lake (Xin’an Jiang reservoir plays a significant role in drinking water supply for eastern China, and it is an attractive tourist destination. Three multivariate statistical methods were comprehensively applied to assess the spatial and temporal variations in water quality as well as potential pollution sources in Qiandao Lake. Data sets of nine parameters from 12 monitoring sites during 2010–2013 were obtained for analysis. Cluster analysis (CA was applied to classify the 12 sampling sites into three groups (Groups A, B and C and the 12 monitoring months into two clusters (April-July, and the remaining months. Discriminant analysis (DA identified Secchi disc depth, dissolved oxygen, permanganate index and total phosphorus as the significant variables for distinguishing variations of different years, with 79.9% correct assignments. Dissolved oxygen, pH and chlorophyll-a were determined to discriminate between the two sampling periods classified by CA, with 87.8% correct assignments. For spatial variation, DA identified Secchi disc depth and ammonia nitrogen as the significant discriminating parameters, with 81.6% correct assignments. Principal component analysis (PCA identified organic pollution, nutrient pollution, domestic sewage, and agricultural and surface runoff as the primary pollution sources, explaining 84.58%, 81.61% and 78.68% of the total variance in Groups A, B and C, respectively. These results demonstrate the effectiveness of integrated use of CA, DA and PCA for reservoir water quality evaluation and could assist managers in improving water resources management.
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of in formation in numerous applications. has stimtllated an increased in terest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a de ficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimat ing the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivari ate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Self-Contained Statistical Analysis of Gene Sets
Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher
2016-01-01
Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
Agriculture, population growth, and statistical analysis of the radiocarbon record.
Zahid, H Jabran; Robinson, Erick; Kelly, Robert L
2016-01-26
The human population has grown significantly since the onset of the Holocene about 12,000 y ago. Despite decades of research, the factors determining prehistoric population growth remain uncertain. Here, we examine measurements of the rate of growth of the prehistoric human population based on statistical analysis of the radiocarbon record. We find that, during most of the Holocene, human populations worldwide grew at a long-term annual rate of 0.04%. Statistical analysis of the radiocarbon record shows that transitioning farming societies experienced the same rate of growth as contemporaneous foraging societies. The same rate of growth measured for populations dwelling in a range of environments and practicing a variety of subsistence strategies suggests that the global climate and/or endogenous biological factors, not adaptability to local environment or subsistence practices, regulated the long-term growth of the human population during most of the Holocene. Our results demonstrate that statistical analyses of large ensembles of radiocarbon dates are robust and valuable for quantitatively investigating the demography of prehistoric human populations worldwide.
Muhammad, Wazir; Lee, Sang Hoon
2013-01-01
Detailed comparisons of the predictions of the Relativistic Form Factors (RFFs) and Modified Form Factors (MFFs) and their advantages and shortcomings in calculating elastic scattering cross sections can be found in the literature. However, the issues related to their implementation in the Monte Carlo (MC) sampling for coherently scattered photons is still under discussion. Secondly, the linear interpolation technique (LIT) is a popular method to draw the integrated values of squared RFFs/MFFs (i.e. A(Z, v(i)²)) over squared momentum transfer (v(i)² = v(1)²,......, v(59)²). In the current study, the role/issues of RFFs/MFFs and LIT in the MC sampling for the coherent scattering were analyzed. The results showed that the relative probability density curves sampled on the basis of MFFs are unable to reveal any extra scientific information as both the RFFs and MFFs produced the same MC sampled curves. Furthermore, no relationship was established between the multiple small peaks and irregular step shapes (i.e. statistical noise) in the PDFs and either RFFs or MFFs. In fact, the noise in the PDFs appeared due to the use of LIT. The density of the noise depends upon the interval length between two consecutive points in the input data table of A(Z, v(i)²) and has no scientific background. The probability density function curves became smoother as the interval lengths were decreased. In conclusion, these statistical noises can be efficiently removed by introducing more data points in the A(Z, v(i)²) data tables.
Statistical wind analysis for near-space applications
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18 30 km) above ground level (AGL) at two locations, Akon, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated, on a yearly average, there is a lull or a “knee” in the wind between 65 000 and 72 000 ft AGL (20 22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
A Statistical Analysis of Lunisolar-Earthquake Connections
Rüegg, Christian Michael-André
2012-11-01
Despite over a century of study, the relationship between lunar cycles and earthquakes remains controversial and difficult to quantitatively investigate. Perhaps as a consequence, major earthquakes around the globe are frequently followed by "prediction claim", using lunar cycles, that generate media furore and pressure scientists to provide resolute answers. The 2010-2011 Canterbury earthquakes in New Zealand were no exception; significant media attention was given to lunar derived earthquake predictions by non-scientists, even though the predictions were merely "opinions" and were not based on any statistically robust temporal or causal relationships. This thesis provides a framework for studying lunisolar earthquake temporal relationships by developing replicable statistical methodology based on peer reviewed literature. Notable in the methodology is a high accuracy ephemeris, called ECLPSE, designed specifically by the author for use on earthquake catalogs and a model for performing phase angle analysis.
Statistical analysis of subjective preferences for video enhancement
Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli
2010-02-01
Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
ATP binding to a multisubunit enzyme: statistical thermodynamics analysis
Zhang, Yunxin
2012-01-01
Due to inter-subunit communication, multisubunit enzymes usually hydrolyze ATP in a concerted fashion. However, so far the principle of this process remains poorly understood. In this study, from the viewpoint of statistical thermodynamics, a simple model is presented. In this model, we assume that the binding of ATP will change the potential of the corresponding enzyme subunit, and the degree of this change depends on the state of its adjacent subunits. The probability of enzyme in a given state satisfies the Boltzmann's distribution. Although it looks much simple, this model can fit the recent experimental data of chaperonin TRiC/CCT well. From this model, the dominant state of TRiC/CCT can be obtained. This study provided a new way to understand biophysical processes by statistical thermodynamics analysis.
Statistical analysis of effective singular values in matrix rank determination
Konstantinides, Konstantinos; Yao, Kung
1988-01-01
A major problem in using SVD (singular-value decomposition) as a tool in determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values to the end, conference regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theories of perturbations of singular values and statistical significance test. Threshold bounds for perturbation due to finite-precision and i.i.d. random models are evaluated. In random models, the threshold bounds depend on the dimension of the matrix, the noisy variance, and predefined statistical level of significance. Results applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix are considered. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.
Statistical methods for data analysis in particle physics
Lista, Luca
2015-01-01
This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as reminder from advanced undergraduate studies, yet also in view to clearly distinguish the Frequentist versus Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as to be able to distinguish eventually between competing hypotheses or to rule out some of them altogether. Many worked examples will help newcomers to the field and graduate students to understand the pitfalls in applying theoretical concepts to actual data
[Statistical analysis of DNA sequences nearby splicing sites].
Korzinov, O M; Astakhova, T V; Vlasov, P K; Roĭtberg, M A
2008-01-01
Recognition of coding regions within eukaryotic genomes is one of oldest but yet not solved problems of bioinformatics. New high-accuracy methods of splicing sites recognition are needed to solve this problem. A question of current interest is to identify specific features of nucleotide sequences nearby splicing sites and recognize sites in sequence context. We performed a statistical analysis of human genes fragment database and revealed some characteristics of nucleotide sequences in splicing sites neighborhood. Frequencies of all nucleotides and dinucleotides in splicing sites environment were computed and nucleotides and dinucleotides with extremely high\\low occurrences were identified. Statistical information obtained in this work can be used in further development of the methods of splicing sites annotation and exon-intron structure recognition.
An Application of Multivariate Statistical Analysis for Query-Driven Visualization
Energy Technology Data Exchange (ETDEWEB)
Gosink, Luke J.; Garth, Christoph; Anderson, John C.; Bethel, E. Wes; Joy, Kenneth I.
2010-03-01
Abstract?Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.
On Statistical Analysis of Neuroimages with Imperfect Registration
Kim, Won Hwa; Ravi, Sathya N.; Johnson, Sterling C.; Okonkwo, Ozioma C.; Singh, Vikas
2016-01-01
A variety of studies in neuroscience/neuroimaging seek to perform statistical inference on the acquired brain image scans for diagnosis as well as understanding the pathological manifestation of diseases. To do so, an important first step is to register (or co-register) all of the image data into a common coordinate system. This permits meaningful comparison of the intensities at each voxel across groups (e.g., diseased versus healthy) to evaluate the effects of the disease and/or use machine learning algorithms in a subsequent step. But errors in the underlying registration make this problematic, they either decrease the statistical power or make the follow-up inference tasks less effective/accurate. In this paper, we derive a novel algorithm which offers immunity to local errors in the underlying deformation field obtained from registration procedures. By deriving a deformation invariant representation of the image, the downstream analysis can be made more robust as if one had access to a (hypothetical) far superior registration procedure. Our algorithm is based on recent work on scattering transform. Using this as a starting point, we show how results from harmonic analysis (especially, non-Euclidean wavelets) yields strategies for designing deformation and additive noise invariant representations of large 3-D brain image volumes. We present a set of results on synthetic and real brain images where we achieve robust statistical analysis even in the presence of substantial deformation errors; here, standard analysis procedures significantly under-perform and fail to identify the true signal. PMID:27042168
STATISTICAL ANALYSIS OF TANK 19F FLOOR SAMPLE RESULTS
Energy Technology Data Exchange (ETDEWEB)
Harris, S.
2010-09-02
Representative sampling has been completed for characterization of the residual material on the floor of Tank 19F as per the statistical sampling plan developed by Harris and Shine. Samples from eight locations have been obtained from the tank floor and two of the samples were archived as a contingency. Six samples, referred to in this report as the current scrape samples, have been submitted to and analyzed by SRNL. This report contains the statistical analysis of the floor sample analytical results to determine if further data are needed to reduce uncertainty. Included are comparisons with the prior Mantis samples results to determine if they can be pooled with the current scrape samples to estimate the upper 95% confidence limits (UCL95%) for concentration. Statistical analysis revealed that the Mantis and current scrape sample results are not compatible. Therefore, the Mantis sample results were not used to support the quantification of analytes in the residual material. Significant spatial variability among the current scrape sample results was not found. Constituent concentrations were similar between the North and South hemispheres as well as between the inner and outer regions of the tank floor. The current scrape sample results from all six samples fall within their 3-sigma limits. In view of the results from numerous statistical tests, the data were pooled from all six current scrape samples. As such, an adequate sample size was provided for quantification of the residual material on the floor of Tank 19F. The uncertainty is quantified in this report by an UCL95% on each analyte concentration. The uncertainty in analyte concentration was calculated as a function of the number of samples, the average, and the standard deviation of the analytical results. The UCL95% was based entirely on the six current scrape sample results (each averaged across three analytical determinations).
BATMAN: Bayesian Technique for Multi-image Analysis
Casado, J.; Ascasibar, Y.; García-Benito, R.; Guidi, G.; Choudhury, O. S.; Bellocchi, E.; Sánchez, S. F.; Díaz, A. I.
2016-12-01
This paper describes the Bayesian Technique for Multi-image Analysis (BATMAN), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real Integral-Field Spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of two. Our analysis reveals that the algorithm prioritizes conservation of all the statistically-significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BATMAN is not to be used as a `black box' to improve the signal-to-noise ratio, but as a new approach to characterize spatially-resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn.
Consolidity analysis for fully fuzzy functions, matrices, probability and statistics
Directory of Open Access Journals (Sweden)
Walaa Ibrahim Gabr
2015-03-01
Full Text Available The paper presents a comprehensive review of the know-how for developing the systems consolidity theory for modeling, analysis, optimization and design in fully fuzzy environment. The solving of systems consolidity theory included its development for handling new functions of different dimensionalities, fuzzy analytic geometry, fuzzy vector analysis, functions of fuzzy complex variables, ordinary differentiation of fuzzy functions and partial fraction of fuzzy polynomials. On the other hand, the handling of fuzzy matrices covered determinants of fuzzy matrices, the eigenvalues of fuzzy matrices, and solving least-squares fuzzy linear equations. The approach demonstrated to be also applicable in a systematic way in handling new fuzzy probabilistic and statistical problems. This included extending the conventional probabilistic and statistical analysis for handling fuzzy random data. Application also covered the consolidity of fuzzy optimization problems. Various numerical examples solved have demonstrated that the new consolidity concept is highly effective in solving in a compact form the propagation of fuzziness in linear, nonlinear, multivariable and dynamic problems with different types of complexities. Finally, it is demonstrated that the implementation of the suggested fuzzy mathematics can be easily embedded within normal mathematics through building special fuzzy functions library inside the computational Matlab Toolbox or using other similar software languages.
GIS-BASED SPATIAL STATISTICAL ANALYSIS OF COLLEGE GRADUATES EMPLOYMENT
Directory of Open Access Journals (Sweden)
R. Tang
2012-07-01
Full Text Available It is urgently necessary to be aware of the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in distribution and employment status of college graduates based on the data from 2004–2008 Wuhan Municipal Human Resources and Social Security Bureau, China. Spatio-temporal distribution of employment unit were analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method via SPSS software was used to predict the employment and to identify spatially associated enterprise and professionals demand in the future. The results show that the enterprises in Wuhan east lake high and new technology development zone increased dramatically from 2004 to 2008, and tended to distributed southeastward. Furthermore, the models built by statistical analysis suggest that the specialty of graduates major in has an important impact on the number of the employment and the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis which helps to simulate the spatial distribution of the employment status is a potential tool for human resource development research.
Gis-Based Spatial Statistical Analysis of College Graduates Employment
Tang, R.
2012-07-01
It is urgently necessary to be aware of the distribution and employment status of college graduates for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in distribution and employment status of college graduates based on the data from 2004-2008 Wuhan Municipal Human Resources and Social Security Bureau, China. Spatio-temporal distribution of employment unit were analyzed with geocoding using ArcGIS software, and the stepwise multiple linear regression method via SPSS software was used to predict the employment and to identify spatially associated enterprise and professionals demand in the future. The results show that the enterprises in Wuhan east lake high and new technology development zone increased dramatically from 2004 to 2008, and tended to distributed southeastward. Furthermore, the models built by statistical analysis suggest that the specialty of graduates major in has an important impact on the number of the employment and the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis which helps to simulate the spatial distribution of the employment status is a potential tool for human resource development research.
The features of Drosophila core promoters revealed by statistical analysis
Directory of Open Access Journals (Sweden)
Trifonov Edward N
2006-06-01
Full Text Available Abstract Background Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between basal machinery and promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. Results Using Drosophila-specific position-weight matrices, we identified promoters containing TATA box, Initiator, Downstream Promoter Element (DPE, and Motif Ten Element (MTE, as well as core elements discovered in Human (TFIIB Recognition Element (BRE and Downstream Core Element (DCE. Promoters utilizing known synergetic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE were identified. We also establish the existence of promoters with potentially novel synergetic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s. Comparison of Human and Drosophila showed consistent percentages of promoters with TATA, Inr, DPE, and synergetic combinations thereof, as well as most of the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in Human was found. Distinct nucleosome positioning in particular promoter classes was revealed. Conclusion We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation was experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be utilized for other species.
The development of human behavior analysis techniques
Energy Technology Data Exchange (ETDEWEB)
Lee, Jung Woon; Lee, Yong Hee; Park, Geun Ok; Cheon, Se Woo; Suh, Sang Moon; Oh, In Suk; Lee, Hyun Chul; Park, Jae Chang
1997-07-01
In this project, which is to study on man-machine interaction in Korean nuclear power plants, we developed SACOM (Simulation Analyzer with a Cognitive Operator Model), a tool for the assessment of task performance in the control rooms using software simulation, and also develop human error analysis and application techniques. SACOM was developed to assess operator`s physical workload, workload in information navigation at VDU workstations, and cognitive workload in procedural tasks. We developed trip analysis system including a procedure based on man-machine interaction analysis system including a procedure based on man-machine interaction analysis and a classification system. We analyzed a total of 277 trips occurred from 1978 to 1994 to produce trip summary information, and for 79 cases induced by human errors time-lined man-machine interactions. The INSTEC, a database system of our analysis results, was developed. The MARSTEC, a multimedia authoring and representation system for trip information, was also developed, and techniques for human error detection in human factors experiments were established. (author). 121 refs., 38 tabs., 52 figs.
Oliveira Mendes, Thiago de; Pinto, Liliane Pereira; Santos, Laurita dos; Tippavajhala, Vamshi Krishna; Téllez Soto, Claudio Alberto; Martin, Airton Abrahão
2016-07-01
The analysis of biological systems by spectroscopic techniques involves the evaluation of hundreds to thousands of variables. Hence, different statistical approaches are used to elucidate regions that discriminate classes of samples and to propose new vibrational markers for explaining various phenomena like disease monitoring, mechanisms of action of drugs, food, and so on. However, the technical statistics are not always widely discussed in applied sciences. In this context, this work presents a detailed discussion including the various steps necessary for proper statistical analysis. It includes univariate parametric and nonparametric tests, as well as multivariate unsupervised and supervised approaches. The main objective of this study is to promote proper understanding of the application of various statistical tools in these spectroscopic methods used for the analysis of biological samples. The discussion of these methods is performed on a set of in vivo confocal Raman spectra of human skin analysis that aims to identify skin aging markers. In the Appendix, a complete routine of data analysis is executed in a free software that can be used by the scientific community involved in these studies.
Kim, Seokyeon; Jeong, Seongmin; Woo, Insoo; Jang, Yun; Maciejewski, Ross; Ebert, David
2017-02-08
Geographic visualization research has focused on a variety of techniques to represent and explore spatiotemporal data. The goal of those techniques is to enable users to explore events and interactions over space and time in order to facilitate the discovery of patterns, anomalies and relationships within the data. However, it is difficult to extract and visualize data flow patterns over time for non-directional statistical data without trajectory information. In this work, we develop a novel flow analysis technique to extract, represent, and analyze flow maps of non-directional spatiotemporal data unaccompanied by trajectory information. We estimate a continuous distribution of these events over space and time, and extract flow fields for spatial and temporal changes utilizing a gravity model. Then, we visualize the spatiotemporal patterns in the data by employing flow visualization techniques. The user is presented with temporal trends of geo-referenced discrete events on a map. As such, overall spatiotemporal data flow patterns help users analyze geo-referenced temporal events, such as disease outbreaks, crime patterns, etc. To validate our model, we discard the trajectory information in an origin-destination dataset and apply our technique to the data and compare the derived trajectories and the original. Finally, we present spatiotemporal trend analysis for statistical datasets including twitter data, maritime search and rescue events, and syndromic surveillance.
Statistical design and analysis of RNA sequencing data.
Auer, Paul L; Doerge, R W
2010-06-01
Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.
SAS and R data management, statistical analysis, and graphics
Kleinman, Ken
2014-01-01
An Up-to-Date, All-in-One Resource for Using SAS and R to Perform Frequent TasksThe first edition of this popular guide provided a path between SAS and R using an easy-to-understand, dictionary-like approach. Retaining the same accessible format, SAS and R: Data Management, Statistical Analysis, and Graphics, Second Edition explains how to easily perform an analytical task in both SAS and R, without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy software documentation. The book covers many common tasks, such as data management, descriptive summaries, inferentia
Feature statistic analysis of ultrasound images of liver cancer
Huang, Shuqin; Ding, Mingyue; Zhang, Songgeng
2007-12-01
In this paper, a specific feature analysis of liver ultrasound images including normal liver, liver cancer especially hepatocellular carcinoma (HCC) and other hepatopathy is discussed. According to the classification of hepatocellular carcinoma (HCC), primary carcinoma is divided into four types. 15 features from single gray-level statistic, gray-level co-occurrence matrix (GLCM), and gray-level run-length matrix (GLRLM) are extracted. Experiments for the discrimination of each type of HCC, normal liver, fatty liver, angioma and hepatic abscess have been conducted. Corresponding features to potentially discriminate them are found.
STATISTIC ANALYSIS OF INTERNATIONAL TOURISM ON ROMANIAN SEASIDE
Directory of Open Access Journals (Sweden)
MIRELA SECARĂ
2010-01-01
Full Text Available In order to meet European and international touristic competition standards, modernization, re-establishment and development of Romanian tourism are necessary as well as creation of modern touristic products that are competitive on this market. The use of modern methods of statistic analysis in the field of tourism facilitates the achievement of systems of information that are the instruments for: evaluation of touristic demand and touristic supply, follow-up of touristic services of each touring form, follow-up of transportation services, leisure activities, hotel accommodation, touristic market study, and a complex flexible system of management and accountancy.
Statistics for proteomics: experimental design and 2-DE differential analysis.
Chich, Jean-François; David, Olivier; Villers, Fanny; Schaeffer, Brigitte; Lutomski, Didier; Huet, Sylvie
2007-04-15
Proteomics relies on the separation of complex protein mixtures using bidimensional electrophoresis. This approach is largely used to detect the expression variations of proteins prepared from two or more samples. Recently, attention was drawn on the reliability of the results published in literature. Among the critical points identified were experimental design, differential analysis and the problem of missing data, all problems where statistics can be of help. Using examples and terms understandable by biologists, we describe how a collaboration between biologists and statisticians can improve reliability of results and confidence in conclusions.
A Probabilistic Rain Diagnostic Model Based on Cyclone Statistical Analysis
Iordanidou, V.; A. G. Koutroulis; I. K. Tsanis
2014-01-01
Data from a dense network of 69 daily precipitation gauges over the island of Crete and cyclone climatological analysis over middle-eastern Mediterranean are combined in a statistical approach to develop a rain diagnostic model. Regarding the dataset, 0.5 × 0.5, 33-year (1979–2011) European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) is used. The cyclone tracks and their characteristics are identified with the aid of Melbourne University algorithm (MS scheme). T...
Research and Development on Food Nutrition Statistical Analysis Software System
Directory of Open Access Journals (Sweden)
Du Li
2013-12-01
Full Text Available Designing and developing a set of food nutrition component statistical analysis software can realize the automation of nutrition calculation, improve the nutrition processional professional’s working efficiency and achieve the informatization of the nutrition propaganda and education. In the software development process, the software engineering method and database technology are used to calculate the human daily nutritional intake and the intelligent system is used to evaluate the user’s health condition. The experiment can show that the system can correctly evaluate the human health condition and offer the reasonable suggestion, thus exploring a new road to solve the complex nutrition computational problem with information engineering.
The R software fundamentals of programming and statistical analysis
Lafaye de Micheaux, Pierre; Liquet, Benoit
2013-01-01
The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides its application as a self-learning text, this book can support lectures on R at any level from beginner to advanced. This book can serve as a textbook on R for beginners as well as more advanced users, working on Windows, MacOs or Linux OSes. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with oriented object programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. This is followed by the second part of the book, which provides detailed explanations on how to perform many standard statistical analyses, mainly in the Biostatistics field. Topics from mathematical and statistical settings that are included are matrix operations, integration, o...
New technique for high-speed microjet breakup analysis
Energy Technology Data Exchange (ETDEWEB)
Vago, N. [Department of Atomic Physics, Budapest University of Technology and Economics, Budafoki ut 8, 1111, Budapest (Hungary); Synova SA, Ch. Dent d' Oche, 1024 Ecublens (Switzerland); Spiegel, A. [Department of Atomic Physics, Budapest University of Technology and Economics, Budafoki ut 8, 1111, Budapest (Hungary); Couty, P. [Institute of Imaging and Applied Optics, Swiss Federal Institute of Technology, Lausanne, BM, 1015, Lausanne (Switzerland); Wagner, F.R.; Richerzhagen, B. [Synova SA, Ch. Dent d' Oche, 1024 Ecublens (Switzerland)
2003-10-01
In this paper we introduce a new technique for visualizing the breakup of thin high-speed liquid jets. Focused light of a He-Ne laser is coupled into a water jet, which behaves as a cylindrical waveguide until the point where the amplitude of surface waves is large enough to scatter out the light from the jet. Observing the jet from a direction perpendicular to its axis, the light that appears indicates the location of breakup. Real-time examination and also statistical analysis of the jet disruption is possible with this method. A ray tracing method was developed to demonstrate the light scattering process. (orig.)
A probabilistic analysis of wind gusts using extreme value statistics
Energy Technology Data Exchange (ETDEWEB)
Friederichs, Petra; Bentzien, Sabrina; Lenz, Anne; Krampitz, Rebekka [Meteorological Inst., Univ. of Bonn (Germany); Goeber, Martin [Deutscher Wetterdienst, Offenbach (Germany)
2009-12-15
The spatial variability of wind gusts is probably as large as that of precipitation, but the observational weather station network is much less dense. The lack of an area-wide observational analysis hampers the forecast verification of wind gust warnings. This article develops and compares several approaches to derive a probabilistic analysis of wind gusts for Germany. Such an analysis provides a probability that a wind gust exceeds a certain warning level. To that end we have 5 years of observations of hourly wind maxima at about 140 weather stations of the German weather service at our disposal. The approaches are based on linear statistical modeling using generalized linear models, extreme value theory and quantile regression. Warning level exceedance probabilities are estimated in response to predictor variables such as the observed mean wind or the operational analysis of the wind velocity at a height of 10 m above ground provided by the European Centre for Medium Range Weather Forecasts (ECMWF). The study shows that approaches that apply to the differences between the recorded wind gust and the mean wind perform better in terms of the Brier skill score (which measures the quality of a probability forecast) than those using the gust factor or the wind gusts only. The study points to the benefit from using extreme value theory as the most appropriate and theoretically consistent statistical model. The most informative predictors are the observed mean wind, but also the observed gust velocities recorded at the neighboring stations. Out of the predictors used from the ECMWF analysis, the wind velocity at 10 m above ground is the most informative predictor, whereas the wind shear and the vertical velocity provide no additional skill. For illustration the results for January 2007 and during the winter storm Kyrill are shown. (orig.)
Statistical analysis of personal radiofrequency electromagnetic field measurements with nondetects.
Röösli, Martin; Frei, Patrizia; Mohler, Evelyn; Braun-Fahrländer, Charlotte; Bürgi, Alfred; Fröhlich, Jürg; Neubauer, Georg; Theis, Gaston; Egger, Matthias
2008-09-01
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution. Copyright 2008 Wiley-Liss, Inc.
Uncertainty Analysis Technique for OMEGA Dante Measurements
Energy Technology Data Exchange (ETDEWEB)
May, M J; Widmann, K; Sorce, C; Park, H; Schneider, M
2010-05-07
The Dante is an 18 channel X-ray filtered diode array which records the spectrally and temporally resolved radiation flux from various targets (e.g. hohlraums, etc.) at X-ray energies between 50 eV to 10 keV. It is a main diagnostics installed on the OMEGA laser facility at the Laboratory for Laser Energetics, University of Rochester. The absolute flux is determined from the photometric calibration of the X-ray diodes, filters and mirrors and an unfold algorithm. Understanding the errors on this absolute measurement is critical for understanding hohlraum energetic physics. We present a new method for quantifying the uncertainties on the determined flux using a Monte-Carlo parameter variation technique. This technique combines the uncertainties in both the unfold algorithm and the error from the absolute calibration of each channel into a one sigma Gaussian error function. One thousand test voltage sets are created using these error functions and processed by the unfold algorithm to produce individual spectra and fluxes. Statistical methods are applied to the resultant set of fluxes to estimate error bars on the measurements.
Statistical analysis of proteomics, metabolomics, and lipidomics data using mass spectrometry
Mertens, Bart
2017-01-01
This book presents an overview of computational and statistical design and analysis of mass spectrometry-based proteomics, metabolomics, and lipidomics data. This contributed volume provides an introduction to the special aspects of statistical design and analysis with mass spectrometry data for the new omic sciences. The text discusses common aspects of design and analysis between and across all (or most) forms of mass spectrometry, while also providing special examples of application with the most common forms of mass spectrometry. Also covered are applications of computational mass spectrometry not only in clinical study but also in the interpretation of omics data in plant biology studies. Omics research fields are expected to revolutionize biomolecular research by the ability to simultaneously profile many compounds within either patient blood, urine, tissue, or other biological samples. Mass spectrometry is one of the key analytical techniques used in these new omic sciences. Liquid chromatography mass ...
Statistical analysis of nonlinear wave interactions in simulated Langmuir turbulence data
Directory of Open Access Journals (Sweden)
J. Soucek
Full Text Available We present a statistical analysis of strong turbulence of Langmuir and ion-sound waves resulting from beam-plasma interaction. The analysis is carried out on data sets produced by a numerical simulation of one-dimensional Zakharov’s equations. The nonlinear wave interactions are studied using two different approaches: high-order spectra and Volterra models. These methods were applied to identify two and three wave processes in the data, and the Volterra model was furthermore employed to evaluate the direction and magnitude of energy transfer between the wave modes in the case of Langmuir wave decay. We demonstrate that these methods allow one to determine the relative importance of strongly and weakly turbulent processes. The statistical validity of the results was thoroughly tested using surrogated data set analysis.
Key words. Space plasma physics (wave-wave interactions; experimental and mathematical techniques; nonlinear phenomena
Statistical Mechanics Ideas and Techniques Applied to Selected Problems in Ecology
Directory of Open Access Journals (Sweden)
Hugo Fort
2013-11-01
Full Text Available Ecosystem dynamics provides an interesting arena for the application of a plethora concepts and techniques from statistical mechanics. Here I review three examples corresponding each one to an important problem in ecology. First, I start with an analytical derivation of clumpy patterns for species relative abundances (SRA empirically observed in several ecological communities involving a high number n of species, a phenomenon which have puzzled ecologists for decades. An interesting point is that this derivation uses results obtained from a statistical mechanics model for ferromagnets. Second, going beyond the mean field approximation, I study the spatial version of a popular ecological model involving just one species representing vegetation. The goal is to address the phenomena of catastrophic shifts—gradual cumulative variations in some control parameter that suddenly lead to an abrupt change in the system—illustrating it by means of the process of desertification of arid lands. The focus is on the aggregation processes and the effects of diffusion that combined lead to the formation of non trivial spatial vegetation patterns. It is shown that different quantities—like the variance, the two-point correlation function and the patchiness—may serve as early warnings for the desertification of arid lands. Remarkably, in the onset of a desertification transition the distribution of vegetation patches exhibits scale invariance typical of many physical systems in the vicinity a phase transition. I comment on similarities of and differences between these catastrophic shifts and paradigmatic thermodynamic phase transitions like the liquid-vapor change of state for a fluid. Third, I analyze the case of many species interacting in space. I choose tropical forests, which are mega-diverse ecosystems that exhibit remarkable dynamics. Therefore these ecosystems represent a research paradigm both for studies of complex systems dynamics as well as to
A Comparative Analysis of Biomarker Selection Techniques
Directory of Open Access Journals (Sweden)
Nicoletta Dessì
2013-01-01
Full Text Available Feature selection has become the essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may result in different set of biomarkers, that is, different groups of genes highly correlated to a given pathological condition, but few direct comparisons exist which quantify these differences in a systematic way. In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i measuring the similarity/dissimilarity of selected gene sets; (ii evaluating the implications of these differences in terms of both predictive performance and stability of selected gene sets. As a case study, we considered three benchmarks deriving from DNA microarray experiments and conducted a comparative analysis among eight selection methods, representatives of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight about the pattern of agreement of biomarker discovery techniques.
UPLC: a preeminent technique in pharmaceutical analysis.
Kumar, Ashok; Saini, Gautam; Nair, Anroop; Sharma, Rishbha
2012-01-01
The pharmaceutical companies today are driven to create novel and more efficient tools to discover, develop, deliver and monitor the drugs. In this contest the development of rapid chromatographic method is crucial for the analytical laboratories. In precedent decade, substantial technological advances have been done in enhancing particle chemistry performance, improving detector design and in optimizing the system, data processors and various controls of chromatographic techniques. When all was blended together, it resulted in the outstanding performance via ultra-high performance liquid chromatography (UPLC), which holds back the principle of HPLC technique. UPLC shows a dramatic enhancement in speed, resolution as well as the sensitivity of analysis by using particle size less than 2 pm and the system is operational at higher pressure, while the mobile phase could be able to run at greater linear velocities as compared to HPLC. This technique is considered as a new focal point in field of liquid chromatographic studies. This review focuses on the basic principle, instrumentation of UPLC and its advantages over HPLC, furthermore, this article emphasizes various pharmaceutical applications of this technique.
STATISTICAL ANALYSIS OF THE TM- MODEL VIA BAYESIAN APPROACH
Directory of Open Access Journals (Sweden)
Muhammad Aslam
2012-11-01
Full Text Available The method of paired comparisons calls for the comparison of treatments presented in pairs to judges who prefer the better one based on their sensory evaluations. Thurstone (1927 and Mosteller (1951 employ the method of maximum likelihood to estimate the parameters of the Thurstone-Mosteller model for the paired comparisons. A Bayesian analysis of the said model using the non-informative reference (Jeffreys prior is presented in this study. The posterior estimates (means and joint modes of the parameters and the posterior probabilities comparing the two parameters are obtained for the analysis. The predictive probabilities that one treatment (Ti in preferred to any other treatment (Tj in a future single comparison are also computed. In addition, the graphs of the marginal posterior distributions of the individual parameter are drawn. The appropriateness of the model is also tested using the Chi-Square test statistic.
On Understanding Statistical Data Analysis in Higher Education
Montalbano, Vera
2012-01-01
Data analysis is a powerful tool in all experimental sciences. Statistical methods, such as sampling theory, computer technologies necessary for handling large amounts of data, skill in analysing information contained in different types of graphs are all competences necessary for achieving an in-depth data analysis. In higher education, these topics are usually fragmentized in different courses, the interdisciplinary integration can lack, some caution in the use of these topics can missing or be misunderstood. Students are often obliged to acquire these skills by themselves during the preparation of the final experimental thesis. A proposal for a learning path on nuclear phenomena is presented in order to develop these scientific competences in physics courses. An introduction to radioactivity and nuclear phenomenology is followed by measurements of natural radioactivity. Background and weak sources can be monitored for long time in a physics laboratory. The data are collected and analyzed in a computer lab i...
Statistical analysis of cascading failures in power grids
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Laboratory; Pfitzner, Rene [Los Alamos National Laboratory; Turitsyn, Konstantin [Los Alamos National Laboratory
2010-12-01
We introduce a new microscopic model of cascading failures in transmission power grids. This model accounts for automatic response of the grid to load fluctuations that take place on the scale of minutes, when optimum power flow adjustments and load shedding controls are unavailable. We describe extreme events, caused by load fluctuations, which cause cascading failures of loads, generators and lines. Our model is quasi-static in the causal, discrete time and sequential resolution of individual failures. The model, in its simplest realization based on the Directed Current description of the power flow problem, is tested on three standard IEEE systems consisting of 30, 39 and 118 buses. Our statistical analysis suggests a straightforward classification of cascading and islanding phases in terms of the ratios between average number of removed loads, generators and links. The analysis also demonstrates sensitivity to variations in line capacities. Future research challenges in modeling and control of cascading outages over real-world power networks are discussed.
Topics in statistical data analysis for high-energy physics
Cowan, G
2013-01-01
These lectures concern two topics that are becoming increasingly important in the analysis of High Energy Physics (HEP) data: Bayesian statistics and multivariate methods. In the Bayesian approach we extend the interpretation of probability to cover not only the frequency of repeatable outcomes but also to include a degree of belief. In this way we are able to associate probability with a hypothesis and thus to answer directly questions that cannot be addressed easily with traditional frequentist methods. In multivariate analysis we try to exploit as much information as possible from the characteristics that we measure for each event to distinguish between event types. In particular we will look at a method that has gained popularity in HEP in recent years: the boosted decision tree (BDT).
Using Statistical Analysis Software to Advance Nitro Plasticizer Wettability
Energy Technology Data Exchange (ETDEWEB)
Shear, Trevor Allan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-08-29
Statistical analysis in science is an extremely powerful tool that is often underutilized. Additionally, it is frequently the case that data is misinterpreted or not used to its fullest extent. Utilizing the advanced software JMP®, many aspects of experimental design and data analysis can be evaluated and improved. This overview will detail the features of JMP® and how they were used to advance a project, resulting in time and cost savings, as well as the collection of scientifically sound data. The project analyzed in this report addresses the inability of a nitro plasticizer to coat a gold coated quartz crystal sensor used in a quartz crystal microbalance. Through the use of the JMP® software, the wettability of the nitro plasticizer was increased by over 200% using an atmospheric plasma pen, ensuring good sample preparation and reliable results.
Flash Infrared Thermography Contrast Data Analysis Technique
Koshti, Ajay
2014-01-01
This paper provides information on an IR Contrast technique that involves extracting normalized contrast versus time evolutions from the flash thermography inspection infrared video data. The analysis calculates thermal measurement features from the contrast evolution. In addition, simulation of the contrast evolution is achieved through calibration on measured contrast evolutions from many flat-bottom holes in the subject material. The measurement features and the contrast simulation are used to evaluate flash thermography data in order to characterize delamination-like anomalies. The thermal measurement features relate to the anomaly characteristics. The contrast evolution simulation is matched to the measured contrast evolution over an anomaly to provide an assessment of the anomaly depth and width which correspond to the depth and diameter of the equivalent flat-bottom hole (EFBH) similar to that used as input to the simulation. A similar analysis, in terms of diameter and depth of an equivalent uniform gap (EUG) providing a best match with the measured contrast evolution, is also provided. An edge detection technique called the half-max is used to measure width and length of the anomaly. Results of the half-max width and the EFBH/EUG diameter are compared to evaluate the anomaly. The information provided here is geared towards explaining the IR Contrast technique. Results from a limited amount of validation data on reinforced carbon-carbon (RCC) hardware are included in this paper.
Statistical analysis of magnetically soft particles in magnetorheological elastomers
Gundermann, T.; Cremer, P.; Löwen, H.; Menzel, A. M.; Odenbach, S.
2017-04-01
The physical properties of magnetorheological elastomers (MRE) are a complex issue and can be influenced and controlled in many ways, e.g. by applying a magnetic field, by external mechanical stimuli, or by an electric potential. In general, the response of MRE materials to these stimuli is crucially dependent on the distribution of the magnetic particles inside the elastomer. Specific knowledge of the interactions between particles or particle clusters is of high relevance for understanding the macroscopic rheological properties and provides an important input for theoretical calculations. In order to gain a better insight into the correlation between the macroscopic effects and microstructure and to generate a database for theoretical analysis, x-ray micro-computed tomography (X-μCT) investigations as a base for a statistical analysis of the particle configurations were carried out. Different MREs with quantities of 2–15 wt% (0.27–2.3 vol%) of iron powder and different allocations of the particles inside the matrix were prepared. The X-μCT results were edited by an image processing software regarding the geometrical properties of the particles with and without the influence of an external magnetic field. Pair correlation functions for the positions of the particles inside the elastomer were calculated to statistically characterize the distributions of the particles in the samples.
Biomechanical Analysis of Contemporary Throwing Technique Theory
Directory of Open Access Journals (Sweden)
Chen Jian
2015-01-01
Full Text Available Based on the movement process of throwing and in order to further improve the throwing technique of our country, this paper will first illustrate the main influence factors which will affect the shot distance via the mutual combination of movement equation and geometrical analysis. And then, it will give the equation of the acting force that the throwing athletes have to bear during throwing movement; and will reach the speed relationship between each arthrosis during throwing and batting based on the kinetic analysis of the throwing athletes’ arms while throwing. This paper will obtain the momentum relationship of the athletes’ each arthrosis by means of rotational inertia analysis; and then establish a restricted particle dynamics equation from the Lagrange equation. The obtained result shows that the momentum of throwing depends on the momentum of the athletes’ wrist joints while batting.
MALDI imaging mass spectrometry: statistical data analysis and current computational challenges
Directory of Open Access Journals (Sweden)
Alexandrov Theodore
2012-11-01
Full Text Available Abstract Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF imaging mass spectrometry, also called MALDI-imaging, is a label-free bioanalytical technique used for spatially-resolved chemical analysis of a sample. Usually, MALDI-imaging is exploited for analysis of a specially prepared tissue section thaw mounted onto glass slide. A tremendous development of the MALDI-imaging technique has been observed during the last decade. Currently, it is one of the most promising innovative measurement techniques in biochemistry and a powerful and versatile tool for spatially-resolved chemical analysis of diverse sample types ranging from biological and plant tissues to bio and polymer thin films. In this paper, we outline computational methods for analyzing MALDI-imaging data with the emphasis on multivariate statistical methods, discuss their pros and cons, and give recommendations on their application. The methods of unsupervised data mining as well as supervised classification methods for biomarker discovery are elucidated. We also present a high-throughput computational pipeline for interpretation of MALDI-imaging data using spatial segmentation. Finally, we discuss current challenges associated with the statistical analysis of MALDI-imaging data.
Tay, C. K.; Hayford, E. K.; Hodgson, I. O. A.
2017-02-01
Multivariate statistical technique and hydrogeochemical approach were employed for groundwater assessment within the Lower Pra Basin. The main objective was to delineate the main processes that are responsible for the water chemistry and pollution of groundwater within the basin. Fifty-four (54) (No) boreholes were sampled in January 2012 for quality assessment. PCA using Varimax with Kaiser Normalization method of extraction for both rotated space and component matrix have been applied to the data. Results show that Spearman's correlation matrix of major ions revealed expected process-based relationships derived mainly from the geochemical processes, such as ion-exchange and silicate/aluminosilicate weathering within the aquifer. Three main principal components influence the water chemistry and pollution of groundwater within the basin. The three principal components have accounted for approximately 79% of the total variance in the hydrochemical data. Component 1 delineates the main natural processes (water-soil-rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicate/aluminosilicates, while Component 3 delineates the prevalence of pollution principally from agricultural input as well as trace metal mobilization in groundwater within the basin. The loadings and score plots of the first two PCs show grouping pattern which indicates the strength of the mutual relation among the hydrochemical variables. In terms of proper management and development of groundwater within the basin, communities, where intense agriculture is taking place, should be monitored and protected from agricultural activities. especially where inorganic fertilizers are used by creating buffer zones. Monitoring of the water quality especially the water pH is recommended to ensure the acid neutralizing potential of groundwater within the basin thereby, curtailing further trace metal
Tay, C. K.; Hayford, E. K.; Hodgson, I. O. A.
2017-06-01
Multivariate statistical technique and hydrogeochemical approach were employed for groundwater assessment within the Lower Pra Basin. The main objective was to delineate the main processes that are responsible for the water chemistry and pollution of groundwater within the basin. Fifty-four (54) (No) boreholes were sampled in January 2012 for quality assessment. PCA using Varimax with Kaiser Normalization method of extraction for both rotated space and component matrix have been applied to the data. Results show that Spearman's correlation matrix of major ions revealed expected process-based relationships derived mainly from the geochemical processes, such as ion-exchange and silicate/aluminosilicate weathering within the aquifer. Three main principal components influence the water chemistry and pollution of groundwater within the basin. The three principal components have accounted for approximately 79% of the total variance in the hydrochemical data. Component 1 delineates the main natural processes (water-soil-rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicate/aluminosilicates, while Component 3 delineates the prevalence of pollution principally from agricultural input as well as trace metal mobilization in groundwater within the basin. The loadings and score plots of the first two PCs show grouping pattern which indicates the strength of the mutual relation among the hydrochemical variables. In terms of proper management and development of groundwater within the basin, communities, where intense agriculture is taking place, should be monitored and protected from agricultural activities. especially where inorganic fertilizers are used by creating buffer zones. Monitoring of the water quality especially the water pH is recommended to ensure the acid neutralizing potential of groundwater within the basin thereby, curtailing further trace metal
GIS-based statistical mapping technique for block-and-ash pyroclastic flow and surge hazards
Widiwijayanti, C.; Voight, B.; Hidayat, D.; Schilling, S.
2008-12-01
Assessments of pyroclastic flow (PF) hazards are commonly based on mapping of PF and surge deposits and estimations of inundation limits, and/or computer models of varying degrees of sophistication. In volcanic crises a PF hazard map may be sorely needed, but limited time, exposures, or safety aspects may preclude fieldwork, and insufficient time or baseline data may be available for reliable dynamic simulations. We have developed a statistically constrained simulation model for block-and-ash PFs to estimate potential areas of inundation by adapting methodology from Iverson et al. (1998) for lahars. The predictive equations for block-and-ash PFs are calibrated with data from many volcanoes and given by A = (0.05-0.1)V2/3, B = (35-40)V2/3 , where A is cross-sectional area of inundation, B is planimetric area and V is deposit volume. The proportionality coefficients were obtained from regression analyses and comparison of simulations to mapped deposits. The method embeds the predictive equations in a GIS program coupled with DEM topography, using the LAHARZ program of Schilling (1998). Although the method is objective and reproducible, any PF hazard zone so computed should be considered as an approximate guide only, due to uncertainties on coefficients applicable to individual PFs, DEM details, and release volumes. Gradational nested hazard maps produced by these simulations reflect in a sense these uncertainties. The model does not explicitly consider dynamic behavior, which can be important. Surge impacts must be extended beyond PF hazard zones and we have explored several approaches to do this. The method has been used to supply PF hazard maps in two crises: Merapi 2006; and Montserrat 2006- 2007. We have also compared our hazard maps to actual recent PF deposits and to maps generated by several other model techniques.
HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.
Song, Chi; Tseng, George C
2014-01-01
Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values (rth ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.
Statistical Models and Methods for Network Meta-Analysis.
Madden, L V; Piepho, H-P; Paul, P A
2016-08-01
Meta-analysis, the methodology for analyzing the results from multiple independent studies, has grown tremendously in popularity over the last four decades. Although most meta-analyses involve a single effect size (summary result, such as a treatment difference) from each study, there are often multiple treatments of interest across the network of studies in the analysis. Multi-treatment (or network) meta-analysis can be used for simultaneously analyzing the results from all the treatments. However, the methodology is considerably more complicated than for the analysis of a single effect size, and there have not been adequate explanations of the approach for agricultural investigations. We review the methods and models for conducting a network meta-analysis based on frequentist statistical principles, and demonstrate the procedures using a published multi-treatment plant pathology data set. A major advantage of network meta-analysis is that correlations of estimated treatment effects are automatically taken into account when an appropriate model is used. Moreover, treatment comparisons may be possible in a network meta-analysis that are not possible in a single study because all treatments of interest may not be included in any given study. We review several models that consider the study effect as either fixed or random, and show how to interpret model-fitting output. We further show how to model the effect of moderator variables (study-level characteristics) on treatment effects, and present one approach to test for the consistency of treatment effects across the network. Online supplemental files give explanations on fitting the network meta-analytical models using SAS.
中国古代的统计分析%Statistical analysis in ancient China
Institute of Scientific and Technical Information of China (English)
莫曰达
2003-01-01
Analyzing social and economic problems through statistics is one of an important aspects of statistics thoughts in ancient China. This paper demonstrates some situations of statistical analysis in ancient China.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2006-01-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Short-run and Current Analysis Model in Statistics
Directory of Open Access Journals (Sweden)
Constantin Mitrut
2006-03-01
Full Text Available Using the short-run statistic indicators is a compulsory requirement implied in the current analysis. Therefore, there is a system of EUROSTAT indicators on short run which has been set up in this respect, being recommended for utilization by the member-countries. On the basis of these indicators, there are regular, usually monthly, analysis being achieved in respect of: the production dynamic determination; the evaluation of the short-run investment volume; the development of the turnover; the wage evolution: the employment; the price indexes and the consumer price index (inflation; the volume of exports and imports and the extent to which the imports are covered by the exports and the sold of trade balance. The EUROSTAT system of indicators of conjuncture is conceived as an open system, so that it can be, at any moment extended or restricted, allowing indicators to be amended or even removed, depending on the domestic users requirements as well as on the specific requirements of the harmonization and integration. For the short-run analysis, there is also the World Bank system of indicators of conjuncture, which is utilized, relying on the data sources offered by the World Bank, The World Institute for Resources or other international organizations statistics. The system comprises indicators of the social and economic development and focuses on the indicators for the following three fields: human resources, environment and economic performances. At the end of the paper, there is a case study on the situation of Romania, for which we used all these indicators.
Directory of Open Access Journals (Sweden)
M. V. Ninu Krishnan
2015-01-01
Full Text Available In the present study, the Information Value (InfoVal and the Multiple Logistic Regression (MLR methods based on bivariate and multivariate statistical analysis have been applied for shallow landslide initiation susceptibility assessment in a selected subwatershed in the Western Ghats, Kerala, India, to determine the suitability of geographical information systems (GIS assisted statistical landslide susceptibility assessment methods in the data constrained regions. The different landslide conditioning terrain variables considered in the analysis are geomorphology, land use/land cover, soil thickness, slope, aspect, relative relief, plan curvature, profile curvature, drainage density, the distance from drainages, lineament density and distance from lineaments. Landslide Susceptibility Index (LSI maps were produced by integrating the weighted themes and divided into five landslide susceptibility zones (LSZ by correlating the LSI with general terrain conditions. The predictive performances of the models were evaluated through success and prediction rate curves. The area under success rate curves (AUC for InfoVal and MLR generated susceptibility maps shows 84.11% and 68.65%, respectively. The prediction rate curves show good to moderate correlation between the distribution of the validation group of landslides and LSZ maps with AUC values of 0.648 and 0.826 respectively for MLR and InfoVal produced LSZ maps. Considering the best fit and suitability of the models in the study area by quantitative prediction accuracy, LSZ map produced by the InfoVal technique shows higher accuracy, i.e. 82.60%, than the MLR model and is more realistic while compared in the field and is considered as the best suited model for the assessment of landslide susceptibility in areas similar to the study area. The LSZ map produced for the area can be utilised for regional planning and assessment process, by incorporating the generalised rainfall conditions in the area. DOI
Comparative Analysis of Hand Gesture Recognition Techniques
Directory of Open Access Journals (Sweden)
Arpana K. Patel
2015-03-01
Full Text Available During past few years, human hand gesture for interaction with computing devices has continues to be active area of research. In this paper survey of hand gesture recognition is provided. Hand Gesture Recognition is contained three stages: Pre-processing, Feature Extraction or matching and Classification or recognition. Each stage contains different methods and techniques. In this paper define small description of different methods used for hand gesture recognition in existing system with comparative analysis of all method with its benefits and drawbacks are provided.
COSIMA data analysis using multivariate techniques
Directory of Open Access Journals (Sweden)
J. Silén
2014-08-01
Full Text Available We describe how to use multivariate analysis of complex TOF-SIMS spectra introducing the method of random projections. The technique allows us to do full clustering and classification of the measured mass spectra. In this paper we use the tool for classification purposes. The presentation describes calibration experiments of 19 minerals on Ag and Au substrates using positive mode ion spectra. The discrimination between individual minerals gives a crossvalidation Cohen κ for classification of typically about 80%. We intend to use the method as a fast tool to deduce a qualitative similarity of measurements.
The system for statistical analysis of logistic information
Directory of Open Access Journals (Sweden)
Khayrullin Rustam Zinnatullovich
2015-05-01
Full Text Available The current problem for managers in logistic and trading companies is the task of improving the operational business performance and developing the logistics support of sales. The development of logistics sales supposes development and implementation of a set of works for the development of the existing warehouse facilities, including both a detailed description of the work performed, and the timing of their implementation. Logistics engineering of warehouse complex includes such tasks as: determining the number and the types of technological zones, calculation of the required number of loading-unloading places, development of storage structures, development and pre-sales preparation zones, development of specifications of storage types, selection of loading-unloading equipment, detailed planning of warehouse logistics system, creation of architectural-planning decisions, selection of information-processing equipment, etc. The currently used ERP and WMS systems did not allow us to solve the full list of logistics engineering problems. In this regard, the development of specialized software products, taking into account the specifics of warehouse logistics, and subsequent integration of these software with ERP and WMS systems seems to be a current task. In this paper we suggest a system of statistical analysis of logistics information, designed to meet the challenges of logistics engineering and planning. The system is based on the methods of statistical data processing.The proposed specialized software is designed to improve the efficiency of the operating business and the development of logistics support of sales. The system is based on the methods of statistical data processing, the methods of assessment and prediction of logistics performance, the methods for the determination and calculation of the data required for registration, storage and processing of metal products, as well as the methods for planning the reconstruction and development
Study of the Severity of Accidents in Tehran Using Statistical Modeling and Data Mining Techniques
Directory of Open Access Journals (Sweden)
Hesamaldin Razi
2013-01-01
Full Text Available AbstractBackgrounds and Aims: The Tehran province was subject to the second highest incidence of fatalities due to traffic accidents in 1390. Most studies in this field examine rural traffic accidents, but this study is based on the use of logit models and artificial neural networks to evaluate the factors that affect the severity of accidents within the city of Tehran.Materials and Methods: Among the various types of crashes, head-on collisions are specified as the most serious type, which is investigated in this study with the use of Tehran’s accident data. In the modeling process, the severity of the accident is the dependent variable and defined as a binary covariate, which are non-injury accidents and injury accidents. The independent variables are parameters such as the characteristics of the driver, time of the accident, traffic and environmental characteristics. In addition to the prediction accuracy comparison of the two models, the elasticity of the logit model is compared with a sensitivity analysis of the neural network.Results: The results show that the proposed model provides a good estimate of an accident's severity. The explanatory variables that have been determined to be significant in the final models are the driver’s gender, age and education, along with negligence of the traffic rules, inappropriate acceleration, deviation to the left, type of vehicle, pavement conditions, time of the crash and street width.Conclusion: An artificial neural network model can be useful as a statistical model in the analysis of factors that affect the severity of accidents. According to the results, human errors and illiteracy of drivers increase the severity of crashes, and therefore, educating drivers is the main strategy that will reduce accident severity in Iran. Special attention should be given to a driver’s age group, with particular care taken when they are very young.
Statistical techniques for diagnosing CIN using fluorescence spectroscopy: SVD and CART.
Atkinson, E N; Mitchell, M F; Ramanujam, N; Richards-Kortum, R
1995-01-01
A quantitative measure of intraepithelial neoplasia which can be made in vivo without tissue removal would be clinically significant in chemoprevention studies. Our group is working to develop such a technique based on fluorescence spectroscopy. Using empirically based algorithms, we have demonstrated that fluorescence is discriminating normal cervix from low- and high-grade cervical dysplasias with similar performance to colposcopy in expert hands. These measurements can be made in vivo, in near real time, and results can be obtained without biopsy. This paper describes a new method using automated analysis of fluorescence emission spectra to classify cervical tissue into multiple diagnostic categories. First, data is reduced using the singular value decomposition (SVD), yielding a set of orthogonal basis vectors. Each patient's emission spectrum is then fit by linear least squares regression to the basis vectors, producing a set of coefficients for each patient. Based on these coefficient values, the classification and regression tree (CART) method predicts the patient's classification. These results suggest that laser-induced fluorescence can be used to automatically recognize and differentially diagnose cervical intraepithelial neoplasia (CIN) at colposcopy. This method of analysis is general in nature, and can analyze fluorescence spectra of suspected intraepithelial neoplasms from other organ sites. As a more complete understanding of the biochemical and morphologic basis of tissue spectroscopy is developed, it may also be possible to use fluorescence spectroscopy of the cervix as a surrogate endpoint biomarker in Phase I and II chemoprevention trials.
Directory of Open Access Journals (Sweden)
Gledsneli Maria Lima Lins
2010-12-01
Full Text Available Water has a decisive influence on populations’ life quality – specifically in areas like urban supply, drainage, and effluents treatment – due to its sound impact over public health. Water rational use constitutes the greatest challenge faced by water demand management, mainly with regard to urban household water consumption. This makes it important to develop researches to assist water managers and public policy-makers in planning and formulating water demand measures which may allow urban water rational use to be met. This work utilized the multivariate techniques Factor Analysis and Multiple Linear Regression Analysis – in order to determine the participation level of socioeconomic and climatic variables in monthly urban household consumption changes – applying them to two districts of Campina Grande city (State of Paraíba, Brazil. The districts were chosen based on socioeconomic criterion (income level so as to evaluate their water consumer’s behavior. A 9-year monthly data series (from year 2000 up to 2008 was utilized, comprising family income, water tariff, and quantity of household connections (economies – as socioeconomic variables – and average temperature and precipitation, as climatic variables. For both the selected districts of Campina Grande city, the obtained results point out the variables “water tariff” and “family income” as indicators of these district’s household consumption.
Statistical analysis of the breaking processes of Ni nanowires
Energy Technology Data Exchange (ETDEWEB)
Garcia-Mochales, P [Departamento de Fisica de la Materia Condensada, Facultad de Ciencias, Universidad Autonoma de Madrid, c/ Francisco Tomas y Valiente 7, Campus de Cantoblanco, E-28049-Madrid (Spain); Paredes, R [Centro de Fisica, Instituto Venezolano de Investigaciones CientIficas, Apartado 20632, Caracas 1020A (Venezuela); Pelaez, S; Serena, P A [Instituto de Ciencia de Materiales de Madrid, Consejo Superior de Investigaciones CientIficas, c/ Sor Juana Ines de la Cruz 3, Campus de Cantoblanco, E-28049-Madrid (Spain)], E-mail: pedro.garciamochales@uam.es
2008-06-04
We have performed a massive statistical analysis on the breaking behaviour of Ni nanowires using molecular dynamic simulations. Three stretching directions, five initial nanowire sizes and two temperatures have been studied. We have constructed minimum cross-section histograms and analysed for the first time the role played by monomers and dimers. The shape of such histograms and the absolute number of monomers and dimers strongly depend on the stretching direction and the initial size of the nanowire. In particular, the statistical behaviour of the breakage final stages of narrow nanowires strongly differs from the behaviour obtained for large nanowires. We have analysed the structure around monomers and dimers. Their most probable local configurations differ from those usually appearing in static electron transport calculations. Their non-local environments show disordered regions along the nanowire if the stretching direction is [100] or [110]. Additionally, we have found that, at room temperature, [100] and [110] stretching directions favour the appearance of non-crystalline staggered pentagonal structures. These pentagonal Ni nanowires are reported in this work for the first time. This set of results suggests that experimental Ni conducting histograms could show a strong dependence on the orientation and temperature.
Blog Content and User Engagement - An Insight Using Statistical Analysis.
Directory of Open Access Journals (Sweden)
Apoorva Vikrant Kulkarni
2013-06-01
Full Text Available Since the past few years organizations have increasingly realized the value of social media in positioning, propagating and marketing the product/service and organization itself. Today every organization be it small or big has realized the essence of creating a space in the World Wide Web. Social Media through its multifaceted platforms has enabled the organizations to propagate their brands. There are a number of social media networks which are helpful in spreading the message to customers. Many organizations are having full time web analytics teams that are regularly trying to ensure that prospectivecustomers are visiting their organization through various forms of social media. Web analytics is foreseen as a tool for Business Intelligence by organizations and there are a large number of analytics tools available for monitoring the visibility of a particular brand on the web. For example, Google has its ownanalytic tool that is very widely used. There are number of free as well as paid analytical tools available on the internet. The objective of this paper is to study what content in a blog present in the social media creates a greater impact on user engagement. The study statistically analyzes the relation between content of the blog and user engagement. The statistical analysis was carried out on a blog of a reputed management institute in Pune to arrive at conclusions.
Data analysis techniques for gravitational wave observations
Indian Academy of Sciences (India)
S V Dhurandhar
2004-10-01
Astrophysical sources of gravitational waves fall broadly into three categories: (i) transient and bursts, (ii) periodic or continuous wave and (iii) stochastic. Each type of source requires a different type of data analysis strategy. In this talk various data analysis strategies will be reviewed. Optimal filtering is used for extracting binary inspirals; Fourier transforms over Doppler shifted time intervals are computed for long duration periodic sources; optimally weighted cross-correlations for stochastic background. Some recent schemes which efficiently search for inspirals will be described. The performance of some of these techniques on real data obtained will be discussed. Finally, some results on cancellation of systematic noises in laser interferometric space antenna (LISA) will be presented and future directions indicated.
Application of Electromigration Techniques in Environmental Analysis
Bald, Edward; Kubalczyk, Paweł; Studzińska, Sylwia; Dziubakiewicz, Ewelina; Buszewski, Bogusław
Inherently trace-level concentration of pollutants in the environment, together with the complexity of sample matrices, place a strong demand on the detection capabilities of electromigration methods. Significant progress is continually being made, widening the applicability of these techniques, mostly capillary zone electrophoresis, micellar electrokinetic chromatography, and capillary electrochromatography, to the analysis of real-world environmental samples, including the concentration sensitivity and robustness of the developed analytical procedures. This chapter covers the recent major developments in the domain of capillary electrophoresis analysis of environmental samples for pesticides, polycyclic aromatic hydrocarbons, phenols, amines, carboxylic acids, explosives, pharmaceuticals, and ionic liquids. Emphasis is made on pre-capillary and on-capillary chromatography and electrophoresis-based concentration of analytes and detection improvement.
Piersanti, Mirko; Materassi, Massimo; Spogli, Luca; Cicone, Antonio; Alberti, Tommaso
2016-04-01
Highly irregular fluctuations of the power of trans-ionospheric GNSS signals, namely radio power scintillation, are, at least to a large extent, the effect of ionospheric plasma turbulence, a by-product of the non-linear and non-stationary evolution of the plasma fields defining the Earth's upper atmosphere. One could expect the ionospheric turbulence characteristics of inter-scale coupling, local randomness and high time variability to be inherited by the scintillation on radio signals crossing the medium. On this basis, the remote sensing of local features of the turbulent plasma could be expected as feasible by studying radio scintillation. The dependence of the statistical properties of the medium fluctuations on the space- and time-scale is the distinctive character of intermittent turbulent media. In this paper, a multi-scale statistical analysis of some samples of GPS radio scintillation is presented: the idea is that assessing how the statistics of signal fluctuations vary with time scale under different Helio-Geophysical conditions will be of help in understanding the corresponding multi-scale statistics of the turbulent medium causing that scintillation. In particular, two techniques are tested as multi-scale decomposition schemes of the signals: the discrete wavelet analysis and the Empirical Mode Decomposition. The discussion of the results of the one analysis versus the other will be presented, trying to highlight benefits and limits of each scheme, also under suitably different helio-geophysical conditions.
Statistical models of video structure for content analysis and characterization.
Vasconcelos, N; Lippman, A
2000-01-01
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
Frequency of PSV inspection optmization using statistical data analysis
Directory of Open Access Journals (Sweden)
Alexandre Guimarães Botelho
2015-12-01
Full Text Available The present paper shows how qualitative analytical methodologies can be enhanced by statistical failure data analysis of process equipment in order to select an appropriate and cost-effective maintenance policy to reduce equipment life cycle cost. As such, a case study was carried out with failure and maintenance data from a sample of pressure safety valves (PSV of a PETROBRAS’s oil and gas production unit. Data was classified according to a failure mode and effect analysis— FMEA, and adjusted using a weibull distribution. The results show the possibility of reduction of maintenance frequency representing 29% of event reduction, without increasing risk, as well as evidencing the potential failures which must be blocked by the inspection plan.
Higher order statistical moment application for solar PV potential analysis
Basri, Mohd Juhari Mat; Abdullah, Samizee; Azrulhisham, Engku Ahmad; Harun, Khairulezuan
2016-10-01
Solar photovoltaic energy could be as alternative energy to fossil fuel, which is depleting and posing a global warming problem. However, this renewable energy is so variable and intermittent to be relied on. Therefore the knowledge of energy potential is very important for any site to build this solar photovoltaic power generation system. Here, the application of higher order statistical moment model is being analyzed using data collected from 5MW grid-connected photovoltaic system. Due to the dynamic changes of skewness and kurtosis of AC power and solar irradiance distributions of the solar farm, Pearson system where the probability distribution is calculated by matching their theoretical moments with that of the empirical moments of a distribution could be suitable for this purpose. On the advantage of the Pearson system in MATLAB, a software programming has been developed to help in data processing for distribution fitting and potential analysis for future projection of amount of AC power and solar irradiance availability.
Statistical uncertainty analysis of radon transport in nonisothermal, unsaturated soils
Energy Technology Data Exchange (ETDEWEB)
Holford, D.J.; Owczarski, P.C.; Gee, G.W.; Freeman, H.D.
1990-10-01
To accurately predict radon fluxes soils to the atmosphere, we must know more than the radium content of the soil. Radon flux from soil is affected not only by soil properties, but also by meteorological factors such as air pressure and temperature changes at the soil surface, as well as the infiltration of rainwater. Natural variations in meteorological factors and soil properties contribute to uncertainty in subsurface model predictions of radon flux, which, when coupled with a building transport model, will also add uncertainty to predictions of radon concentrations in homes. A statistical uncertainty analysis using our Rn3D finite-element numerical model was conducted to assess the relative importance of these meteorological factors and the soil properties affecting radon transport. 10 refs., 10 figs., 3 tabs.
A Statistical Analysis of Cointegration for I(2) Variables
DEFF Research Database (Denmark)
Johansen, Søren
1995-01-01
This paper discusses inference for I(2) variables in a VAR model. The estimation procedure suggested consists of two reduced rank regressions. The asymptotic distribution of the proposed estimators of the cointegrating coefficients is mixed Gaussian, which implies that asymptotic inference can...... be conducted using the ¿ sup2/sup distribution. It is shown to what extent inference on the cointegration ranks can be conducted using the tables already prepared for the analysis of cointegration of I(1) variables. New tables are needed for the test statistics to control the size of the tests. This paper...... contains a multivariate test for the existence of I(2) variables. This test is illustrated using a data set consisting of U.K. and foreign prices and interest rates as well as the exchange rate....
A numerical comparison of sensitivity analysis techniques
Energy Technology Data Exchange (ETDEWEB)
Hamby, D.M.
1993-12-31
Engineering and scientific phenomena are often studied with the aid of mathematical models designed to simulate complex physical processes. In the nuclear industry, modeling the movement and consequence of radioactive pollutants is extremely important for environmental protection and facility control. One of the steps in model development is the determination of the parameters most influential on model results. A {open_quotes}sensitivity analysis{close_quotes} of these parameters is not only critical to model validation but also serves to guide future research. A previous manuscript (Hamby) detailed many of the available methods for conducting sensitivity analyses. The current paper is a comparative assessment of several methods for estimating relative parameter sensitivity. Method practicality is based on calculational ease and usefulness of the results. It is the intent of this report to demonstrate calculational rigor and to compare parameter sensitivity rankings resulting from various sensitivity analysis techniques. An atmospheric tritium dosimetry model (Hamby) is used here as an example, but the techniques described can be applied to many different modeling problems. Other investigators (Rose; Dalrymple and Broyd) present comparisons of sensitivity analyses methodologies, but none as comprehensive as the current work.
ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization
Naumann, Axel; Ballintijn, Maarten; Bellenot, Bertrand; Biskup, Marek; Brun, Rene; Buncic, Nenad; Canal, Philippe; Casadei, Diego; Couet, Olivier; Fine, Valery; Franco, Leandro; Ganis, Gerardo; Gheata, Andrei; Gonzalez~Maline, David; Goto, Masaharu; Iwaszkiewicz, Jan; Kreshuk, Anna; Marcos Segura, Diego; Maunder, Richard; Moneta, Lorenzo; Offermann, Eddy; Onuchin, Valeriy; Panacek, Suzanne; Rademakers, Fons; Russo, Paul; Tadel, Matevz
2009-01-01
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can chose out of a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, the RooFit package allows the user to perform complex data modeling and fitting while the RooStats library provides abstractions and implementations for advance...