WorldWideScience

Sample records for multivariable statistical techniques

  1. Application of multivariate statistical techniques in microbial ecology.

    Science.gov (United States)

    Paliy, O; Shankar, V

    2016-03-01

    Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. The effect has been especially noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure. © 2016 John Wiley & Sons Ltd.

  2. Application of Multivariable Statistical Techniques in Plant-wide WWTP Control Strategies Analysis

    DEFF Research Database (Denmark)

    Flores Alsina, Xavier; Comas, J.; Rodríguez-Roda, I.

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques allow i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation...

  3. A primer of multivariate statistics

    CERN Document Server

    Harris, Richard J

    2014-01-01

    Drawing upon more than 30 years of experience in working with statistics, Dr. Richard J. Harris has updated A Primer of Multivariate Statistics to provide a model of balance between how-to and why. This classic text covers multivariate techniques with a taste of latent variable approaches. Throughout the book there is a focus on the importance of describing and testing one's interpretations of the emergent variables that are produced by multivariate analysis. This edition retains its conversational writing style while focusing on classical techniques. The book gives the reader a feel for why

  4. Methods for statistical data analysis of multivariate observations

    CERN Document Server

    Gnanadesikan, R

    1997-01-01

    A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte

  5. Multivariate Statistical Process Control Charts: An Overview

    OpenAIRE

    Bersimis, Sotiris; Psarakis, Stelios; Panaretos, John

    2006-01-01

    In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions for all kinds of univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review unique procedures for the construction of multivariate control charts, based on multivariate statistical techniques such as p...

  6. Multivariate methods and forecasting with IBM SPSS statistics

    CERN Document Server

    Aljandali, Abdulkader

    2017-01-01

    This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...

  7. Hierarchical probabilistic regionalization of volcanism for Sengan region in Japan using multivariate statistical techniques and geostatistical interpolation techniques

    International Nuclear Information System (INIS)

    Park, Jinyong; Balasingham, P.; McKenna, Sean Andrew; Kulatilake, Pinnaduwa H. S. W.

    2004-01-01

    Sandia National Laboratories, under contract to Nuclear Waste Management Organization of Japan (NUMO), is performing research on regional classification of given sites in Japan with respect to potential volcanic disruption using multivariate statistics and geo-statistical interpolation techniques. This report provides results obtained for hierarchical probabilistic regionalization of volcanism for the Sengan region in Japan by applying multivariate statistical techniques and geostatistical interpolation techniques on the geologic data provided by NUMO. A workshop report produced in September 2003 by Sandia National Laboratories (Arnold et al., 2003) on volcanism lists a set of most important geologic variables as well as some secondary information related to volcanism. Geologic data extracted for the Sengan region in Japan from the data provided by NUMO revealed that data are not available at the same locations for all the important geologic variables. In other words, the geologic variable vectors were found to be incomplete spatially. However, it is necessary to have complete geologic variable vectors to perform multivariate statistical analyses. As a first step towards constructing complete geologic variable vectors, the Universal Transverse Mercator (UTM) zone 54 projected coordinate system and a 1 km square regular grid system were selected. The data available for each geologic variable on a geographic coordinate system were transferred to the aforementioned grid system. Also the recorded data on volcanic activity for Sengan region were produced on the same grid system. Each geologic variable map was compared with the recorded volcanic activity map to determine the geologic variables that are most important for volcanism. In the regionalized classification procedure, this step is known as the variable selection step. 
The following variables were determined as most important for volcanism: geothermal gradient, groundwater temperature, heat discharge, groundwater

  8. Assessment of Surface Water Quality Using Multivariate Statistical Techniques in the Terengganu River Basin

    International Nuclear Information System (INIS)

    Aminu Ibrahim; Hafizan Juahir; Mohd Ekhwan Toriman; Mustapha, A.; Azman Azid; Isiyaka, H.A.

    2015-01-01

    Multivariate statistical techniques including cluster analysis, discriminant analysis, and principal component analysis/factor analysis were applied to investigate the spatial variation and pollution sources in the Terengganu river basin during 5 years of monitoring 13 water quality parameters at 13 different stations. Cluster analysis (CA) classified the 13 stations into 2 clusters, low polluted (LP) and moderate polluted (MP), based on similar water quality characteristics. Discriminant analysis (DA) rendered significant data reduction with 4 parameters (pH, NH3-NL, PO4 and EC) and correct assignation of 95.80 %. The PCA/FA applied to the data sets yielded five latent factors accounting for 72.42 % of the total variance in the water quality data. The obtained varifactors indicate that the parameters responsible for water quality variations are mainly related to domestic waste, industrial, runoff and agricultural sources (anthropogenic activities). Therefore, multivariate techniques are important in environmental management. (author)

  9. Application of multivariate statistical techniques for differentiation of ripe banana flour based on the composition of elements.

    Science.gov (United States)

    Alkarkhi, Abbas F M; Ramli, Saifullah Bin; Easa, Azhar Mat

    2009-01-01

    Major (sodium, potassium, calcium, magnesium) and minor elements (iron, copper, zinc, manganese) and one heavy metal (lead) of Cavendish banana flour and Dream banana flour were determined, and data were analyzed using multivariate statistical techniques of factor analysis and discriminant analysis. Factor analysis yielded four factors explaining more than 81% of the total variance: the first factor explained 28.73%, comprising magnesium, sodium, and iron; the second factor explained 21.47%, comprising only manganese and copper; the third factor explained 15.66%, comprising zinc and lead; while the fourth factor explained 15.50%, comprising potassium. Discriminant analysis showed that magnesium and sodium exhibited a strong contribution in discriminating the two types of banana flour, affording 100% correct assignation. This study presents the usefulness of multivariate statistical techniques for analysis and interpretation of complex mineral content data from banana flour of different varieties.

  10. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]

    Science.gov (United States)

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  11. Water quality, Multivariate statistical techniques, submarine outfall, spatial variation, temporal variation

    International Nuclear Information System (INIS)

    Garcia, Francisco; Palacio, Carlos; Garcia, Uriel

    2012-01-01

    Multivariate statistical techniques were used to investigate the temporal and spatial variations of water quality at the Santa Marta coastal area, where a submarine outfall that discharges 1 m³/s of domestic wastewater is located. Two-way analysis of variance (ANOVA), cluster and principal component analysis and Kriging interpolation were considered for this report. Temporal variation showed two heterogeneous periods: from December to April, and in July, the concentrations of the water quality parameters were higher; during the rest of the year (May, June, August-November) they were significantly lower. The spatial variation revealed two areas where the water quality differs; this difference is related to the proximity to the submarine outfall discharge.

  12. Multivariate mixed linear model analysis of longitudinal data: an information-rich statistical technique for analyzing disease resistance data

    Science.gov (United States)

    The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...

  13. Synthetic environmental indicators: A conceptual approach from the multivariate statistics

    International Nuclear Information System (INIS)

    Escobar J, Luis A

    2008-01-01

    This paper presents a general description of multivariate statistical analysis and shows two methodologies: analysis of principal components and analysis of distance, DP2. Both methods use techniques of multivariate analysis to define the true dimension of data, which is useful to estimate indicators of environmental quality.

  14. Multivariate statistical methods a first course

    CERN Document Server

    Marcoulides, George A

    2014-01-01

    Multivariate statistics refer to an assortment of statistical methods that have been developed to handle situations in which multiple variables or measures are involved. Any analysis of more than two variables or measures can loosely be considered a multivariate statistical analysis. An introductory text for students learning multivariate statistical methods for the first time, this book keeps mathematical details to a minimum while conveying the basic principles. One of the principal strategies used throughout the book, in addition to the presentation of actual data analyses, is poin

  15. Advances in statistical monitoring of complex multivariate processes with applications in industrial process control

    CERN Document Server

    Kruger, Uwe

    2012-01-01

    The development and application of multivariate statistical techniques in process monitoring has gained substantial interest over the past two decades in academia and industry alike. Initially developed for monitoring and fault diagnosis in complex systems, such techniques have been refined and applied in various engineering areas, for example mechanical and manufacturing, chemical, electrical and electronic, and power engineering. The recipe for the tremendous interest in multivariate statistical techniques lies in their simplicity and adaptability for developing monitoring applica

  16. Applied multivariate statistics with R

    CERN Document Server

    Zelterman, Daniel

    2015-01-01

    This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open-source program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...

  17. MIDAS: Regionally linear multivariate discriminative statistical mapping.

    Science.gov (United States)

    Varol, Erdem; Sotiras, Aristeidis; Davatzikos, Christos

    2018-07-01

    Statistical parametric maps formed via voxel-wise mass-univariate tests, such as the general linear model, are commonly used to test hypotheses about regionally specific effects in neuroimaging cross-sectional studies where each subject is represented by a single image. Despite being informative, these techniques remain limited as they ignore multivariate relationships in the data. Most importantly, the commonly employed local Gaussian smoothing, which is important for accounting for registration errors and making the data follow Gaussian distributions, is usually chosen in an ad hoc fashion. Thus, it is often suboptimal for the task of detecting group differences and correlations with non-imaging variables. Information mapping techniques, such as searchlight, which use pattern classifiers to exploit multivariate information and obtain more powerful statistical maps, have become increasingly popular in recent years. However, existing methods may lead to important interpretation errors in practice (i.e., misidentifying a cluster as informative, or failing to detect truly informative voxels), while often being computationally expensive. To address these issues, we introduce a novel efficient multivariate statistical framework for cross-sectional studies, termed MIDAS, seeking highly sensitive and specific voxel-wise brain maps, while leveraging the power of regional discriminant analysis. In MIDAS, locally linear discriminative learning is applied to estimate the pattern that best discriminates between two groups, or predicts a variable of interest. This pattern is equivalent to local filtering by an optimal kernel whose coefficients are the weights of the linear discriminant. By composing information from all neighborhoods that contain a given voxel, MIDAS produces a statistic that collectively reflects the contribution of the voxel to the regional classifiers as well as the discriminative power of the classifiers. Critically, MIDAS efficiently assesses the

  18. Application of multivariate techniques to analytical data on Aegean ceramics

    International Nuclear Information System (INIS)

    Bieber, A.M.; Brooks, D.W.; Harbottle, G.; Sayre, E.V.

    1976-01-01

    The general problems of data collection and handling for multivariate elemental analyses of ancient pottery are considered, including such specific questions as the level of analytical precision required, the number and type of elements to be determined and the need for comprehensive multivariate statistical analysis of the collected data in contrast to element-by-element statistical analysis. The multivariate statistical procedures of clustering in a multidimensional space and determination of the numerical probabilities of specimens belonging to a group through calculation of the Mahalanobis distances for these specimens in multicomponent space are described together with supporting univariate statistical procedures used at Brookhaven. The application of these techniques to the data on Late Bronze Age Aegean pottery (largely previously analysed at Oxford and Brookhaven with some new specimens considered) has resulted in meaningful subdivisions of previously established groups. (author)
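
    The group-membership idea in this abstract can be sketched in a few lines. The following is only an illustration, not the Brookhaven procedure: it assumes just two measured elements per specimen, uses made-up concentration values, and converts the squared Mahalanobis distance to a tail probability with a chi-square approximation (2 degrees of freedom), which ignores the uncertainty in the estimated covariance. All function and variable names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (element A, element B) concentrations; hypothetical units


def mean_and_covariance(group: List[Point]) -> Tuple[Point, Tuple[float, float, float]]:
    """Return the group mean and the sample covariance terms (sxx, sxy, syy)."""
    n = len(group)
    mx = sum(p[0] for p in group) / n
    my = sum(p[1] for p in group) / n
    sxx = sum((p[0] - mx) ** 2 for p in group) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in group) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in group) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_squared(x: Point, mean: Point, cov: Tuple[float, float, float]) -> float:
    """Squared Mahalanobis distance of x from the group, using the 2x2 inverse in closed form."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def tail_probability_2d(d2: float) -> float:
    """Chi-square (2 d.o.f.) upper tail: an assumed approximation, exact only for known covariance."""
    return math.exp(-d2 / 2.0)


# Made-up reference group whose two elements are positively correlated.
reference_group = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
group_mean, group_cov = mean_and_covariance(reference_group)

d2_along = mahalanobis_squared((4.5, 4.5), group_mean, group_cov)   # follows the correlation
d2_across = mahalanobis_squared((4.5, 0.5), group_mean, group_cov)  # cuts against it
```

    Both test specimens lie at the same Euclidean distance from the group mean, yet their squared Mahalanobis distances are 3 and 12: the specimen that departs against the group's correlation pattern is treated as far less likely to belong. This is the reason for preferring a multivariate distance over element-by-element comparison.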

  19. Multivariate statistical evaluation of trace elements in groundwater in a coastal area in Shenzhen, China

    International Nuclear Information System (INIS)

    Chen Kouping; Jiao, Jiu J.; Huang Jianmin; Huang Runqiu

    2007-01-01

    Multivariate statistical techniques are efficient ways to display complex relationships among many objects. An attempt was made to study the data of trace elements in groundwater using multivariate statistical techniques such as principal component analysis (PCA), Q-mode factor analysis and cluster analysis. The original matrix consisted of 17 trace elements estimated from 55 groundwater samples collected in 27 wells located in a coastal area in Shenzhen, China. PCA results show that trace elements of V, Cr, As, Mo, W, and U with greatest positive loadings typically occur as soluble oxyanions in oxidizing waters, while Mn and Co with greatest negative loadings are generally more soluble within oxygen depleted groundwater. Cluster analyses demonstrate that most groundwater samples collected from the same well in the study area during summer and winter still fall into the same group. This study also demonstrates the usefulness of multivariate statistical analysis in hydrochemical studies. - Multivariate statistical analysis was used to investigate relationships among trace elements and factors controlling trace element distribution in groundwater

  20. Application of multivariate statistical techniques in the water quality assessment of Danube river, Serbia

    Directory of Open Access Journals (Sweden)

    Voza Danijela

    2015-12-01

    The aim of this article is to evaluate the quality of the Danube River in its course through Serbia as well as to demonstrate the possibilities for using three statistical methods: Principal Component Analysis (PCA), Factor Analysis (FA) and Cluster Analysis (CA) in surface water quality management. Given that the Danube is an important trans-boundary river, thorough water quality monitoring by sampling at different distances during shorter and longer periods of time is not only an ecological but also a political issue. Monitoring was carried out at monthly intervals from January to December 2011, at 17 sampling sites. The obtained data set was treated by multivariate techniques in order, firstly, to identify the similarities and differences between sampling periods and locations, secondly, to recognize variables that affect the temporal and spatial water quality changes and thirdly, to present the anthropogenic impact on water quality parameters.

  1. Evaluation of significantly modified water bodies in Vojvodina by using multivariate statistical techniques

    Directory of Open Access Journals (Sweden)

    Vujović Svetlana R.

    2013-01-01

    This paper illustrates the utility of multivariate statistical techniques for analysis and interpretation of water quality data sets and identification of pollution sources/factors, with a view to getting better information about water quality and to designing a monitoring network for effective management of water resources. Multivariate statistical techniques, such as factor analysis (FA)/principal component analysis (PCA) and cluster analysis (CA), were applied for the evaluation of variations and for the interpretation of a water quality data set of the natural water bodies obtained during 2010 by monitoring 13 parameters at 33 different sites. FA/PCA attempts to explain the correlations between the observations in terms of the underlying factors, which are not directly observable. Factor analysis is applied to physico-chemical parameters of natural water bodies with the aim of classification and data summation as well as segmentation of heterogeneous data sets into smaller homogeneous subsets. Factor loadings were categorized as strong and moderate, corresponding to absolute loading values of >0.75 and 0.75-0.50, respectively. Four principal factors were obtained with eigenvalues >1, summing more than 78 % of the total variance in the water data sets, which is adequate to give good prior information regarding data structure. Each factor that is significantly related to specific variables represents a different dimension of water quality. The first factor F1 accounts for 28 % of the total variance and represents the hydrochemical dimension of water quality. The second factor F2 accounts for 18 % of the total variance and may be taken as a factor of water eutrophication. The third factor F3 accounts for 17 % of the total variance and represents the influence of point sources of pollution on water quality. The fourth factor F4 accounts for 13 % of the total variance and may be taken as an ecological dimension of water quality. Cluster analysis (CA) is an
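
    The two decision rules quoted in this abstract (retain factors with eigenvalues above 1; label absolute loadings above 0.75 as strong and those from 0.50 to 0.75 as moderate) are simple enough to sketch directly. The snippet below is a hedged illustration only: the eigenvalues, loadings and parameter names are invented, the label for loadings below 0.50 is an assumption (the abstract does not name that band), and the functions are hypothetical helpers rather than anything taken from the paper.

```python
from typing import Dict, List


def classify_loading(loading: float) -> str:
    """Label a factor loading with the thresholds quoted in the abstract."""
    magnitude = abs(loading)  # the sign only gives direction, so compare absolute values
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    return "below moderate"  # assumed label; the abstract names only the two bands above


def retained_factors(eigenvalues: List[float]) -> List[int]:
    """Return the 1-based indices of factors whose eigenvalue exceeds 1."""
    return [index for index, value in enumerate(eigenvalues, start=1) if value > 1.0]


# Invented example values, not data from the study.
example_eigenvalues = [3.6, 2.3, 2.2, 1.7, 0.9, 0.6]
example_loadings: Dict[str, float] = {
    "conductivity": 0.91,
    "nitrate": -0.62,
    "pH": 0.31,
}

kept = retained_factors(example_eigenvalues)
labels = {name: classify_loading(value) for name, value in example_loadings.items()}
```

    With these invented inputs, four factors pass the eigenvalue rule, and the three loadings are labelled strong, moderate and below moderate respectively. Taking the absolute value first matters: a loading of -0.62 is as informative as +0.62 and would otherwise be misclassified.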

  2. Multivariate statistical analysis for x-ray photoelectron spectroscopy spectral imaging: Effect of image acquisition time

    International Nuclear Information System (INIS)

    Peebles, D.E.; Ohlhausen, J.A.; Kotula, P.G.; Hutton, S.; Blomfield, C.

    2004-01-01

    The acquisition of spectral images for x-ray photoelectron spectroscopy (XPS) is a relatively new approach, although it has been used with other analytical spectroscopy tools for some time. This technique provides full spectral information at every pixel of an image, in order to provide a complete chemical mapping of the imaged surface area. Multivariate statistical analysis techniques applied to the spectral image data allow the determination of chemical component species, and their distribution and concentrations, with minimal data acquisition and processing times. Some of these statistical techniques have proven to be very robust and efficient methods for deriving physically realistic chemical components without input by the user other than the spectral matrix itself. The benefits of multivariate analysis of the spectral image data include significantly improved signal to noise, improved image contrast and intensity uniformity, and improved spatial resolution - which are achieved due to the effective statistical aggregation of the large number of often noisy data points in the image. This work demonstrates the improvements in chemical component determination and contrast, signal-to-noise level, and spatial resolution that can be obtained by the application of multivariate statistical analysis to XPS spectral images

  3. Multivariate statistics high-dimensional and large-sample approximations

    CERN Document Server

    Fujikoshi, Yasunori; Shimizu, Ryoichi

    2010-01-01

    A comprehensive examination of high-dimensional analysis of multivariate methods and their real-world applications. Multivariate Statistics: High-Dimensional and Large-Sample Approximations is the first book of its kind to explore how classical multivariate methods can be revised and used in place of conventional statistical tools. Written by prominent researchers in the field, the book focuses on high-dimensional and large-scale approximations and details the many basic multivariate methods used to achieve high levels of accuracy. The authors begin with a fundamental presentation of the basic

  4. Multivariate statistics exercises and solutions

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    The authors present tools and concepts of multivariate data analysis by means of exercises and their solutions. The first part is devoted to graphical techniques. The second part deals with multivariate random variables and presents the derivation of estimators and tests for various practical situations. The last part introduces a wide variety of exercises in applied multivariate data analysis. The book demonstrates the application of simple calculus and basic multivariate methods in real life situations. It contains altogether more than 250 solved exercises which can assist a university teacher in setting up a modern multivariate analysis course. All computer-based exercises are available in the R language. All R codes and data sets may be downloaded via the quantlet download center www.quantlet.org or via the Springer webpage. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi.

  5. Applied multivariate statistical analysis

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...

  6. Multivariate techniques of analysis for ToF-E recoil spectrometry data

    Energy Technology Data Exchange (ETDEWEB)

    Whitlow, H.J.; Bouanani, M.E.; Persson, L.; Hult, M.; Jonsson, P.; Johnston, P.N. [Lund Institute of Technology, Solvegatan (Sweden), Department of Nuclear Physics]; Andersson, M. [Uppsala Univ. (Sweden). Dept. of Organic Chemistry]; Ostling, M.; Zaring, C. [Royal Institute of Technology, Electrum, Kista (Sweden), Department of Electronics]; Johnston, P.N.; Bubb, I.F.; Walker, B.R.; Stannard, W.B. [Royal Melbourne Inst. of Tech., VIC (Australia)]; Cohen, D.D.; Dytlewski, N. [Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW (Australia)]

    1996-12-31

    Multivariate statistical methods are being developed by the Australian-Swedish Recoil Spectrometry Collaboration for quantitative analysis of the wealth of information in Time of Flight (ToF) and energy dispersive Recoil Spectrometry. An overview is presented of progress made in the use of multivariate techniques for energy calibration, separation of mass-overlapped signals and simulation of ToF-E data. 6 refs., 5 figs.

  7. Multivariate techniques of analysis for ToF-E recoil spectrometry data

    Energy Technology Data Exchange (ETDEWEB)

    Whitlow, H J; Bouanani, M E; Persson, L; Hult, M; Jonsson, P; Johnston, P N [Lund Institute of Technology, Solvegatan (Sweden), Department of Nuclear Physics]; Andersson, M [Uppsala Univ. (Sweden). Dept. of Organic Chemistry]; Ostling, M; Zaring, C [Royal Institute of Technology, Electrum, Kista (Sweden), Department of Electronics]; Johnston, P N; Bubb, I F; Walker, B R; Stannard, W B [Royal Melbourne Inst. of Tech., VIC (Australia)]; Cohen, D D; Dytlewski, N [Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW (Australia)]

    1997-12-31

    Multivariate statistical methods are being developed by the Australian-Swedish Recoil Spectrometry Collaboration for quantitative analysis of the wealth of information in Time of Flight (ToF) and energy dispersive Recoil Spectrometry. An overview is presented of progress made in the use of multivariate techniques for energy calibration, separation of mass-overlapped signals and simulation of ToF-E data. 6 refs., 5 figs.

  8. Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques

    Science.gov (United States)

    Gulgundi, Mohammad Shahid; Shetty, Amba

    2018-03-01

    Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of Bengaluru city using multivariate statistical techniques. Water quality index rating was calculated for pre- and post-monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poorer quality for drinking purposes compared to pre-monsoon. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock-water interactions in the aquifer, effect of anthropogenic activities and ion exchange processes in water.

  9. Multivariate statistical analysis a high-dimensional approach

    CERN Document Server

    Serdobolskii, V

    2000-01-01

    In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution in dependence of data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen...

  10. "Statistical Techniques for Particle Physics" (2/4)

    CERN Multimedia

    CERN. Geneva

    2009-01-01

    This series will consist of four 1-hour lectures on statistics for particle physics. The goal will be to build up to techniques meant for dealing with problems of realistic complexity while maintaining a formal approach. I will also try to incorporate usage of common tools like ROOT, RooFit, and the newly developed RooStats framework into the lectures. The first lecture will begin with a review of the basic principles of probability, some terminology, and the three main approaches towards statistical inference (Frequentist, Bayesian, and Likelihood-based). I will then outline the statistical basis for multivariate analysis techniques (the Neyman-Pearson lemma) and the motivation for machine learning algorithms. Later, I will extend simple hypothesis testing to the case in which the statistical model has one or many parameters (the Neyman Construction and the Feldman-Cousins technique). From there I will outline techniques to incorporate background uncertainties. If time allows, I will touch on the statist...

  11. "Statistical Techniques for Particle Physics" (1/4)

    CERN Multimedia

    CERN. Geneva

    2009-01-01

    This series will consist of four 1-hour lectures on statistics for particle physics. The goal will be to build up to techniques meant for dealing with problems of realistic complexity while maintaining a formal approach. I will also try to incorporate usage of common tools like ROOT, RooFit, and the newly developed RooStats framework into the lectures. The first lecture will begin with a review of the basic principles of probability, some terminology, and the three main approaches towards statistical inference (Frequentist, Bayesian, and Likelihood-based). I will then outline the statistical basis for multivariate analysis techniques (the Neyman-Pearson lemma) and the motivation for machine learning algorithms. Later, I will extend simple hypothesis testing to the case in which the statistical model has one or many parameters (the Neyman Construction and the Feldman-Cousins technique). From there I will outline techniques to incorporate background uncertainties. If time allows, I will touch on the statist...

  12. "Statistical Techniques for Particle Physics" (4/4)

    CERN Multimedia

    CERN. Geneva

    2009-01-01

    This series will consist of four 1-hour lectures on statistics for particle physics. The goal will be to build up to techniques meant for dealing with problems of realistic complexity while maintaining a formal approach. I will also try to incorporate usage of common tools like ROOT, RooFit, and the newly developed RooStats framework into the lectures. The first lecture will begin with a review of the basic principles of probability, some terminology, and the three main approaches towards statistical inference (Frequentist, Bayesian, and Likelihood-based). I will then outline the statistical basis for multivariate analysis techniques (the Neyman-Pearson lemma) and the motivation for machine learning algorithms. Later, I will extend simple hypothesis testing to the case in which the statistical model has one or many parameters (the Neyman Construction and the Feldman-Cousins technique). From there I will outline techniques to incorporate background uncertainties. If time allows, I will touch on the statist...

  13. "Statistical Techniques for Particle Physics" (3/4)

    CERN Multimedia

    CERN. Geneva

    2009-01-01

    This series will consist of four 1-hour lectures on statistics for particle physics. The goal will be to build up to techniques meant for dealing with problems of realistic complexity while maintaining a formal approach. I will also try to incorporate usage of common tools like ROOT, RooFit, and the newly developed RooStats framework into the lectures. The first lecture will begin with a review of the basic principles of probability, some terminology, and the three main approaches towards statistical inference (Frequentist, Bayesian, and Likelihood-based). I will then outline the statistical basis for multivariate analysis techniques (the Neyman-Pearson lemma) and the motivation for machine learning algorithms. Later, I will extend simple hypothesis testing to the case in which the statistical model has one or many parameters (the Neyman Construction and the Feldman-Cousins technique). From there I will outline techniques to incorporate background uncertainties. If time allows, I will touch on the statist...

  14. Multivariate statistical analysis of precipitation chemistry in Northwestern Spain

    International Nuclear Information System (INIS)

    Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T.

    1993-01-01

    149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs

  15. Multivariate statistical analysis of precipitation chemistry in Northwestern Spain

    Energy Technology Data Exchange (ETDEWEB)

    Prada-Sanchez, J.M.; Garcia-Jurado, I.; Gonzalez-Manteiga, W.; Fiestras-Janeiro, M.G.; Espada-Rios, M.I.; Lucas-Dominguez, T. (University of Santiago, Santiago (Spain). Faculty of Mathematics, Dept. of Statistics and Operations Research)

    1993-07-01

    149 samples of rainwater were collected in the proximity of a power station in northwestern Spain at three rainwater monitoring stations. The resulting data are analyzed using multivariate statistical techniques. Firstly, the Principal Component Analysis shows that there are three main sources of pollution in the area (a marine source, a rural source and an acid source). The impact from pollution from these sources on the immediate environment of the stations is studied using Factorial Discriminant Analysis. 8 refs., 7 figs., 11 tabs.

  16. Multi-Site and Multi-Variables Statistical Downscaling Technique in the Monsoon Dominated Region of Pakistan

    Science.gov (United States)

    Khan, Firdos; Pilz, Jürgen

    2016-04-01

    South Asia is under the severe impacts of changing climate and global warming. The last two decades showed that climate change or global warming is happening, and the first decade of the 21st century is considered the warmest decade over Pakistan in history, with temperature reaching 53 °C in 2010. Consequently, the spatio-temporal distribution and intensity of precipitation are badly affected, causing floods, cyclones and hurricanes in the region, which further have impacts on agriculture, water, health etc. To cope with the situation, it is important to conduct impact assessment studies and take adaptation and mitigation remedies. For impact assessment studies, we need climate variables at higher resolution. Downscaling techniques are used to produce climate variables at higher resolution; these techniques are broadly divided into two types, statistical downscaling and dynamical downscaling. The target location of this study is the monsoon dominated region of Pakistan. One reason for choosing this area is that the contribution of monsoon rains in this area is more than 80% of the total rainfall. This study evaluates a statistical downscaling technique which can then be used for downscaling climatic variables. Two statistical techniques, i.e. quantile regression and copula modeling, are combined in order to produce realistic results for climate variables in the area under study. To reduce the dimension of input data and deal with multicollinearity problems, empirical orthogonal functions will be used. Advantages of this new method are: (1) it is more robust to outliers as compared to ordinary least squares estimates and other estimation methods based on central tendency and dispersion measures; (2) it preserves the dependence among variables and among sites and (3) it can be used to combine different types of distributions. This is important in our case because we are dealing with climatic variables having different distributions over different meteorological
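
    The robustness to outliers claimed for quantile regression in the record above can be illustrated with the check (pinball) loss that quantile regression minimizes. The following is a minimal sketch only: the sample values and the helper names (pinball_loss, total_loss, best_candidate) are hypothetical and not taken from the study, and the copula and empirical orthogonal function steps are not shown.

```python
# Minimal sketch of the check ("pinball") loss behind quantile regression.
# All data and function names are illustrative assumptions.

def pinball_loss(y, q, tau):
    """Loss of predicting quantile value q for observation y at level tau."""
    diff = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def total_loss(sample, q, tau):
    """Sum of pinball losses over a sample for a candidate value q."""
    return sum(pinball_loss(y, q, tau) for y in sample)


def best_candidate(sample, tau):
    """Observed value that minimizes the total pinball loss at level tau."""
    return min(sorted(sample), key=lambda q: total_loss(sample, q, tau))


# Hypothetical sample with one extreme value (an outlier).
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
sample_mean = sum(sample) / len(sample)

print("mean:", sample_mean)
print("tau=0.5 minimizer:", best_candidate(sample, 0.5))
print("loss at minimizer:", total_loss(sample, best_candidate(sample, 0.5), 0.5))
print("loss at mean:", total_loss(sample, sample_mean, 0.5))
```

    With tau = 0.5 the minimizer is the sample median, which the single extreme value barely moves, whereas the mean is pulled strongly towards it.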

  17. Performance evaluation of a hybrid-passive landfill leachate treatment system using multivariate statistical techniques

    Energy Technology Data Exchange (ETDEWEB)

    Wallace, Jack, E-mail: jack.wallace@ce.queensu.ca [Department of Civil Engineering, Queen’s University, Ellis Hall, 58 University Avenue, Kingston, Ontario K7L 3N6 (Canada)]; Champagne, Pascale, E-mail: champagne@civil.queensu.ca [Department of Civil Engineering, Queen’s University, Ellis Hall, 58 University Avenue, Kingston, Ontario K7L 3N6 (Canada)]; Monnier, Anne-Charlotte, E-mail: anne-charlotte.monnier@insa-lyon.fr [National Institute for Applied Sciences – Lyon, 20 Avenue Albert Einstein, 69621 Villeurbanne Cedex (France)]

    2015-01-15

    Highlights: • Performance of a hybrid passive landfill leachate treatment system was evaluated. • 33 Water chemistry parameters were sampled for 21 months and statistically analyzed. • Parameters were strongly linked and explained most (>40%) of the variation in data. • Alkalinity, ammonia, COD, heavy metals, and iron were criteria for performance. • Eight other parameters were key in modeling system dynamics and criteria. - Abstract: A pilot-scale hybrid-passive treatment system operated at the Merrick Landfill in North Bay, Ontario, Canada, treats municipal landfill leachate and provides for subsequent natural attenuation. Collected leachate is directed to a hybrid-passive treatment system, followed by controlled release to a natural attenuation zone before entering the nearby Little Sturgeon River. The study presents a comprehensive evaluation of the performance of the system using multivariate statistical techniques to determine the interactions between parameters, major pollutants in the leachate, and the biological and chemical processes occurring in the system. Five parameters (ammonia, alkalinity, chemical oxygen demand (COD), “heavy” metals of interest, with atomic weights above calcium, and iron) were set as criteria for the evaluation of system performance based on their toxicity to aquatic ecosystems and importance in treatment with respect to discharge regulations. System data for a full range of water quality parameters over a 21-month period were analyzed using principal components analysis (PCA), as well as principal components (PC) and partial least squares (PLS) regressions. PCA indicated a high degree of association for most parameters with the first PC, which explained a high percentage (>40%) of the variation in the data, suggesting strong statistical relationships among most of the parameters in the system. Regression analyses identified 8 parameters (set as independent variables) that were most frequently retained for modeling

  18. Multivariate Statistical Process Control

    DEFF Research Database (Denmark)

    Kulahci, Murat

    2013-01-01

    As sensor and computer technology continues to improve, it is becoming normal to encounter high-dimensional data sets. As in many areas of industrial statistics, this brings forth various challenges in statistical process control (SPC) and monitoring, for which the aim is to identify the “out-of-control” state of a process using control charts in order to reduce the excessive variation caused by so-called assignable causes. In practice, the most common method of monitoring multivariate data is through a statistic akin to Hotelling’s T². For high-dimensional data with an excessive amount of cross correlation, practitioners are often recommended to use latent structure methods such as Principal Component Analysis to summarize the data in only a few linear combinations of the original variables that capture most of the variation in the data. Applications of these control charts...
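
    The Hotelling-type statistic mentioned in the record above can be sketched for the simplest case of two monitored variables. This is a hedged illustration, not the author's procedure: the reference data are invented, the function names (reference_stats, hotelling_t2) are hypothetical, and the control limit (normally taken from a reference distribution) is left out.

```python
# Minimal sketch of a Hotelling-type T^2 statistic for two variables.
# Reference data and function names are illustrative assumptions.
from statistics import mean


def reference_stats(xs, ys):
    """Mean vector and sample covariance terms from in-control reference data."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def hotelling_t2(point, centre, cov):
    """T^2 = d' S^-1 d for a single new observation, with d = point - centre."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    # Explicit inverse of a 2x2 matrix: (1/det) * [[syy, -sxy], [-sxy, sxx]].
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical, positively correlated reference sample.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
centre, cov = reference_stats(xs, ys)

# Two new points at the same Euclidean distance from the centre.
along = hotelling_t2((5.0, 5.0), centre, cov)    # follows the correlation
against = hotelling_t2((5.0, 1.0), centre, cov)  # breaks the correlation
print(along, against)
```

    The point that breaks the correlation structure receives the larger T² even though both points are equally far from the centre, which is the behaviour that separate univariate charts would miss.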

  19. Point defect characterization in HAADF-STEM images using multivariate statistical analysis

    International Nuclear Information System (INIS)

    Sarahan, Michael C.; Chi, Miaofang; Masiel, Daniel J.; Browning, Nigel D.

    2011-01-01

    Quantitative analysis of point defects is demonstrated through the use of multivariate statistical analysis. This analysis consists of principal component analysis for dimensional estimation and reduction, followed by independent component analysis to obtain physically meaningful, statistically independent factor images. Results from these analyses are presented in the form of factor images and scores. Factor images show characteristic intensity variations corresponding to physical structure changes, while scores relate how much those variations are present in the original data. The application of this technique is demonstrated on a set of experimental images of dislocation cores along a low-angle tilt grain boundary in strontium titanate. A relationship between chemical composition and lattice strain is highlighted in the analysis results, with picometer-scale shifts in several columns measurable from compositional changes in a separate column. -- Research Highlights: → Multivariate analysis of HAADF-STEM images. → Distinct structural variations among SrTiO3 dislocation cores. → Picometer atomic column shifts correlated with atomic column population changes.

  20. MODEL APPLICATION MULTIVARIATE ANALYSIS OF STATISTICAL TECHNIQUES PCA AND HCA ASSESSMENT QUESTIONNAIRE ON CUSTOMER SATISFACTION: CASE STUDY IN A METALLURGICAL COMPANY OF METAL CONTAINERS

    Directory of Open Access Journals (Sweden)

    Cláudio Roberto Rosário

    2012-07-01

    The purpose of this research is to improve the practice of customer satisfaction analysis. The article presents a model for analyzing the answers of a customer satisfaction evaluation in a systematic way with the aid of multivariate statistical techniques, specifically exploratory analysis with PCA (Principal Component Analysis) and HCA (Hierarchical Cluster Analysis). The study evaluated the applicability of the model as a tool to help the company identify the value chain perceived by the customer when the customer satisfaction questionnaire is applied. Multivariate statistical analysis revealed similar behavior among customers. It also allowed the company to review the questions in the questionnaires by analyzing the degree of correlation between the questions, which was not the company’s practice before this research.

  1. Assessment of arsenic and heavy metal contents in cockles (Anadara granosa) using multivariate statistical techniques

    International Nuclear Information System (INIS)

    Abbas Alkarkhi, F.M.; Ismail, Norli; Easa, Azhar Mat

    2008-01-01

    Cockle (Anadara granosa) samples obtained from two rivers in the Penang State of Malaysia were analyzed for the content of arsenic (As) and heavy metals (Cr, Cd, Zn, Cu, Pb, and Hg) using a graphite flame atomic absorption spectrometer (GF-AAS) for Cr, Cd, Zn, Cu, Pb and As, and a cold vapor atomic absorption spectrometer (CV-AAS) for Hg. The two locations of interest, with 20 sampling points at each location, were Kuala Juru (Juru River) and Bukit Tambun (Jejawi River). Multivariate statistical techniques such as multivariate analysis of variance (MANOVA) and discriminant analysis (DA) were applied for analyzing the data. MANOVA showed a strong significant difference between the two rivers in terms of As and heavy metal contents in cockles. DA gave the best result to identify the relative contribution of all parameters in discriminating (distinguishing) the two rivers. It provided an important data reduction as it used only two parameters (Zn and Cd), affording more than 72% correct assignations. Results indicated that the two rivers were different in terms of As and heavy metal contents in cockles, and the major difference was due to the contribution of Zn and Cd. A positive correlation was found between the discriminant functions (DF) and Zn, Cd and Cr, whereas a negative correlation was exhibited with the other heavy metals. Therefore, DA allowed a reduction in the dimensionality of the data set, delineating a few indicator parameters responsible for large variations in heavy metal and arsenic content. Taking these results into account, it can be suggested that continuous monitoring of As and heavy metals in cockles be performed in these two rivers.

  2. Correlation analysis of energy indicators for sustainable development using multivariate statistical techniques

    International Nuclear Information System (INIS)

    Carneiro, Alvaro Luiz Guimaraes; Santos, Francisco Carlos Barbosa dos

    2007-01-01

    Energy is an essential input for social development and economic growth. The production and use of energy cause environmental degradation at all levels, local, regional and global: combustion of fossil fuels causes air pollution; hydropower often causes environmental damage due to the submergence of large areas of land; and global climate change is associated with the increasing concentration of greenhouse gases in the atmosphere. As mentioned in chapter 9 of Agenda 21, energy is essential to economic and social development and improved quality of life. Much of the world's energy, however, is currently produced and consumed in ways that could not be sustained if technologies were to remain constant and if overall quantities were to increase substantially. All energy sources will need to be used in ways that respect the atmosphere, human health, and the environment as a whole. Energy in the context of sustainable development needs a set of quantifiable parameters, called indicators, to measure and monitor important changes and significant progress towards the achievement of the objectives of sustainable development policies. The indicators are divided into four dimensions: social, economic, environmental and institutional. This paper shows a methodology of analysis using multivariate statistical techniques that provide the ability to analyse complex sets of data. The main goal of this study is to explore the correlation analysis among the indicators. The data used in this research work are an excerpt of IBGE (Instituto Brasileiro de Geografia e Estatistica) census data. The core indicators used in this study follow the IAEA (International Atomic Energy Agency) framework: Energy Indicators for Sustainable Development. (author)

  3. Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review

    International Nuclear Information System (INIS)

    Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

    2017-01-01

    Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0–0.10 m, or 0–0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component

  4. Arsenic health risk assessment in drinking water and source apportionment using multivariate statistical techniques in Kohistan region, northern Pakistan.

    Science.gov (United States)

    Muhammad, Said; Tahir Shah, M; Khan, Sardar

    2010-10-01

    The present study was conducted in Kohistan region, where mafic and ultramafic rocks (Kohistan island arc and Indus suture zone) and metasedimentary rocks (Indian plate) are exposed. Water samples were collected from the springs, streams and Indus river and analyzed for physical parameters, anions, cations and arsenic (As(3+), As(5+) and arsenic total). The water quality in Kohistan region was evaluated by comparing the physio-chemical parameters with permissible limits set by the Pakistan Environmental Protection Agency and the World Health Organization. Most of the studied parameters were found within their respective permissible limits. However, in some samples, the iron and arsenic concentrations exceeded their permissible limits. For health risk assessment of arsenic, the average daily dose, hazard quotient (HQ) and cancer risk were calculated by using statistical formulas. The values of HQ were found >1 in the samples collected from Jabba, Dubair, while HQ values were pollution load was also calculated by using multivariate statistical techniques like one-way ANOVA, correlation analysis, regression analysis, cluster analysis and principal component analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
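
    The record above does not give the formulas behind the average daily dose, hazard quotient and cancer risk. A common textbook form is sketched below under stated assumptions: dose = concentration × intake rate / body weight, HQ = dose / reference dose, and risk = dose × slope factor. Every numeric value and every function name here is an illustrative placeholder, not a figure from the study.

```python
# Minimal sketch of a drinking-water health risk calculation.
# The formulas are a common textbook form and ALL numbers are placeholders;
# substitute the values and reference limits used in the actual assessment.

def average_daily_dose(conc_mg_per_l, intake_l_per_day, body_weight_kg):
    """Average daily dose in mg per kg body weight per day."""
    return conc_mg_per_l * intake_l_per_day / body_weight_kg


def hazard_quotient(dose, reference_dose):
    """Ratio of the dose to a reference dose; values above 1 flag concern."""
    return dose / reference_dose


def cancer_risk(dose, slope_factor):
    """Linear low-dose estimate of incremental lifetime cancer risk."""
    return dose * slope_factor


# Hypothetical inputs (assumptions, not data from the record).
CONC = 0.010          # arsenic concentration, mg/L
INTAKE = 2.0          # water intake, L/day
BODY_WEIGHT = 70.0    # kg
REFERENCE_DOSE = 3.0e-4   # mg/kg/day, placeholder reference dose
SLOPE_FACTOR = 1.5        # (mg/kg/day)^-1, placeholder slope factor

dose = average_daily_dose(CONC, INTAKE, BODY_WEIGHT)
hq = hazard_quotient(dose, REFERENCE_DOSE)
risk = cancer_risk(dose, SLOPE_FACTOR)
print(f"dose={dose:.3e} mg/kg/day, HQ={hq:.2f}, risk={risk:.2e}")
```

    Because HQ scales linearly with concentration, doubling the hypothetical concentration would push this example above the HQ > 1 threshold mentioned in the abstract.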

  5. Multivariate statistical assessment of coal properties

    Czech Academy of Sciences Publication Activity Database

    Klika, Z.; Serenčíšová, J.; Kožušníková, Alena; Kolomazník, I.; Študentová, S.; Vontorová, J.

    2014-01-01

    Vol. 128, No. 128 (2014), pp. 119-127 ISSN 0378-3820 R&D Projects: GA MŠk ED2.1.00/03.0082 Institutional support: RVO:68145535 Keywords: coal properties * structural, chemical and petrographical properties * multivariate statistics Subject RIV: DH - Mining, incl. Coal Mining Impact factor: 3.352, year: 2014 http://dx.doi.org/10.1016/j.fuproc.2014.06.029

  6. Application of a Multivariate Statistical Technique to Interpreting Data from Multichannel Equipment for the Example of the KLEM Spectrometer

    International Nuclear Information System (INIS)

    Podorozhnyi, D.M.; Postnikov, E.B.; Sveshnikova, L.G.; Turundaevsky, A.N.

    2005-01-01

    A multivariate statistical procedure for solving problems of estimating physical parameters on the basis of data from measurements with multichannel equipment is described. Within the multivariate procedure, an algorithm is constructed for estimating the energy of primary cosmic rays and the exponent in their power-law spectrum. They are investigated by using the KLEM spectrometer (NUCLEON project) as a specific example of measuring equipment. The results of computer experiments simulating the operation of the multivariate procedure for this equipment are given, the proposed approach being compared in these experiments with the one-parameter approach presently used in data processing

  7. Remote sensing estimation of the total phosphorus concentration in a large lake using band combinations and regional multivariate statistical modeling techniques.

    Science.gov (United States)

    Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi

    2015-03-15

    Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate

  8. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

    Science.gov (United States)

    Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti

    2016-07-01

    A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  9. Aspects of multivariate statistical theory

    CERN Document Server

    Muirhead, Robb J

    2009-01-01

    The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. ". . . the wealth of material on statistics concerning the multivariate normal distribution is quite exceptional. As such it is a very useful source of information for the general statistician and a must for anyone wanting to pen

  10. Multivariate statistical analysis of radioactive variables in two phosphate ores from Sudan

    International Nuclear Information System (INIS)

    Adam, Abdel Majid A.; Eltayeb, Mohamed Ahmed H.

    2012-01-01

    Multivariate statistical techniques are efficient ways to display complex relationships among many objects. An attempt was made to study the radioactive data in two types of Sudanese phosphate deposits, Kurun and Uro phosphate, using several multivariate statistical methods. The Pearson correlation coefficient revealed that the U-238 distribution in Kurun phosphate is controlled by the variation of K-40 concentration, whereas in Uro phosphate it is controlled by the variation of U-235 and U-234 concentration. Histograms and normal Q–Q plots clearly show that the radioactive variables did not follow a normal distribution. This non-normality may be attributed to the complicating influence of geological factors. The principal components analysis (PCA) gives a model of five components for representing the acquired data from Kurun phosphate, where 89.5% of the total variance is explained. A model of four components was sufficient to represent the acquired data from Uro phosphate, where 87.5% of the total data variance is explained. The hierarchical cluster analysis (HCA) indicates that U-238 behaves in the same manner in the two types of phosphates; it is associated with a group of four radionuclides, U-234, Po-210, Ra-226 and Th-230, which are the most abundant radionuclides and all belong to the uranium-238 decay series. Two parameters have been adopted to differentiate directly between the two phosphates. Firstly, U-238 in Uro phosphate has shown a higher degree of mobility (CV% = 82.6) than that in Kurun phosphate (CV% = 64.7), and secondly, the activity ratio of Th-230/Th-232 in Uro phosphate is nine times that in Kurun phosphate. - Highlights: ► Multivariate statistical techniques were used to characterize radioactive data. ► U-238 in Uro phosphate shows higher degree of mobility (CV% = 82.6). ► U-238 in Kurun phosphate shows lower degree of mobility (CV% = 64.7). ► The radioactive variables did not follow a normal distribution. ► The ratio of Th
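
    Two of the summary measures quoted in the record above, the coefficient of variation (CV%) and the Pearson correlation coefficient, can be sketched as follows. This is an illustration under assumptions: the sample standard deviation (n − 1 denominator) is assumed for CV%, the input values are invented, and the function names are hypothetical.

```python
# Minimal sketch of CV% and the Pearson correlation coefficient.
# Input values are invented; the n-1 standard deviation is an assumption.
from math import sqrt
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation as a % of the mean."""
    return 100.0 * stdev(values) / mean(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical activity concentrations for two nuclides in the same samples.
nuclide_a = [10.0, 20.0, 30.0]
nuclide_b = [12.0, 19.0, 33.0]

print("CV% of nuclide_a:", cv_percent(nuclide_a))
print("Pearson r:", pearson_r(nuclide_a, nuclide_b))
```

    A larger CV% indicates more relative scatter around the mean, which is how the abstract uses it as an indicator of mobility.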

  11. Classifying hot water chemistry: Application of MULTIVARIATE STATISTICS

    OpenAIRE

    Sumintadireja, Prihadi; Irawan, Dasapta Erwin; Rezky, Yuanno; Gio, Prana Ugiana; Agustin, Anggita

    2016-01-01

    This file is the dataset for the following paper "Classifying hot water chemistry: Application of MULTIVARIATE STATISTICS". Authors: Prihadi Sumintadireja, Dasapta Erwin Irawan, Yuano Rezky, Prana Ugiana Gio, Anggita Agustin

  12. Multivariate statistical characterization of groundwater quality in Ain ...

    African Journals Online (AJOL)

    Administrator

    depends much on the sustainability of the available water resources. Water of .... 18 wells currently in use were selected based on the preliminary field survey carried out to ... In recent times, multivariate statistical methods have been applied ...

  13. Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques--a case study

    International Nuclear Information System (INIS)

    Singh, Kunwar P.; Malik, Amrita; Sinha, Sarita

    2005-01-01

    Multivariate statistical techniques, such as cluster analysis (CA), factor analysis (FA), principal component analysis (PCA) and discriminant analysis (DA), were applied to the data set on water quality of the Gomti river (India), generated during three years (1999-2001) of monitoring at eight different sites for 34 parameters (9792 observations). This study presents the usefulness of multivariate statistical techniques for evaluation and interpretation of large complex water quality data sets and apportionment of pollution sources/factors, with a view to getting better information about the water quality and the design of a monitoring network for effective management of water resources. Three significant groups of sampling sites, upper catchments (UC), middle catchments (MC) and lower catchments (LC), were obtained through CA on the basis of similarity between them. FA/PCA applied to the data sets pertaining to the three catchment regions of the river resulted in seven, seven and six latent factors, respectively, responsible for the data structure, explaining 74.3, 73.6 and 81.4% of the total variance of the respective data sets. These included the trace metals group (leaching from soil and industrial waste disposal sites), organic pollution group (municipal and industrial effluents), nutrients group (agricultural runoff), alkalinity, hardness, EC and solids (soil leaching and runoff process). DA showed the best results for data reduction and pattern recognition during both temporal and spatial analysis. It rendered five parameters (temperature, total alkalinity, Cl, Na and K) affording more than 94% right assignations in temporal analysis, while 10 parameters (river discharge, pH, BOD, Cl, F, PO4, NH4-N, NO3-N, TKN and Zn) afforded 97% right assignations in spatial analysis of three different regions in the basin. Thus, DA allowed reduction in dimensionality of the large data set, delineating a few indicator parameters responsible for large variations in water quality. Further

  14. Synchrotron-Based Microspectroscopic Analysis of Molecular and Biopolymer Structures Using Multivariate Techniques and Advanced Multi-Components Modeling

    International Nuclear Information System (INIS)

    Yu, P.

    2008-01-01

    More recently, an advanced synchrotron radiation-based bioanalytical technique (SRFTIRM) has been applied as a novel non-invasive analysis tool to study molecular, functional group and biopolymer chemistry, nutrient make-up and structural conformation in biomaterials. This novel synchrotron technique, taking advantage of bright synchrotron light (which is a million times brighter than sunlight), is capable of exploring biomaterials at molecular and cellular levels. However, with the synchrotron RFTIRM technique, a large number of molecular spectral data are usually collected. The objective of this article was to illustrate how to use two multivariate statistical techniques, (1) agglomerative hierarchical cluster analysis (AHCA) and (2) principal component analysis (PCA), and two advanced multicomponent modeling methods, (1) Gaussian and (2) Lorentzian multi-component peak modeling, for molecular spectrum analysis of bio-tissues. The studies indicated that the two multivariate analyses (AHCA, PCA) are able to create molecular spectral corrections by including not just one intensity or frequency point of a molecular spectrum, but by utilizing the entire spectral information. Gaussian and Lorentzian modeling techniques are able to quantify spectral component peaks of molecular structure, functional group and biopolymer. By application of these four methods, the two multivariate techniques and Gaussian and Lorentzian modeling, inherent molecular structures, functional group and biopolymer conformation between and among biological samples can be quantified, discriminated and classified with great efficiency.

  15. A unifying framework for k-statistics, polykays and their multivariate generalizations.

    OpenAIRE

    DI NARDO, Elvira; GUARINO G, G.; Senato, D.

    2008-01-01

    Through the classical umbral calculus, we provide a unifying syntax for single and multivariate $k$-statistics, polykays and multivariate polykays. From a combinatorial point of view, we revisit the theory as exposed by Stuart and Ord, taking into account the Doubilet approach to symmetric functions. Moreover, by using exponential polynomials rather than set partitions, we provide a new formula for $k$-statistics that results in a very fast algorithm to generate such estimators.

  16. Multivariate ordination statistics workshop with R slides

    OpenAIRE

    Strack, Michael

    2015-01-01

    2-hour workshop given at Macquarie University Department of Biological Sciences, 4 November 2015. The workshop was an introduction to the family of techniques falling under multivariate ordination, using the R language and drawing heavily from the book "Numerical Ecology with R" by Borcard et al. (2012).

  17. Adjustment of geochemical background by robust multivariate statistics

    Science.gov (United States)

    Zhou, D.

    1985-01-01

    Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
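
    The adjustment procedure described in the record above (regression per lithologic unit, then one common threshold on the pooled residuals) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses ordinary least squares from NumPy rather than the robust regression the abstract reports, and the function name, argument names and the multiple k = 2 are hypothetical.

```python
import numpy as np


def delineate_anomalies(y, X, units, k=2.0):
    """Flag samples whose residual exceeds k pooled standard deviations.

    y     : (n,) concentrations of the element of interest
    X     : (n, p) selected geochemical/environmental predictors
    units : (n,) lithologic unit label for each sample
    k     : threshold multiple (assumed value, not from the abstract)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    units = np.asarray(units)
    residuals = np.empty_like(y)
    for unit in np.unique(units):
        mask = units == unit
        # Separate regression (with intercept) for each lithologic unit.
        A = np.column_stack([np.ones(mask.sum()), X[mask]])
        # Ordinary least squares stands in for the robust fit here.
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        residuals[mask] = y[mask] - A @ coef
    # One common threshold from the combined residuals of all units.
    threshold = k * residuals.std(ddof=1)
    return residuals, residuals > threshold
```

    Because ordinary least squares is itself pulled toward outliers, a robust estimator would be the closer match to the published method; the sketch only shows the per-unit fit and the pooled threshold.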

  18. Multivariate data analysis

    DEFF Research Database (Denmark)

    Hansen, Michael Adsetts Edberg

    Interest in statistical methodology is increasing so rapidly in the astronomical community that accessible introductory material in this area is long overdue. This book fills the gap by providing a presentation of the most useful techniques in multivariate statistics. A wide-ranging annotated set...

  19. Temporal and spatial assessment of river surface water quality using multivariate statistical techniques: a study in Can Tho City, a Mekong Delta area, Vietnam.

    Science.gov (United States)

    Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep

    2015-05-01

    The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.
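
    The "percentage of total variance explained" figures reported in records like the one above come from the eigenvalues of the PCA. A minimal sketch of that calculation follows, assuming standardized variables (PCA on the correlation matrix) and synthetic data; the function name, the data and the eigenvalue-greater-than-one retention rule are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def pca_explained_variance(data):
    """Eigenvalues of the correlation matrix and their variance fractions."""
    data = np.asarray(data, dtype=float)
    # Standardize so parameters measured in different units are comparable.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.cov(z, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # sorted in descending order
    return eigvals, eigvals / eigvals.sum()


rng = np.random.default_rng(0)
# Synthetic example: 200 samples of 5 parameters driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 5))
data = latent @ loadings + 0.3 * rng.normal(size=(200, 5))

eigvals, ratio = pca_explained_variance(data)
n_kept = int((eigvals > 1.0).sum())  # assumed rule: keep eigenvalues > 1
print(n_kept, round(float(ratio[:n_kept].sum()) * 100, 1))
```

    The printed pair is the number of retained components and the cumulative percentage of variance they explain, which is the form in which the abstracts report their results.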

  20. Robust multivariate analysis

    CERN Document Server

    J Olive, David

    2017-01-01

    This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given.  The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory.   The robust techniques  are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis.  A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...

  1. Processing data collected from radiometric experiments by multivariate technique

    International Nuclear Information System (INIS)

    Urbanski, P.; Kowalska, E.; Machaj, B.; Jakowiuk, A.

    2005-01-01

    Multivariate techniques applied for processing data collected from radiometric experiments can provide more efficient extraction of the information contained in the spectra. Several techniques are considered: (i) multivariate calibration using Partial Least Squares Regression and Artificial Neural Network, (ii) standardization of the spectra, (iii) smoothing of collected spectra, where the autocorrelation function and bootstrap were used for the assessment of the processed data, (iv) image processing using Principal Component Analysis. Application of these techniques is illustrated with examples of some industrial applications. (author)

  2. A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses.

    Science.gov (United States)

    Buttigieg, Pier Luigi; Ramette, Alban

    2014-12-01

    The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.

  3. Classification of Specialized Farms Applying Multivariate Statistical Methods

    Directory of Open Access Journals (Sweden)

    Zuzana Hloušková

    2017-01-01

    The paper applies advanced multivariate statistical methods to the classification of cattle breeding farming enterprises by their economic size. The advantage of the model is its ability to use a few selected indicators, compared with the complex methodology of the current classification model, which requires knowledge of the detailed structure of the herd turnover and of the cultivated crops. The output of the paper is intended to be applied within farm structure research focused on the future development of Czech agriculture. The farming enterprises database for 2014 from the FADN CZ system was used as the data source. The proposed predictive model exploits knowledge of the actual size classes of the farms tested. Outcomes of the linear discriminant analysis multifactor classification method supported the possibility of assigning farming enterprises to the group of Small farms (98 % assigned correctly) and to the Large and Very Large enterprises (100 % assigned correctly). The Medium Size farms were correctly assigned in only 58.11 % of cases. Partial shortcomings of the process were found when discriminating between Medium and Small farms.

  4. DETERMINING INDICATORS OF URBAN HOUSEHOLD WATER CONSUMPTION THROUGH MULTIVARIATE STATISTICAL TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Gledsneli Maria Lima Lins

    2010-12-01

    Water has a decisive influence on populations’ quality of life, specifically in areas like urban supply, drainage, and effluent treatment, due to its strong impact on public health. Rational water use constitutes the greatest challenge faced by water demand management, mainly with regard to urban household water consumption. This makes it important to develop research to assist water managers and public policy-makers in planning and formulating water demand measures that allow rational urban water use to be achieved. This work used the multivariate techniques Factor Analysis and Multiple Linear Regression Analysis to determine the participation level of socioeconomic and climatic variables in monthly urban household consumption changes, applying them to two districts of Campina Grande city (State of Paraíba, Brazil). The districts were chosen based on a socioeconomic criterion (income level) so as to evaluate their water consumers’ behavior. A 9-year monthly data series (from 2000 to 2008) was used, comprising family income, water tariff, and quantity of household connections (economies) as socioeconomic variables, and average temperature and precipitation as climatic variables. For both selected districts of Campina Grande city, the results point to the variables “water tariff” and “family income” as indicators of these districts’ household consumption.

  5. Assessment and rationalization of water quality monitoring network: a multivariate statistical approach to the Kabbini River (India).

    Science.gov (United States)

    Mavukkandy, Musthafa Odayooth; Karmakar, Subhankar; Harikumar, P S

    2014-09-01

    The establishment of an efficient surface water quality monitoring (WQM) network is a critical component in the assessment, restoration and protection of river water quality. A periodic evaluation of monitoring network is mandatory to ensure effective data collection and possible redesigning of existing network in a river catchment. In this study, the efficacy and appropriateness of existing water quality monitoring network in the Kabbini River basin of Kerala, India is presented. Significant multivariate statistical techniques like principal component analysis (PCA) and principal factor analysis (PFA) have been employed to evaluate the efficiency of the surface water quality monitoring network with monitoring stations as the evaluated variables for the interpretation of complex data matrix of the river basin. The main objective is to identify significant monitoring stations that must essentially be included in assessing annual and seasonal variations of river water quality. Moreover, the significance of seasonal redesign of the monitoring network was also investigated to capture valuable information on water quality from the network. Results identified few monitoring stations as insignificant in explaining the annual variance of the dataset. Moreover, the seasonal redesign of the monitoring network through a multivariate statistical framework was found to capture valuable information from the system, thus making the network more efficient. Cluster analysis (CA) classified the sampling sites into different groups based on similarity in water quality characteristics. The PCA/PFA identified significant latent factors standing for different pollution sources such as organic pollution, industrial pollution, diffuse pollution and faecal contamination. Thus, the present study illustrates that various multivariate statistical techniques can be effectively employed in sustainable management of water resources. The effectiveness of existing river water quality monitoring

  6. Multivariate statistical methods a primer

    CERN Document Server

    Manly, Bryan FJ

    2004-01-01

    THE MATERIAL OF MULTIVARIATE ANALYSIS: Examples of Multivariate Data; Preview of Multivariate Methods; The Multivariate Normal Distribution; Computer Programs; Graphical Methods; Chapter Summary; References. MATRIX ALGEBRA: The Need for Matrix Algebra; Matrices and Vectors; Operations on Matrices; Matrix Inversion; Quadratic Forms; Eigenvalues and Eigenvectors; Vectors of Means and Covariance Matrices; Further Reading; Chapter Summary; References. DISPLAYING MULTIVARIATE DATA: The Problem of Displaying Many Variables in Two Dimensions; Plotting Index Variables; The Draftsman's Plot; The Representation of Individual Data Points; Profiles o

  7. Identification of mine waters by statistical multivariate methods

    Energy Technology Data Exchange (ETDEWEB)

    Mali, N [IGGG, Ljubljana (Slovenia)

    1992-01-01

    Three water-bearing aquifers are present in the Velenje lignite mine. The aquifer waters have differing chemical compositions; a geochemical water analysis can therefore determine the source of mine water influx. Mine water samples from different locations in the mine were analyzed, and the results for chemical content and electric conductivity of the mine water were statistically processed by means of the MICROGAS, SPSS-X and IN STATPAC computer programs, which apply three multivariate statistical methods (discriminant, cluster and factor analysis). Reliability of calculated values was determined with the Kolmogorov and Smirnov tests. It is concluded that laboratory analysis of single water samples can produce measurement errors, but statistical processing of water sample data can identify the origin and movement of mine water. 15 refs.

  8. Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review.

    Science.gov (United States)

    Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

    2017-12-01

    Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis

  9. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    Science.gov (United States)

    Glascock, M. D.; Neff, H.; Vaughn, K. J.

    2004-06-01

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  10. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    International Nuclear Information System (INIS)

    Glascock, M. D.; Neff, H.; Vaughn, K. J.

    2004-01-01

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  11. Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance

    Energy Technology Data Exchange (ETDEWEB)

    Glascock, M. D.; Neff, H. [University of Missouri, Research Reactor Center (United States); Vaughn, K. J. [Pacific Lutheran University, Department of Anthropology (United States)

    2004-06-15

    The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.

  12. Multivariate Location Estimation Using Extension of $R$-Estimates Through $U$-Statistics Type Approach

    OpenAIRE

    Chaudhuri, Probal

    1992-01-01

    We consider a class of $U$-statistics type estimates for multivariate location. The estimates extend some $R$-estimates to multivariate data. In particular, the class of estimates includes the multivariate median considered by Gini and Galvani (1929) and Haldane (1948) and a multivariate extension of the well-known Hodges-Lehmann (1963) estimate. We explore large sample behavior of these estimates by deriving a Bahadur type representation for them. In the process of developing these asymptoti...

  13. Multivariate Statistical Methods as a Tool of Financial Analysis of Farm Business

    Czech Academy of Sciences Publication Activity Database

    Novák, J.; Sůvová, H.; Vondráček, Jiří

    2002-01-01

    Vol. 48, No. 1 (2002), pp. 9-12 ISSN 0139-570X Institutional research plan: AV0Z1030915 Keywords: financial analysis * financial ratios * multivariate statistical methods * correlation analysis * discriminant analysis * cluster analysis Subject RIV: BB - Applied Statistics, Operational Research

  14. Statistical inference for a class of multivariate negative binomial distributions

    DEFF Research Database (Denmark)

    Rubak, Ege Holger; Møller, Jesper; McCullagh, Peter

    This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...

  15. Multivariate analysis: models and method

    International Nuclear Information System (INIS)

    Sanz Perucha, J.

    1990-01-01

    Data treatment techniques are increasingly used as computer methods become more widely accessible. Multivariate analysis consists of a group of statistical methods that are applied to study objects or samples characterized by multiple values. The final goal is decision making. The paper describes the models and methods of multivariate analysis.

  16. Source Identification of Heavy Metals in Soils Surrounding the Zanjan Zinc Town by Multivariate Statistical Techniques

    Directory of Open Access Journals (Sweden)

    M.A. Delavar

    2016-02-01

    Introduction: The accumulation of heavy metals (HMs) in the soil is of increasing concern due to food safety issues, potential health risks, and the detrimental effects on soil ecosystems. HMs may be considered the most important soil pollutants, because they are not biodegradable and their physical movement through the soil profile is relatively limited. Therefore, root uptake may give these pollutants a substantial opportunity to transfer from the surface soil to natural and cultivated plants, which may eventually carry them into human bodies. The general behavior of HMs in the environment, especially their bioavailability in the soil, is influenced by their origin. Hence, source apportionment of HMs may provide essential information for better management of polluted soils to restrict the entrance of HMs into the human food chain. This paper explores the applicability of multivariate statistical techniques in the identification of probable sources that can control the concentration and distribution of selected HMs in the soils surrounding the Zanjan Zinc Specialized Industrial Town (briefly, Zinc Town). Materials and Methods: The area under investigation has a size of approximately 4000 ha. It is located around the Zinc Town, Zanjan province. A regular grid sampling pattern with an interval of 500 meters was applied to identify the sample locations, and 184 topsoil samples (0-10 cm) were collected. The soil samples were air-dried, sieved through a 2 mm polyethylene sieve, and then digested using HNO3. The total concentrations of zinc (Zn), lead (Pb), cadmium (Cd), nickel (Ni) and copper (Cu) in the soil solutions were determined via Atomic Absorption Spectroscopy (AAS). Data were statistically analyzed using the SPSS software version 17.0 for Windows. Correlation Matrix (CM), Principal Component Analysis (PCA) and Factor Analysis (FA) techniques were performed in order to identify the probable sources of HMs in the studied soils. Results and

  17. Multivariate statistical analysis of wildfires in Portugal

    Science.gov (United States)

    Costa, Ricardo; Caramelo, Liliana; Pereira, Mário

    2013-04-01

    Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980-2009 period. The data used in the analysis are an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).

  18. Statistical Inference for a Class of Multivariate Negative Binomial Distributions

    DEFF Research Database (Denmark)

    Rubak, Ege H.; Møller, Jesper; McCullagh, Peter

    This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...... studied in the literature, while this is the first statistical paper on α-permanental random fields. The focus is on maximum likelihood estimation, maximum quasi-likelihood estimation and on maximum composite likelihood estimation based on uni- and bivariate distributions. Furthermore, new results...

  19. Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

    Science.gov (United States)

    Ma, Yan; Mazumdar, Madhu

    2011-10-30

    Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Multivariate statistical analysis of major and trace element data for ...

    African Journals Online (AJOL)

    Multivariate statistical analysis of major and trace element data for niobium exploration in the peralkaline granites of the anorogenic ring-complex province of Nigeria. PO Ogunleye, EC Ike, I Garba. Abstract. No Abstract Available. Journal of Mining and Geology Vol.40(2) 2004: 107-117.

  1. Applying contemporary statistical techniques

    CERN Document Server

    Wilcox, Rand R

    2003-01-01

    Applying Contemporary Statistical Techniques explains why traditional statistical methods are often inadequate or outdated when applied to modern problems. Wilcox demonstrates how new and more powerful techniques address these problems far more effectively, making these modern robust methods understandable, practical, and easily accessible. * Assumes no previous training in statistics * Explains how and why modern statistical methods provide more accurate results than conventional methods * Covers the latest developments on multiple comparisons * Includes recent advanc

  2. Combined data preprocessing and multivariate statistical analysis characterizes fed-batch culture of mouse hybridoma cells for rational medium design.

    Science.gov (United States)

    Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup

    2010-10-01

    We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis techniques. Initially, specific rates of cell growth, glucose/amino acid consumption and mAb/metabolite production were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least squares (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.

  3. Multivariate statistical methods and data mining in particle physics (4/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  4. Multivariate statistical methods and data mining in particle physics (2/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  5. Multivariate statistical methods and data mining in particle physics (1/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  6. Application of multivariate statistical technique for hydrogeochemical assessment of groundwater within the Lower Pra Basin, Ghana

    Science.gov (United States)

    Tay, C. K.; Hayford, E. K.; Hodgson, I. O. A.

    2017-06-01

    Multivariate statistical technique and hydrogeochemical approach were employed for groundwater assessment within the Lower Pra Basin. The main objective was to delineate the main processes that are responsible for the water chemistry and pollution of groundwater within the basin. Fifty-four (54) boreholes were sampled in January 2012 for quality assessment. PCA using Varimax with Kaiser Normalization as the method of extraction, for both the rotated space and the component matrix, has been applied to the data. Results show that Spearman's correlation matrix of major ions revealed the expected process-based relationships derived mainly from the geochemical processes, such as ion-exchange and silicate/aluminosilicate weathering within the aquifer. Three main principal components influence the water chemistry and pollution of groundwater within the basin. The three principal components have accounted for approximately 79% of the total variance in the hydrochemical data. Component 1 delineates the main natural processes (water-soil-rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicate/aluminosilicates, while Component 3 delineates the prevalence of pollution principally from agricultural input as well as trace metal mobilization in groundwater within the basin. The loadings and score plots of the first two PCs show a grouping pattern which indicates the strength of the mutual relation among the hydrochemical variables. In terms of proper management and development of groundwater within the basin, communities where intense agriculture is taking place should be monitored and protected from agricultural activities, especially where inorganic fertilizers are used, by creating buffer zones. Monitoring of the water quality, especially the water pH, is recommended to ensure the acid-neutralizing potential of groundwater within the basin, thereby curtailing further trace metal
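
    The "share of total variance explained by the leading components" reported above can be sketched as follows. This is a minimal illustration using NumPy on synthetic data, with PCA performed on the correlation matrix; it does not reproduce the Varimax rotation or any values from the study. The function name and the synthetic matrix are assumptions.

```python
import numpy as np


def pca_explained_variance(X):
    """PCA on the correlation matrix of X (rows = samples, columns = variables).

    Returns eigenvalues in descending order, the fraction of total variance
    each component explains, and the unrotated component loadings.
    """
    R = np.corrcoef(X, rowvar=False)        # standardises the variables
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]       # sort components by variance
    eigvals = eigvals[order]
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals, 0, None))
    return eigvals, eigvals / eigvals.sum(), loadings


# Synthetic stand-in for a hydrochemical table: 54 rows (echoing the 54
# boreholes), three variables driven by one latent process, two pure noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(54, 1))
X = np.hstack([latent + 0.1 * rng.normal(size=(54, 1)) for _ in range(3)]
              + [rng.normal(size=(54, 2))])
eigvals, frac, loadings = pca_explained_variance(X)
cumulative = np.cumsum(frac)   # e.g. cumulative[2] = share of first three PCs
```

    Using the correlation rather than the covariance matrix keeps variables measured in different units (for example pH and ion concentrations) from dominating the components purely by scale.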

  7. Monitoring a PVC batch process with multivariate statistical process control charts

    NARCIS (Netherlands)

    Tates, A. A.; Louwerse, D. J.; Smilde, A. K.; Koot, G. L. M.; Berndt, H.

    1999-01-01

    Multivariate statistical process control charts (MSPC charts) are developed for the industrial batch production process of poly(vinyl chloride) (PVC). With these MSPC charts different types of abnormal batch behavior were detected on-line. With batch contribution plots, the probable causes of these

  8. An exercise in model validation: Comparing univariate statistics and Monte Carlo-based multivariate statistics

    International Nuclear Information System (INIS)

    Weathers, J.B.; Luck, R.; Weathers, J.W.

    2009-01-01

    The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exist in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.
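
    A Monte Carlo estimate of such a covariance matrix, and the resulting 95% constant-probability contour, might look like the sketch below. The error model (one systematic error shared by two quantities plus independent random errors) is a hypothetical assumption chosen for illustration, not the paper's example; the function names are likewise invented.

```python
import numpy as np

CHI2_95_2DOF = 5.991  # 95% quantile of the chi-square distribution, 2 d.o.f.


def mc_covariance(sys_sd, rand_sd, n=20000, seed=1):
    """Monte Carlo covariance of a two-component comparison error.

    Assumed model: error_j = shared systematic error + independent random
    error, for j = 1, 2. The shared term induces the off-diagonal covariance.
    """
    rng = np.random.default_rng(seed)
    shared = rng.normal(0.0, sys_sd, size=(n, 1))   # common to both quantities
    indep = rng.normal(0.0, rand_sd, size=(n, 2))   # one draw per quantity
    return np.cov(shared + indep, rowvar=False)


def inside_95_contour(e, cov):
    """True if comparison error e lies inside the approximate 95% ellipse."""
    d2 = float(e @ np.linalg.solve(cov, e))   # squared Mahalanobis distance
    return d2 <= CHI2_95_2DOF


cov = mc_covariance(sys_sd=1.0, rand_sd=0.5)
ok_origin = inside_95_contour(np.array([0.0, 0.0]), cov)
ok_anticorrelated = inside_95_contour(np.array([3.0, -3.0]), cov)
```

    Under this assumed model the two errors are strongly positively correlated, so a comparison error of (3, -3) falls outside the ellipse even though each component on its own might pass a univariate check. That is the kind of difference between multivariate and univariate noise levels the abstract refers to.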

  9. An exercise in model validation: Comparing univariate statistics and Monte Carlo-based multivariate statistics

    Energy Technology Data Exchange (ETDEWEB)

    Weathers, J.B. [Shock, Noise, and Vibration Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: James.Weathers@ngc.com; Luck, R. [Department of Mechanical Engineering, Mississippi State University, 210 Carpenter Engineering Building, P.O. Box ME, Mississippi State, MS 39762-5925 (United States)], E-mail: Luck@me.msstate.edu; Weathers, J.W. [Structural Analysis Group, Northrop Grumman Shipbuilding, P.O. Box 149, Pascagoula, MS 39568 (United States)], E-mail: Jeffrey.Weathers@ngc.com

    2009-11-15

    The complexity of mathematical models used by practicing engineers is increasing due to the growing availability of sophisticated mathematical modeling tools and ever-improving computational power. For this reason, the need to define a well-structured process for validating these models against experimental results has become a pressing issue in the engineering community. This validation process is partially characterized by the uncertainties associated with the modeling effort as well as the experimental results. The net impact of the uncertainties on the validation effort is assessed through the 'noise level of the validation procedure', which can be defined as an estimate of the 95% confidence uncertainty bounds for the comparison error between actual experimental results and model-based predictions of the same quantities of interest. Although general descriptions associated with the construction of the noise level using multivariate statistics exist in the literature, a detailed procedure outlining how to account for the systematic and random uncertainties is not available. In this paper, the methodology used to derive the covariance matrix associated with the multivariate normal pdf based on random and systematic uncertainties is examined, and a procedure used to estimate this covariance matrix using Monte Carlo analysis is presented. The covariance matrices are then used to construct approximate 95% confidence constant probability contours associated with comparison error results for a practical example. In addition, the example is used to show the drawbacks of using a first-order sensitivity analysis when nonlinear local sensitivity coefficients exist. Finally, the example is used to show the connection between the noise level of the validation exercise calculated using multivariate and univariate statistics.

  10. Multivariate statistical pattern recognition system for reactor noise analysis

    International Nuclear Information System (INIS)

    Gonzalez, R.C.; Howington, L.C.; Sides, W.H. Jr.; Kryter, R.C.

    1976-01-01

    A multivariate statistical pattern recognition system for reactor noise analysis was developed. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, and updating capabilities. System design emphasizes control of the false-alarm rate. The ability of the system to learn normal patterns of reactor behavior and to recognize deviations from these patterns was evaluated by experiments at the ORNL High-Flux Isotope Reactor (HFIR). Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the system

  11. Multivariate statistical pattern recognition system for reactor noise analysis

    International Nuclear Information System (INIS)

    Gonzalez, R.C.; Howington, L.C.; Sides, W.H. Jr.; Kryter, R.C.

    1975-01-01

    A multivariate statistical pattern recognition system for reactor noise analysis was developed. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, and updating capabilities. System design emphasizes control of the false-alarm rate. The ability of the system to learn normal patterns of reactor behavior and to recognize deviations from these patterns was evaluated by experiments at the ORNL High-Flux Isotope Reactor (HFIR). Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the system. 19 references

  12. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    Science.gov (United States)

    2017-09-01

    application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , )  (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations

  13. Multivariate Analysis, Mass Balance Techniques, and Statistical Tests as Tools in Igneous Petrology: Application to the Sierra de las Cruces Volcanic Range (Mexican Volcanic Belt)

    Science.gov (United States)

    Velasco-Tapia, Fernando

    2014-01-01

    Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures). PMID:24737994
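
    The abstract does not specify its two mass-balance schemes. As a generic illustration only, the proportion of end-members in a binary mixture can be estimated by least squares from the linear mixing relation mix_i = f * a_i + (1 - f) * b_i. The function name and all concentrations below are hypothetical.

```python
def mixing_fraction(mix, end_a, end_b):
    """Least-squares fraction f of end-member A in a binary mixture,
    assuming mix_i = f * a_i + (1 - f) * b_i for every element i."""
    num = sum((m - b) * (a - b) for m, a, b in zip(mix, end_a, end_b))
    den = sum((a - b) ** 2 for a, b in zip(end_a, end_b))
    return num / den


# Hypothetical oxide concentrations (wt%); not data from the study.
dacite = [65.0, 16.0, 4.0]
andesite = [58.0, 17.5, 7.0]
# Build a synthetic lava that is 70% dacite and 30% andesite.
lava = [0.7 * d + 0.3 * a for d, a in zip(dacite, andesite)]
f = mixing_fraction(lava, dacite, andesite)
```

    With real analyses the elements would not all agree on one value of f, and the size of the residuals then indicates how well simple two-component mixing explains the rock.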

  14. Multivariate Analysis, Mass Balance Techniques, and Statistical Tests as Tools in Igneous Petrology: Application to the Sierra de las Cruces Volcanic Range (Mexican Volcanic Belt)

    Directory of Open Access Journals (Sweden)

    Fernando Velasco-Tapia

    2014-01-01

    Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features for the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward’s linkage rule was carried out to classify the SC volcanic rocks geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures).

  15. Source Evaluation and Trace Metal Contamination in Benthic Sediments from Equatorial Ecosystems Using Multivariate Statistical Techniques.

    Directory of Open Access Journals (Sweden)

    Nsikak U Benson

    Trace metals (Cd, Cr, Cu, Ni and Pb) concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria). The degree of contamination was assessed using the individual contamination factors (ICF) and global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources.

  16. Multivariate two-part statistics for analysis of correlated mass spectrometry data from multiple biological specimens.

    Science.gov (United States)

    Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi

    2017-01-01

    High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
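
    As a rough illustration of how a two-part statistic handles missing values without imputation, the sketch below implements a univariate, single-biospecimen version: a squared two-proportion z for the fraction of observed values plus a squared Welch t on the observed values, with a permutation null. It is a simplification, not the multivariate statistics proposed in the abstract (whose authors supply R functions). All names and data are hypothetical.

```python
import random


def two_part_stat(a, b):
    """Squared two-proportion z (observed vs missing) plus squared Welch t
    on the observed values. Missing values are represented by None."""
    oa = [v for v in a if v is not None]
    ob = [v for v in b if v is not None]
    pa, pb = len(oa) / len(a), len(ob) / len(b)
    p = (len(oa) + len(ob)) / (len(a) + len(b))
    var_p = p * (1 - p) * (1 / len(a) + 1 / len(b))
    z2 = (pa - pb) ** 2 / var_p if var_p > 0 else 0.0
    t2 = 0.0
    if len(oa) >= 2 and len(ob) >= 2:       # t part needs two values per group
        ma, mb = sum(oa) / len(oa), sum(ob) / len(ob)
        va = sum((v - ma) ** 2 for v in oa) / (len(oa) - 1)
        vb = sum((v - mb) ** 2 for v in ob) / (len(ob) - 1)
        se2 = va / len(oa) + vb / len(ob)
        if se2 > 0:
            t2 = (ma - mb) ** 2 / se2
    return z2 + t2


def permutation_p(a, b, n_perm=999, seed=0):
    """Permutation p-value: shuffle group labels, keeping missingness
    attached to each sample, and recompute the statistic."""
    rng = random.Random(seed)
    observed = two_part_stat(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if two_part_stat(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical intensities: group_b is both lower and more often missing.
group_a = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4, None, 5.1]
group_b = [None, None, None, None, None, 1.0, 1.2, 0.9]
p_value = permutation_p(group_a, group_b)
```

    The permutation step is what removes the need for a distributional assumption on the combined statistic; the multivariate version in the abstract applies the same idea across biospecimens.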

  17. Understanding the groundwater dynamics in the Southern Rift Valley Lakes Basin (Ethiopia). Multivariate statistical analysis method, oxygen (δ18O) and deuterium (δ2H)

    International Nuclear Information System (INIS)

    Girum Admasu Nadew; Zebene Lakew Tefera

    2013-01-01

    Multivariate statistical analysis is very important for classifying waters into different hydrochemical groups. Statistical techniques, such as cluster analysis, can provide a powerful tool for analyzing water chemistry data. This method is used to test water quality data and determine if samples can be grouped into distinct populations that may be significant in the geologic context, as well as from a statistical point of view. A multivariate statistical analysis method is applied to the geochemical data in combination with δ18O and δ2H isotopes with the objective of understanding the dynamics of groundwater using hierarchical clustering and isotope analyses. The geochemical and isotope data of the central and southern rift valley lakes have been collected and analyzed from different works. Isotope analysis shows that most springs and boreholes are recharged by July and August rainfalls. The different hydrochemical groups that resulted from the multivariate analysis are described and correlated with the geology of the area and whether it has any interaction with a system or not. (author)

  18. Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

    Science.gov (United States)

    Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

    2017-04-01

    Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.

  19. Assessment of Near-Bottom Water Quality of Southwestern Coast of Sarawak, Borneo, Malaysia: A Multivariate Statistical Approach

    Directory of Open Access Journals (Sweden)

    Chen-Lin Soo

    2017-01-01

    Studies on Sarawak coastal water quality are scarce, not to mention the application of the multivariate statistical approach to investigate the spatial variation of water quality and to identify the pollution source in Sarawak coastal water. Hence, the present study aimed to evaluate the spatial variation of water quality along the coastline of the southwestern region of Sarawak using multivariate statistical techniques. Seventeen physicochemical parameters were measured at 11 stations along the coastline with approximately 225 km length. The coastal water quality showed spatial heterogeneity where the cluster analysis grouped the 11 stations into four different clusters. Deterioration in coastal water quality has been observed in different regions of Sarawak corresponding to land use patterns in the region. Nevertheless, nitrate-nitrogen exceeded the guideline value at all sampling stations along the coastline. The principal component analysis (PCA) has determined a reduced number of five principal components that explained 89.0% of the data set variance. The first PC indicated that the nutrients were the dominant polluting factors, which is attributed to the domestic, agricultural, and aquaculture activities, followed by the suspended solids in the second PC which are related to the logging activities.

  20. Search for the top quark using multivariate analysis techniques

    International Nuclear Information System (INIS)

    Bhat, P.C.

    1994-08-01

    The D0 collaboration is developing top search strategies using multivariate analysis techniques. We report here on applications of the H-matrix method to the eμ channel and neural networks to the e+jets channel

  1. Multivariate statistical analyses demonstrate unique host immune responses to single and dual lentiviral infection.

    Directory of Open Access Journals (Sweden)

    Sunando Roy

    2009-10-01

    Feline immunodeficiency virus (FIV) and human immunodeficiency virus (HIV) are recently identified lentiviruses that cause progressive immune decline and ultimately death in infected cats and humans. It is of great interest to understand how to prevent immune system collapse caused by these lentiviruses. We recently described that disease caused by a virulent FIV strain in cats can be attenuated if animals are first infected with a feline immunodeficiency virus derived from a wild cougar. The detailed temporal tracking of cat immunological parameters in response to two viral infections resulted in high-dimensional datasets containing variables that exhibit strong co-variation. Initial analyses of these complex data using univariate statistical techniques did not account for interactions among immunological response variables and therefore potentially obscured significant effects between infection state and immunological parameters. Here, we apply a suite of multivariate statistical tools, including Principal Component Analysis, MANOVA and Linear Discriminant Analysis, to temporal immunological data resulting from FIV superinfection in domestic cats. We investigated the co-variation among immunological responses, the differences in immune parameters among four groups of five cats each (uninfected, single and dual infected animals), and the "immune profiles" that discriminate among them over the first four weeks following superinfection. Dual infected cats mount an immune response by 24 days post superinfection that is characterized by elevated levels of CD8 and CD25 cells and increased expression of IL4 and IFNgamma, and FAS. This profile discriminates dual infected cats from cats infected with FIV alone, which show high IL-10 and lower numbers of CD8 and CD25 cells. Multivariate statistical analyses demonstrate both the dynamic nature of the immune response to FIV single and dual infection and the development of a unique immunological profile in dual

  2. A multi-variate discrimination technique based on range-searching

    International Nuclear Information System (INIS)

    Carli, T.; Koblitz, B.

    2003-01-01

    We present a fast and transparent multi-variate event classification technique, called PDE-RS, which is based on sampling the signal and background densities in a multi-dimensional phase space using range-searching. The employed algorithm is presented in detail and its behaviour is studied with simple toy examples representing basic patterns of problems often encountered in High Energy Physics data analyses. In addition, an example relevant for the search for instanton-induced processes in deep-inelastic scattering at HERA is discussed. For all studied examples, the newly presented method performs as well as artificial Neural Networks and has the further advantage of needing less computation time. This allows careful selection of the best combination of observables which optimally separate the signal and background and for which the simulations describe the data best. Moreover, the systematic and statistical uncertainties can be easily evaluated. The method is therefore a powerful tool to find a small number of signal events in the large data samples expected at future particle colliders
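
    The core idea of estimating local signal and background densities by counting events in a small box around a test point can be sketched as below. This is a brute-force illustration only: it scans every training event, whereas the point of range-searching is to avoid that scan with a tree structure, and it assumes signal and background training samples of equal size so that no normalisation factor is needed. The function name and the toy events are hypothetical.

```python
def box_count_discriminant(x, signal, background, half_width):
    """Estimate D(x) = n_s / (n_s + n_b), where n_s and n_b are the numbers
    of signal and background training events inside a box of the given
    half-width centred on the test point x."""
    def count_in_box(events):
        return sum(
            all(abs(e_i - x_i) <= half_width for e_i, x_i in zip(e, x))
            for e in events
        )

    n_s = count_in_box(signal)
    n_b = count_in_box(background)
    total = n_s + n_b
    return n_s / total if total else 0.5   # empty box: no information


# Hypothetical two-observable training events.
signal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
background = [(-1.0, -1.0), (-0.9, -1.1), (1.0, 0.8)]

d_signal_like = box_count_discriminant((1.0, 1.0), signal, background, 0.5)
d_background_like = box_count_discriminant((-1.0, -1.0), signal, background, 0.5)
```

    Values of D near 1 indicate a signal-like region and values near 0 a background-like one. The box size plays the role of a smoothing parameter, trading statistical fluctuation against resolution.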

  3. Flow prediction models using macroclimatic variables and multivariate statistical techniques in the Cauca River Valley

    International Nuclear Information System (INIS)

    Carvajal Escobar Yesid; Munoz, Flor Matilde

    2007-01-01

    The project centred on a review of the state of the art of the ocean-atmosphere phenomena that affect Colombian hydrology, especially the ENSO phenomenon, which causes a first-order socioeconomic impact in our country yet has not been sufficiently studied; it is therefore important to address the subject by including the macroclimatic variables associated with ENSO in water planning analyses. The analyses include a review of statistical techniques for checking the consistency of hydrological data, with the objective of building a reliable and homogeneous database of monthly flows of the Cauca River. Multivariate statistical methods, specifically principal component analysis, are used in the development of models for predicting mean monthly flows in the Cauca River, involving both linear approaches, such as the autoregressive AR, ARX and ARMAX models, and the nonlinear approach of artificial neural networks.

  4. Multivariate statistical treatment of PIXE analysis of some traditional Chinese medicines

    International Nuclear Information System (INIS)

    Xiaofeng Zhang; Jianguo Ma; Junfa Qin; Lun Xiao

    1991-01-01

    Elements in two kinds of 30 traditional Chinese medicines were analyzed by PIXE method, and the data were treated by multivariate statistical methods. The results show that these two kinds of traditional Chinese medicines are almost separable according to their elemental contents. The results are congruous with the traditional Chinese medicine practice. (author) 7 refs.; 2 figs.; 2 tabs

  5. Multivariate meta-analysis: Potential and promise

    Science.gov (United States)

    Jackson, Dan; Riley, Richard; White, Ian R

    2011-01-01

    The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052

  6. UNCOVERING THE FORMATION OF ULTRACOMPACT DWARF GALAXIES BY MULTIVARIATE STATISTICAL ANALYSIS

    International Nuclear Information System (INIS)

    Chattopadhyay, Tanuka; Sharina, Margarita; Davoust, Emmanuel; De, Tuli; Chattopadhyay, Asis Kumar

    2012-01-01

    We present a statistical analysis of the properties of a large sample of dynamically hot old stellar systems, from globular clusters (GCs) to giant ellipticals, which was performed in order to investigate the origin of ultracompact dwarf galaxies (UCDs). The data were mostly drawn from Forbes et al. We recalculated some of the effective radii, computed mean surface brightnesses and mass-to-light ratios, and estimated ages and metallicities. We completed the sample with GCs of M31. We used a multivariate statistical technique (K-Means clustering), together with a new algorithm (Gap Statistics) for finding the optimum number of homogeneous sub-groups in the sample, using a total of six parameters (absolute magnitude, effective radius, virial mass-to-light ratio, stellar mass-to-light ratio, and metallicity). We found six groups. FK1 and FK5 are composed of high- and low-mass elliptical galaxies, respectively. FK3 and FK6 are composed of high-metallicity and low-metallicity objects, respectively, and both include GCs and UCDs. Two very small groups, FK2 and FK4, are composed of Local Group dwarf spheroidals. Our groups differ in their mean masses and virial mass-to-light ratios. The relations between these two parameters are also different for the various groups. The probability density distributions of metallicity for the four groups of galaxies are similar to those of the GCs and UCDs. The brightest low-metallicity GCs and UCDs tend to follow the mass-metallicity relation like elliptical galaxies. The objects of FK3 are more metal-rich per unit effective luminosity density than high-mass ellipticals.

  7. UNCOVERING THE FORMATION OF ULTRACOMPACT DWARF GALAXIES BY MULTIVARIATE STATISTICAL ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Tanuka [Department of Applied Mathematics, Calcutta University, 92 A.P.C. Road, Calcutta 700009 (India); Sharina, Margarita [Special Astrophysical Observatory, Russian Academy of Sciences, N. Arkhyz, KCh R 369167 (Russian Federation); Davoust, Emmanuel [IRAP, Universite de Toulouse, CNRS, 14 Avenue Edouard Belin, 31400 Toulouse (France); De, Tuli; Chattopadhyay, Asis Kumar, E-mail: tanuka@iucaa.ernet.in, E-mail: sme@sao.ru, E-mail: davoust@ast.obs-mip.fr, E-mail: akcstat@caluniv.ac.in [Department of Statistics, Calcutta University, 35 B.C. Road, Calcutta 700019 (India)

    2012-05-10

    We present a statistical analysis of the properties of a large sample of dynamically hot old stellar systems, from globular clusters (GCs) to giant ellipticals, which was performed in order to investigate the origin of ultracompact dwarf galaxies (UCDs). The data were mostly drawn from Forbes et al. We recalculated some of the effective radii, computed mean surface brightnesses and mass-to-light ratios, and estimated ages and metallicities. We completed the sample with GCs of M31. We used a multivariate statistical technique (K-Means clustering), together with a new algorithm (Gap Statistics) for finding the optimum number of homogeneous sub-groups in the sample, using a total of six parameters (absolute magnitude, effective radius, virial mass-to-light ratio, stellar mass-to-light ratio, and metallicity). We found six groups. FK1 and FK5 are composed of high- and low-mass elliptical galaxies, respectively. FK3 and FK6 are composed of high-metallicity and low-metallicity objects, respectively, and both include GCs and UCDs. Two very small groups, FK2 and FK4, are composed of Local Group dwarf spheroidals. Our groups differ in their mean masses and virial mass-to-light ratios. The relations between these two parameters are also different for the various groups. The probability density distributions of metallicity for the four groups of galaxies are similar to those of the GCs and UCDs. The brightest low-metallicity GCs and UCDs tend to follow the mass-metallicity relation like elliptical galaxies. The objects of FK3 are more metal-rich per unit effective luminosity density than high-mass ellipticals.

  8. Statistical analysis of management data

    CERN Document Server

    Gatignon, Hubert

    2013-01-01

    This book offers a comprehensive approach to multivariate statistical analyses. It provides theoretical knowledge of the concepts underlying the most important multivariate techniques and an overview of actual applications.

  9. Multivariate Statistical Analysis of Water Chemistry in Evaluating the Origin of Contamination in Many Devils Wash, Shiprock, New Mexico

    International Nuclear Information System (INIS)

    2012-01-01

    This report evaluates the chemistry of seep water occurring in three desert drainages near Shiprock, New Mexico: Many Devils Wash, Salt Creek Wash, and Eagle Nest Arroyo. Through the use of geochemical plotting tools and multivariate statistical analysis techniques, analytical results of samples collected from the three drainages are compared with the groundwater chemistry at a former uranium mill in the Shiprock area (the Shiprock site), managed by the U.S. Department of Energy Office of Legacy Management. The objective of this study was to determine, based on the water chemistry of the samples, if statistically significant patterns or groupings are apparent between the sample populations and, if so, whether there are any reasonable explanations for those groupings.

  10. Multivariate Statistical Analysis of Water Chemistry in Evaluating the Origin of Contamination in Many Devils Wash, Shiprock, New Mexico

    Energy Technology Data Exchange (ETDEWEB)

    None, None

    2012-12-31

    This report evaluates the chemistry of seep water occurring in three desert drainages near Shiprock, New Mexico: Many Devils Wash, Salt Creek Wash, and Eagle Nest Arroyo. Through the use of geochemical plotting tools and multivariate statistical analysis techniques, analytical results of samples collected from the three drainages are compared with the groundwater chemistry at a former uranium mill in the Shiprock area (the Shiprock site), managed by the U.S. Department of Energy Office of Legacy Management. The objective of this study was to determine, based on the water chemistry of the samples, if statistically significant patterns or groupings are apparent between the sample populations and, if so, whether there are any reasonable explanations for those groupings.

  11. Characterization of Lavandula spp. Honey Using Multivariate Techniques.

    Science.gov (United States)

    Estevinho, Leticia M; Chambó, Emerson Dechechi; Pereira, Ana Paula Rodrigues; Carvalho, Carlos Alfredo Lopes de; Toledo, Vagner de Alencar Arnaut de

    2016-01-01

    Traditionally, melissopalynological and physicochemical analyses have been the most used to determine the botanical origin of honey. However, when performed individually, these analyses may provide ambiguous results, making it difficult to discriminate between mono- and multifloral honeys. In this context, with the aim of better characterizing this beehive product, a selection of 112 Lavandula spp. monofloral honey samples from several regions were evaluated by association of multivariate statistical techniques with physicochemical, melissopalynological and phenolic compounds analysis. All honey samples fulfilled the quality standards recommended by international legislation, except regarding sucrose content and diastase activity. The content of sucrose and the percentage of Lavandula spp. pollen have a strong positive association. In fact, it was found that higher amounts of sucrose in honey are related to higher percentages of Lavandula spp. pollen. The samples were very similar for most of the physicochemical parameters, except for proline, flavonoids and phenols (bioactive factors). Concerning the pollen spectrum, the variation of Lavandula spp. pollen percentage in honey had little contribution to the formation of sample groups. The formation of two groups regarding the physicochemical parameters suggests that the presence of other pollen types in small percentages influences the factor termed "bioactive", which has been linked to diverse beneficial health effects.

  12. Generalized Tensor-Based Morphometry of HIV/AIDS Using Multivariate Statistics on Deformation Tensors

    OpenAIRE

    Lepore, Natasha; Brun, Caroline; Chou, Yi-Yu; Chiang, Ming-Chang; Dutton, Rebecca A.; Hayashi, Kiralee M.; Luders, Eileen; Lopez, Oscar L.; Aizenstein, Howard J.; Toga, Arthur W.; Becker, James T.; Thompson, Paul M.

    2008-01-01

    This paper investigates the performance of a new multivariate method for tensor-based morphometry (TBM). Statistics on Riemannian manifolds are developed that exploit the full information in deformation tensor fields. In TBM, multiple brain images are warped to a common neuroanatomical template via 3-D nonlinear registration; the resulting deformation fields are analyzed statistically to identify group differences in anatomy. Rather than study the Jacobian determinant (volume expansion factor...

  13. The iron bars from the ‘Gresham Ship’: employing multivariate statistics to further slag inclusion analysis of ferrous objects

    DEFF Research Database (Denmark)

    Birch, Thomas; Martinón-Torres, Marcos

    2015-01-01

    An assemblage of post-medieval iron bars was found with the Princes Channel wreck, salvaged from the Thames Estuary in 2003. They were recorded and studied, with a focus on metallography and slag inclusion analysis. The investigation provided an opportunity to explore the use of multivariate...... statistical techniques to analyse slag inclusion data. Cluster analysis supplemented by principal components analysis revealed two groups of iron, probably originating from different smelting systems, which were compared to those observed macroscopically and through metallography. The analyses reveal...

  14. Assessment of metals bioavailability to vegetables under field conditions using DGT, single extractions and multivariate statistics

    Science.gov (United States)

    2012-01-01

    Background: The bioavailability of metals in soils is commonly assessed by chemical extractions; however, a generally accepted method is not yet established. In this study, the effectiveness of the Diffusive Gradients in Thin-films (DGT) technique and single extractions in the assessment of metals bioaccumulation in vegetables, and the influence of soil parameters on phytoavailability, were evaluated using multivariate statistics. Soil and plants grown in vegetable gardens from mining-affected rural areas, NW Romania, were collected and analysed. Results: Pseudo-total metal content of Cu, Zn and Cd in soil ranged between 17.3-146 mg kg-1, 141-833 mg kg-1 and 0.15-2.05 mg kg-1, respectively, showing enriched contents of these elements. High degrees of metals extractability in 1M HCl and even in 1M NH4Cl were observed. Despite the relatively high total metal concentrations in soil, those found in vegetables were comparable to values typically reported for agricultural crops, probably due to the low concentrations of metals in soil solution (Csoln) and low effective concentrations (CE), assessed by the DGT technique. Among the analysed vegetables, the highest metal concentrations were found in carrot roots. By applying multivariate statistics, it was found that CE, Csoln and extraction in 1M NH4Cl were better predictors for metals bioavailability than the acid extractions applied in this study. Copper transfer to vegetables was strongly influenced by soil organic carbon (OC) and cation exchange capacity (CEC), while pH had a higher influence on Cd transfer from soil to plants. Conclusions: The results showed that DGT can be used for general evaluation of the risks associated with soil contamination with Cu, Zn and Cd in field conditions, although quantitative information on metals transfer from soil to vegetables was not observed. PMID:23079133

  15. Assessment of roadside surface water quality of Savar, Dhaka, Bangladesh using GIS and multivariate statistical techniques

    Science.gov (United States)

    Ahmed, Fahad; Fakhruddin, A. N. M.; Imam, MD. Toufick; Khan, Nasima; Abdullah, Abu Tareq Mohammad; Khan, Tanzir Ahmed; Rahman, Md. Mahfuzur; Uddin, Mohammad Nashir

    2017-11-01

    In this study, multivariate statistical techniques in combination with GIS are used to assess the roadside surface water quality of the Savar region. Nineteen water samples were collected in the dry season and 15 water quality parameters including TSS, TDS, pH, DO, BOD, Cl-, F-, NO3 2-, NO2 -, SO4 2-, Ca, Mg, K, Zn and Pb were measured. The univariate overview of the water quality parameters is TSS 25.154 ± 8.674 mg/l, TDS 840.400 ± 311.081 mg/l, pH 7.574 ± 0.256 pH unit, DO 4.544 ± 0.933 mg/l, BOD 0.758 ± 0.179 mg/l, Cl- 51.494 ± 28.095 mg/l, F- 0.771 ± 0.153 mg/l, NO3 2- 2.211 ± 0.878 mg/l, NO2 - 4.692 ± 5.971 mg/l, SO4 2- 69.545 ± 53.873 mg/l, Ca 48.458 ± 22.690 mg/l, Mg 19.676 ± 7.361 mg/l, K 12.874 ± 11.382 mg/l, Zn 0.027 ± 0.029 mg/l, Pb 0.096 ± 0.154 mg/l. The water quality data were subjected to R-mode PCA, which resulted in five major components. PC1 explains 28% of total variance and indicates roadside and brick-field dust settling (TDS, TSS) in the nearby water bodies. PC2 explains 22.123% of total variance and indicates the agricultural influence (K, Ca, and NO2 -). PC3 describes the contribution of nonpoint pollution from agricultural and soil erosion processes (SO4 2-, Cl-, and K). PC4 is heavily positively loaded by vehicle emissions and diffusion from battery stores (Zn, Pb). PC5 shows strong positive loading of BOD and strong negative loading of pH. Cluster analysis yields three major clusters for both water parameters and sampling sites. The site-based clusters showed a grouping pattern similar to that of the R-mode factor score map. The present work reveals a new scope to monitor the roadside water quality for future research in Bangladesh.

  16. Generalized tensor-based morphometry of HIV/AIDS using multivariate statistics on deformation tensors.

    Science.gov (United States)

    Lepore, N; Brun, C; Chou, Y Y; Chiang, M C; Dutton, R A; Hayashi, K M; Luders, E; Lopez, O L; Aizenstein, H J; Toga, A W; Becker, J T; Thompson, P M

    2008-01-01

    This paper investigates the performance of a new multivariate method for tensor-based morphometry (TBM). Statistics on Riemannian manifolds are developed that exploit the full information in deformation tensor fields. In TBM, multiple brain images are warped to a common neuroanatomical template via 3-D nonlinear registration; the resulting deformation fields are analyzed statistically to identify group differences in anatomy. Rather than study the Jacobian determinant (volume expansion factor) of these deformations, as is common, we retain the full deformation tensors and apply a manifold version of Hotelling's T² test to them, in a Log-Euclidean domain. In 2-D and 3-D magnetic resonance imaging (MRI) data from 26 HIV/AIDS patients and 14 matched healthy subjects, we compared multivariate tensor analysis versus univariate tests of simpler tensor-derived indices: the Jacobian determinant, the trace, geodesic anisotropy, and eigenvalues of the deformation tensor, and the angle of rotation of its eigenvectors. We detected consistent, but more extensive patterns of structural abnormalities, with multivariate tests on the full tensor manifold. Their improved power was established by analyzing cumulative p-value plots using false discovery rate (FDR) methods, appropriately controlling for false positives. This increased detection sensitivity may empower drug trials and large-scale studies of disease that use tensor-based morphometry.

  17. An Improvement of the Hotelling T2 Statistic in Monitoring Multivariate Quality Characteristics

    Directory of Open Access Journals (Sweden)

    Ashkan Shabbak

    2012-01-01

    The Hotelling T2 statistic is the most popular statistic used in multivariate control charts to monitor multiple qualities. However, this statistic is easily affected by the existence of more than one outlier in the data set. To rectify this problem, robust control charts, which are based on the minimum volume ellipsoid and the minimum covariance determinant, have been proposed. Most researchers assess the performance of multivariate control charts based on the number of signals without paying much attention to whether those signals are really outliers. With due respect, we propose to evaluate control charts not only based on the number of detected outliers but also with respect to their correct positions. In this paper, an Upper Control Limit based on the median and the median absolute deviation is also proposed. The results of this study signify that the proposed Upper Control Limit improves the detection of correct outliers but that it suffers from a swamping effect when the positions of outliers are not taken into consideration. Finally, a robust control chart based on the diagnostic robust generalised potential procedure is introduced to remedy this drawback.
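
    As a rough illustration of the quantities named in this abstract, the sketch below computes the classical Hotelling T2 value of each observation and a control limit built from the median and the median absolute deviation (MAD) of those values. The abstract does not give the limit's formula, so the form `median + k * MAD`, the default `k = 3.0` and the function names are assumptions made for illustration only; the T2 values here use the ordinary (non-robust) mean and covariance.

```python
import numpy as np

def hotelling_t2(X):
    """Classical Hotelling T2 value of every row of X, using the sample
    mean vector and the sample covariance matrix (ddof = 1)."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Quadratic form centred_i' * cov_inv * centred_i for each row i.
    return np.einsum("ij,jk,ik->i", centred, cov_inv, centred)

def median_mad_ucl(t2, k=3.0):
    """Hypothetical upper control limit: median + k * MAD of the T2 values.
    Both the functional form and k are assumptions, not the paper's formula."""
    t2 = np.asarray(t2, dtype=float)
    med = np.median(t2)
    mad = np.median(np.abs(t2 - med))
    return med + k * mad
```

    Observations whose T2 value exceeds the limit would be flagged as signals; because the median and MAD are little affected by a few extreme values, such a limit is less distorted by outliers than one based on the mean and standard deviation.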

  18. Study on loss detection algorithms for tank monitoring data using multivariate statistical analysis

    International Nuclear Information System (INIS)

    Suzuki, Mitsutoshi; Burr, Tom

    2009-01-01

    Evaluation of solution monitoring data to support material balance evaluation was proposed about a decade ago because of concerns regarding the large throughput planned at Rokkasho Reprocessing Plant (RRP). A numerical study using the simulation code (FACSIM) was done and significant increases in the detection probabilities (DP) for certain types of losses were shown. To be accepted internationally, it is very important to verify such claims using real solution monitoring data. However, a demonstrative study with real tank data has not been carried out due to the confidentiality of the tank data. This paper describes an experimental study that has been started using actual data from the Solution Measurement and Monitoring System (SMMS) in the Tokai Reprocessing Plant (TRP) and the Savannah River Site (SRS). Multivariate statistical methods, such as a vector cumulative sum and a multi-scale statistical analysis, have been applied to the real tank data that have superimposed simulated loss. Although quantitative conclusions have not been derived for the moment due to the difficulty of baseline evaluation, the multivariate statistical methods remain promising for abrupt and some types of protracted loss detection. (author)

  19. Application of Multivariate Statistical Analysis to Biomarkers in Se-Turkey Crude Oils

    Science.gov (United States)

    Gürgey, K.; Canbolat, S.

    2017-11-01

    Twenty-four crude oil samples were collected from 24 oil fields distributed in different districts of SE Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, and hence the number of source rocks present in SE Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to a data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, subsurface stratigraphy as well as geology of the area.

  20. APPLICATION OF MULTIVARIATE STATISTICAL ANALYSIS TO BIOMARKERS IN SE-TURKEY CRUDE OILS

    Directory of Open Access Journals (Sweden)

    K. Gürgey

    2017-11-01

    Twenty-four crude oil samples were collected from 24 oil fields distributed in different districts of SE Turkey. API and Sulphur content (%), Stable Carbon Isotope, Gas Chromatography (GC), and Gas Chromatography-Mass Spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, and hence the number of source rocks present in SE Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to a data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups: Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils, in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, subsurface stratigraphy as well as geology of the area.

  1. Multivariate statistical analysis of atom probe tomography data

    International Nuclear Information System (INIS)

    Parish, Chad M.; Miller, Michael K.

    2010-01-01

    The application of spectrum imaging multivariate statistical analysis methods, specifically principal component analysis (PCA), to atom probe tomography (APT) data has been investigated. The mathematical method of analysis is described and the results for two example datasets are analyzed and presented. The first dataset is from the analysis of a PM 2000 Fe-Cr-Al-Ti steel containing two different ultrafine precipitate populations. PCA properly describes the matrix and precipitate phases in a simple and intuitive manner. A second APT example is from the analysis of an irradiated reactor pressure vessel steel. Fine, nm-scale Cu-enriched precipitates having a core-shell structure were identified and qualitatively described by PCA. Advantages, disadvantages, and future prospects for implementing these data analysis methodologies for APT datasets, particularly with regard to quantitative analysis, are also discussed.
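
    For readers unfamiliar with the technique, principal component analysis of a data matrix can be sketched as below. This is a generic mean-centred PCA via the singular value decomposition, not the procedure of the paper; the function name `pca` and the layout of rows as observations (for example voxels) and columns as variables (for example spectral channels) are assumptions, and any noise scaling that a spectrum-imaging workflow might apply beforehand is omitted.

```python
import numpy as np

def pca(X, n_components):
    """Mean-centred PCA via SVD.

    Returns (scores, loadings, explained_ratio) for the leading components:
    scores has one row per observation, loadings one row per component, and
    explained_ratio is the fraction of total variance carried by each component.
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    explained_ratio = s**2 / np.sum(s**2)
    return scores, loadings, explained_ratio[:n_components]
```

    Multiplying the scores by the loadings and adding back the column means gives the rank-limited reconstruction of the data, which is why a few components can summarise a data set whose variance is concentrated in a small number of directions.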

  2. Statistical Techniques for Project Control

    CERN Document Server

    Badiru, Adedeji B

    2012-01-01

    A project can be simple or complex. In each case, proven project management processes must be followed. In all cases of project management implementation, control must be exercised in order to assure that project objectives are achieved. Statistical Techniques for Project Control seamlessly integrates qualitative and quantitative tools and techniques for project control. It fills the void that exists in the application of statistical techniques to project control. The book begins by defining the fundamentals of project management then explores how to temper quantitative analysis with qualitati

  3. Multivariate Statistical Process Control Charts and the Problem of Interpretation: A Short Overview and Some Applications in Industry

    OpenAIRE

    Bersimis, Sotiris; Panaretos, John; Psarakis, Stelios

    2005-01-01

    Woodall and Montgomery [35] in a discussion paper, state that multivariate process control is one of the most rapidly developing sections of statistical process control. Nowadays, in industry, there are many situations in which the simultaneous monitoring or control, of two or more related quality - process characteristics is necessary. Process monitoring problems in which several related variables are of interest are collectively known as Multivariate Statistical Process Control (MSPC).This ...

  4. Pattern recognition by the use of multivariate statistical evaluation of macro- and micro-PIXE results

    International Nuclear Information System (INIS)

    Tapper, U.A.S.; Malmqvist, K.G.; Loevestam, N.E.G.; Swietlicki, E.; Salford, L.G.

    1991-01-01

    The importance of statistical evaluation of multielemental data is illustrated using the data collected in a macro- and micro-PIXE analysis of human brain tumours. By employing a multivariate statistical classification methodology (SIMCA) it was shown that the total information collected from each specimen separates three types of tissue: High malignant, less malignant and normal brain tissue. This makes a classification of a given specimen possible based on the elemental concentrations. Partial least squares regression (PLS), a multivariate regression method, made it possible to study the relative importance of the examined nine trace elements, the dry/wet weight ratio and the age of the patient in predicting the survival time after operation for patients with the high malignant form, astrocytomas grade III-IV. The elemental maps from a microprobe analysis were also subjected to multivariate analysis. This showed that the six elements sorted into maps could be presented in three maps containing all the relevant information. The intensity in these maps is proportional to the value (score) of the actual pixel along the calculated principal components. (orig.)

  5. The Inappropriate Symmetries of Multivariate Statistical Analysis in Geometric Morphometrics.

    Science.gov (United States)

    Bookstein, Fred L

    In today's geometric morphometrics the commonest multivariate statistical procedures, such as principal component analysis or regressions of Procrustes shape coordinates on Centroid Size, embody a tacit roster of symmetries -axioms concerning the homogeneity of the multiple spatial domains or descriptor vectors involved-that do not correspond to actual biological fact. These techniques are hence inappropriate for any application regarding which we have a-priori biological knowledge to the contrary (e.g., genetic/morphogenetic processes common to multiple landmarks, the range of normal in anatomy atlases, the consequences of growth or function for form). But nearly every morphometric investigation is motivated by prior insights of this sort. We therefore need new tools that explicitly incorporate these elements of knowledge, should they be quantitative, to break the symmetries of the classic morphometric approaches. Some of these are already available in our literature but deserve to be known more widely: deflated (spatially adaptive) reference distributions of Procrustes coordinates, Sewall Wright's century-old variant of factor analysis, the geometric algebra of importing explicit biomechanical formulas into Procrustes space. Other methods, not yet fully formulated, might involve parameterized models for strain in idealized forms under load, principled approaches to the separation of functional from Brownian aspects of shape variation over time, and, in general, a better understanding of how the formalism of landmarks interacts with the many other approaches to quantification of anatomy. To more powerfully organize inferences from the high-dimensional measurements that characterize so much of today's organismal biology, tomorrow's toolkit must rely neither on principal component analysis nor on the Procrustes distance formula, but instead on sound prior biological knowledge as expressed in formulas whose coefficients are not all the same. 
I describe the problems

  6. The Effect of the Multivariate Box-Cox Transformation on the Power of MANOVA.

    Science.gov (United States)

    Kirisci, Levent; Hsu, Tse-Chi

    Most of the multivariate statistical techniques rely on the assumption of multivariate normality. The effects of non-normality on multivariate tests are assumed to be negligible when variance-covariance matrices and sample sizes are equal. Therefore, in practice, investigators do not usually attempt to remove non-normality. In this simulation…

  7. A New Iteration Multivariate Padé Approximation Technique for ...

    African Journals Online (AJOL)

    In this paper, the Laplace transform, the New iteration method and the Multivariate Padé approximation technique are employed to solve nonlinear fractional partial differential equations whose fractional derivatives are described in the sense of Caputo. The Laplace transform is used to "fully" determine the initial iteration ...

  8. Multivariate statistical analysis of electron energy-loss spectroscopy in anisotropic materials

    International Nuclear Information System (INIS)

    Hu Xuerang; Sun Yuekui; Yuan Jun

    2008-01-01

    Recently, an expression has been developed to take into account the complex dependence of the fine structure in core-level electron energy-loss spectroscopy (EELS) in anisotropic materials on specimen orientation and spectral collection conditions [Y. Sun, J. Yuan, Phys. Rev. B 71 (2005) 125109]. One application of this expression is the development of a phenomenological theory of magic-angle electron energy-loss spectroscopy (MAEELS), which can be used to extract the isotropically averaged spectral information for materials with arbitrary anisotropy. Here we use this expression to extract not only the isotropically averaged spectral information, but also the anisotropic spectral components, without the restriction of MAEELS. The application is based on a multivariate statistical analysis of core-level EELS for anisotropic materials. To demonstrate the applicability of this approach, we have conducted a study on a set of carbon K-edge spectra of multi-wall carbon nanotube (MWCNT) acquired with energy-loss spectroscopic profiling (ELSP) technique and successfully extracted both the averaged and dichroic spectral components of the wrapped graphite-like sheets. Our result shows that this can be a practical alternative to MAEELS for the study of electronic structure of anisotropic materials, in particular for those nanostructures made of layered materials

  9. Multivariate Variables Recognition using Hotelling’s T2 and MEWMA via ANN’s

    Directory of Open Access Journals (Sweden)

    Chiñas-Sánchez Pamela

    2014-01-01

    In this article, a method for multivariate pattern recognition using artificial neural networks (ANNs) is proposed. The method is useful for monitoring multiple variables during statistical process control. It employs descriptive statistics and multivariate control techniques. Three different ANNs are evaluated to identify the network with the highest efficiency in recognizing patterns of multivariate variables from databases. Two databases are analyzed: the first is generated by simulation using the Monte Carlo method, and the second was obtained from a public database repository. The method consists of three stages: multivariate variable generation, multivariate analysis and pattern recognition using ANNs. Several multivariate scenarios were generated using combinations of 2, 3 and 4 patterns in multivariate variables for the Hotelling’s T² and MEWMA statistics, which were analyzed to understand their behavior and to determine their statistical characteristics. The pattern recognition task was evaluated using the ANNs. In both case studies, experimental results showed improved efficiency when using the Perceptron and Backpropagation networks compared with the RBF network.
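
    The Hotelling's T² statistic named in this abstract can be illustrated with a minimal sketch. It assumes a two-variable process and uses made-up reference data; the function names are hypothetical and this is not the authors' implementation. The statistic is T² = (p - m)ᵀ S⁻¹ (p - m), where m is the reference mean and S the sample covariance matrix.

```python
# Illustrative sketch: Hotelling's T^2 for one new observation of a
# two-variable process. All data below are made up for demonstration.

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def hotelling_t2(x_ref, y_ref, point):
    """T^2 = (p - m)^T S^{-1} (p - m), using the closed-form 2x2 inverse."""
    mx, my = mean(x_ref), mean(y_ref)
    sxx = covariance(x_ref, x_ref)
    syy = covariance(y_ref, y_ref)
    sxy = covariance(x_ref, y_ref)
    det = sxx * syy - sxy * sxy  # must be non-zero (non-degenerate data)
    dx, dy = point[0] - mx, point[1] - my
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical in-control reference measurements of two quality variables.
x_ref = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
y_ref = [5.0, 5.1, 4.9, 5.1, 4.9, 5.0]

t2_center = hotelling_t2(x_ref, y_ref, (10.0, 5.0))   # at the reference mean
t2_shifted = hotelling_t2(x_ref, y_ref, (10.4, 4.7))  # shifted observation
print(t2_center, t2_shifted)
```

    An observation at the reference mean gives a T² close to zero, while an observation that departs from the mean against the correlation structure gives a large value. In a control chart, each T² would be compared with an upper control limit, which is not computed here.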

  10. Multivariate statistical techniques for the evaluation of surface water quality of the Himalayan foothills streams, Pakistan

    Science.gov (United States)

    Malik, Riffat Naseem; Hashmi, Muhammad Zaffar

    2017-10-01

    The Himalayan foothills streams of Pakistan play an important role in water supply for daily living and in the irrigation of farmlands; thus, their water quality is closely related to public health. Multivariate techniques were applied to examine spatial and seasonal trends and the sources of metal contamination in these streams. Grab surface water samples were collected from different sites (5-15 cm water depth) in pre-washed polyethylene containers. A Fast Sequential Atomic Absorption Spectrophotometer (Varian FSAA-240) was used to measure metal concentrations. Concentrations of Ni, Cu, and Mn were higher in the pre-monsoon season than in the post-monsoon season. Cluster analysis identified impaired, moderately impaired and least impaired clusters based on water parameters. Discriminant function analysis indicated that spatial variability in water quality was due to temperature, electrical conductivity, nitrates, iron and lead, whereas seasonal variations were correlated with 16 physicochemical parameters. Factor analysis identified municipal and poultry waste, automobile activities, surface runoff, and soil weathering as major sources of contamination. Levels of Mn, Cr, Fe, Pb, Cd, Zn and alkalinity were above the WHO and USEPA standards for surface water. The results of the present study will help higher authorities manage the Himalayan foothills streams.

  11. Statistical and Computational Techniques in Manufacturing

    CERN Document Server

    2012-01-01

    In recent years, interest in developing statistical and computational techniques for applied manufacturing engineering has increased. Today, due to the great complexity of manufacturing engineering and the high number of parameters used, conventional approaches are no longer sufficient. Therefore, in manufacturing, statistical and computational techniques have found several applications, namely, modelling and simulation of manufacturing processes, optimization of manufacturing parameters, monitoring and control, computer-aided process planning, etc. The present book aims to provide recent information on statistical and computational techniques applied in manufacturing engineering. The content is suitable for final undergraduate engineering courses or as a subject on manufacturing at the postgraduate level. This book serves as a useful reference for academics, statistical and computational science researchers, mechanical, manufacturing and industrial engineers, and professionals in industries related to manu...

  12. Multivariate pattern dependence.

    Directory of Open Access Journals (Sweden)

    Stefano Anzellotti

    2017-11-01

    When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine-grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD): a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS) and to the fusiform face area (FFA), using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity.

  13. Comparative Estimation of Russia’s Regions Investment Potential on the Base of the Multivariate Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Victor V. Nikitin

    2013-01-01

    The article introduces an algorithm for estimating the investment potential of Russia’s regions, developed by means of multivariate statistical methods, and determines the factors reflecting the regions’ investment state. An integral indicator was developed on their basis, using statistical data. The article presents a classification of the regions on the basis of the integral index.

  14. Relating N2O emissions during biological nitrogen removal with operating conditions using multivariate statistical techniques.

    Science.gov (United States)

    Vasilaki, V; Volcke, E I P; Nandi, A K; van Loosdrecht, M C M; Katsou, E

    2018-04-26

    Multivariate statistical analysis was applied to investigate the dependencies and underlying patterns between N2O emissions and online operational variables (dissolved oxygen and nitrogen component concentrations, temperature and influent flow-rate) during biological nitrogen removal from wastewater. The system under study was a full-scale reactor, for which hourly sensor data were available. The 15-month long monitoring campaign was divided into 10 sub-periods based on the profile of N2O emissions, using Binary Segmentation. The dependencies between operating variables and N2O emissions fluctuated according to Spearman's rank correlation. The correlation between N2O emissions and nitrite concentrations ranged between 0.51 and 0.78. Correlation >0.7 between N2O emissions and nitrate concentrations was observed at sub-periods with average temperature lower than 12 °C. Hierarchical k-means clustering and principal component analysis linked N2O emission peaks with precipitation events and ammonium concentrations higher than 2 mg/L, especially in sub-periods characterized by low N2O fluxes. Additionally, the highest ranges of measured N2O fluxes belonged to clusters corresponding with NO3-N concentration less than 1 mg/L in the upstream plug-flow reactor (middle of oxic zone), indicating slow nitrification rates. The results showed that the range of N2O emissions partially depends on the prior behavior of the system. The principal component analysis validated the findings from the clustering analysis and showed that ammonium, nitrate, nitrite and temperature explained a considerable percentage of the variance in the system for the majority of the sub-periods. The applied statistical methods linked the different ranges of emissions with the system variables, provided insights on the effect of operating conditions on N2O emissions in each sub-period, and can be integrated into N2O emissions data processing at wastewater treatment plants.
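
    Spearman's rank correlation, used in this abstract to relate operating variables to emissions, can be sketched as the Pearson correlation of the ranked values. The sketch below is a hedged illustration using only the Python standard library; the readings are hypothetical and the function names are not taken from the study.

```python
# Illustrative sketch: Spearman's rank correlation as the Pearson
# correlation of average ranks. The sensor readings are made up.

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def spearman(xs, ys):
    return pearson(ranks(xs), ranks(ys))

# Hypothetical hourly readings within one sub-period.
nitrite = [0.1, 0.4, 0.3, 0.8, 0.6, 1.2]
n2o = [1.0, 2.5, 2.0, 9.0, 3.5, 30.0]
rho = spearman(nitrite, n2o)
print(rho)
```

    Because the statistic depends only on ranks, a monotonic but strongly non-linear relationship such as the one in the made-up data still yields a high correlation.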

  15. A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series

    Directory of Open Access Journals (Sweden)

    Charmaine Demanuele

    2015-10-01

    Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from fMRI blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel

  16. Multivariate statistical tools for the radiometric features of volcanic islands

    International Nuclear Information System (INIS)

    Basile, S.; Brai, M.; Marrale, M.; Micciche, S.; Lanzo, G.; Rizzo, S.

    2009-01-01

    The Aeolian Islands represent a Quaternary volcanic arc related to the subduction of the Ionian plate beneath the Calabrian Arc. The geochemical variability of the islands has led to a broad spectrum of magma rocks. Volcanic products from calc-alkaline (CA) to calc-alkaline high in potassium (HKCA) are present throughout the Archipelago, but products belonging to the shoshonitic (SHO) and potassium (KS) series characterize the southern portion of Lipari, Vulcano and Stromboli. Tectonics also plays an important role in the differentiation of the islands. In this work, we review and cross-analyze the data on Lipari, Stromboli and Vulcano collected in measurement and sampling campaigns over recent years. Chemical data were obtained by X-ray fluorescence. High resolution gamma-ray spectrometry with germanium detectors was used to measure primordial radionuclide activities. The activity of primordial radionuclides in the volcanic products of these three islands is strongly dependent on their chemism. The highest contents are found in more differentiated products (rhyolites). The CA products have lower concentrations, while the HKCA and shoshonitic product concentrations are in between. Calculated dose rates have been correlated with the petrochemical features in order to gain further insight into the evolution and differentiation of volcanic products. The ratio matching technique and multivariate statistical analyses, such as Principal Component Analysis and Minimum Spanning Tree, have been applied as an additional tool to better describe the lithological affinities of the samples. (Author)

  17. Seasonal rationalization of river water quality sampling locations: a comparative study of the modified Sanders and multivariate statistical approaches.

    Science.gov (United States)

    Varekar, Vikas; Karmakar, Subhankar; Jha, Ramakar

    2016-02-01

    The design of surface water quality sampling locations is a crucial decision-making process for the rationalization of a monitoring network. The quantity, quality, and types of available dataset (watershed characteristics and water quality data) may affect the selection of an appropriate design methodology. The modified Sanders approach and multivariate statistical techniques [particularly factor analysis (FA)/principal component analysis (PCA)] are well-accepted and widely used techniques for the design of sampling locations. However, their performance may vary significantly with the quantity, quality, and types of available dataset. In this paper, an attempt has been made to evaluate the performance of these techniques by accounting for the effect of seasonal variation, under a situation of limited water quality data but extensive watershed characteristics information, as continuous and consistent river water quality data are usually difficult to obtain, whereas watershed information may be made available through the application of geospatial techniques. A case study of the Kali River, Western Uttar Pradesh, India, is selected for the analysis. The monitoring was carried out at 16 sampling locations. The discrete and diffuse pollution loads at different sampling sites were estimated and accounted for using the modified Sanders approach, whereas the monitored physical and chemical water quality parameters were utilized as inputs for FA/PCA. The designed optimum numbers of sampling locations for the monsoon and non-monsoon seasons are eight and seven by the modified Sanders approach and eleven and nine by FA/PCA, respectively. Little variation in the number and locations of designed sampling sites was obtained by both techniques, which shows the stability of the results. A geospatial analysis has also been carried out to check the significance of the designed sampling locations with respect to river basin characteristics and land use of the study area. Both methods are equally efficient; however, modified Sanders

  18. Batch-to-batch quality consistency evaluation of botanical drug products using multivariate statistical analysis of the chromatographic fingerprint.

    Science.gov (United States)

    Xiong, Haoshu; Yu, Lawrence X; Qu, Haibin

    2013-06-01

    Botanical drug products have batch-to-batch quality variability due to botanical raw materials and the current manufacturing process. The rational evaluation and control of product quality consistency are essential to ensure efficacy and safety. Chromatographic fingerprinting is an important and widely used tool to characterize the chemical composition of botanical drug products. Multivariate statistical analysis has shown its efficacy and applicability in the quality evaluation of many kinds of industrial products. In this paper, the combined use of multivariate statistical analysis and chromatographic fingerprinting is presented to evaluate batch-to-batch quality consistency of botanical drug products. A typical botanical drug product in China, Shenmai injection, was selected as the example to demonstrate the feasibility of this approach. The high-performance liquid chromatographic fingerprint data of historical batches were collected from a traditional Chinese medicine manufacturing factory. Characteristic peaks were weighted by their variability among production batches. A principal component analysis model was established after outliers were modified or removed. Multivariate (Hotelling T² and DModX) control charts were then successfully applied to evaluate the quality consistency. The results suggest useful applications for a combination of multivariate statistical analysis with chromatographic fingerprinting in batch-to-batch quality consistency evaluation for the manufacture of botanical drug products.

  19. Multivariate statistical modelling based on generalized linear models

    CERN Document Server

    Fahrmeir, Ludwig

    1994-01-01

    This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...

  20. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    Science.gov (United States)

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm⁻¹); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm⁻¹)) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R² = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R² = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R² = 0.998) independent of laser excitation power variations. This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in
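
    The leave-one-subject-out cross-validation and RMSECV figures quoted in this abstract can be illustrated with a minimal sketch. It swaps the PLS model for an ordinary least-squares line on a single predictor, purely to keep the example short; the records, subject labels and function names are hypothetical.

```python
# Illustrative sketch: leave-one-subject-out cross-validation with a
# simple least-squares line standing in for the PLS model (an assumption
# made for brevity). All records below are made up.

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def rmsecv_leave_one_subject_out(records):
    """records: list of (subject_id, predictor, target) tuples."""
    subjects = sorted({subject for subject, _, _ in records})
    squared_errors = []
    for held_out in subjects:
        # All spectra of the held-out subject are excluded from training.
        train = [(x, y) for s, x, y in records if s != held_out]
        test = [(x, y) for s, x, y in records if s == held_out]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        squared_errors.extend((slope * x + intercept - y) ** 2 for x, y in test)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5

# Hypothetical (subject, reference-peak intensity, laser power) records.
exact = [("A", 1, 3), ("A", 2, 5), ("B", 3, 7),
         ("B", 4, 9), ("C", 5, 11), ("C", 6, 13)]
noisy = [("A", 1, 3.2), ("A", 2, 4.7), ("B", 3, 7.4),
         ("B", 4, 8.8), ("C", 5, 11.3), ("C", 6, 12.6)]

rmsecv_exact = rmsecv_leave_one_subject_out(exact)
rmsecv_noisy = rmsecv_leave_one_subject_out(noisy)
print(rmsecv_exact, rmsecv_noisy)
```

    Holding out every measurement from one subject at a time, rather than single spectra, keeps spectra from the same subject out of both the training and test sets, so the error estimate reflects performance on unseen subjects.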

  1. Multivariate analysis with LISREL

    CERN Document Server

    Jöreskog, Karl G; Y Wallentin, Fan

    2016-01-01

    This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of the example. The book is intended for masters and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.

  2. Visual classification of very fine-grained sediments: Evaluation through univariate and multivariate statistics

    Science.gov (United States)

    Hohn, M. Ed; Nuhfer, E.B.; Vinopal, R.J.; Klanderman, D.S.

    1980-01-01

    Classifying very fine-grained rocks through fabric elements provides information about depositional environments, but is subject to the biases of visual taxonomy. To evaluate the statistical significance of an empirical classification of very fine-grained rocks, samples from Devonian shales in four cored wells in West Virginia and Virginia were measured for 15 variables: quartz, illite, pyrite and expandable clays determined by X-ray diffraction; total sulfur, organic content, inorganic carbon, matrix density, bulk density, porosity, silt, as well as density, sonic travel time, resistivity, and γ-ray response measured from well logs. The four lithologic types comprised: (1) sharply banded shale, (2) thinly laminated shale, (3) lenticularly laminated shale, and (4) nonbanded shale. Univariate and multivariate analyses of variance showed that the lithologic classification reflects significant differences for the variables measured, differences that can be detected independently of stratigraphic effects. Little-known statistical methods found useful in this work included: the multivariate analysis of variance with more than one effect, simultaneous plotting of samples and variables on canonical variates, and the use of parametric ANOVA and MANOVA on ranked data. © 1980 Plenum Publishing Corporation.
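
    The idea of running a parametric ANOVA on ranked data, mentioned at the end of this abstract, can be sketched as follows: pool the observations, replace each value by its rank, and compute the usual one-way F statistic on the ranks. This is a hedged illustration with made-up porosity values for three hypothetical lithologic groups, not a reconstruction of the authors' analysis.

```python
# Illustrative sketch: one-way ANOVA F statistic computed on pooled ranks.
# Group values are made up; function names are hypothetical.

def average_ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    pooled = [v for g in groups for v in g]
    grand_mean = sum(pooled) / len(pooled)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def ranked_anova_f(groups):
    """Rank all observations together, then apply the parametric F statistic."""
    pooled_ranks = average_ranks([v for g in groups for v in g])
    ranked_groups, start = [], 0
    for g in groups:
        ranked_groups.append(pooled_ranks[start:start + len(g)])
        start += len(g)
    return one_way_f(ranked_groups)

# Hypothetical porosity values for three lithologic types.
groups = [[2.1, 2.4, 2.2], [3.0, 3.3, 3.1], [4.5, 4.1, 4.8]]
f_ranked = ranked_anova_f(groups)
print(f_ranked)
```

    Working on ranks reduces the influence of outliers and skewed distributions while still allowing the familiar ANOVA machinery to be used. The significance test against an F distribution is omitted here.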

  3. Assessment of Reservoir Water Quality Using Multivariate Statistical Techniques: A Case Study of Qiandao Lake, China

    Directory of Open Access Journals (Sweden)

    Qing Gu

    2016-03-01

    Qiandao Lake (Xin’an Jiang reservoir) plays a significant role in drinking water supply for eastern China, and it is an attractive tourist destination. Three multivariate statistical methods were comprehensively applied to assess the spatial and temporal variations in water quality as well as potential pollution sources in Qiandao Lake. Data sets of nine parameters from 12 monitoring sites during 2010–2013 were obtained for analysis. Cluster analysis (CA) was applied to classify the 12 sampling sites into three groups (Groups A, B and C) and the 12 monitoring months into two clusters (April-July, and the remaining months). Discriminant analysis (DA) identified Secchi disc depth, dissolved oxygen, permanganate index and total phosphorus as the significant variables for distinguishing variations of different years, with 79.9% correct assignments. Dissolved oxygen, pH and chlorophyll-a were determined to discriminate between the two sampling periods classified by CA, with 87.8% correct assignments. For spatial variation, DA identified Secchi disc depth and ammonia nitrogen as the significant discriminating parameters, with 81.6% correct assignments. Principal component analysis (PCA) identified organic pollution, nutrient pollution, domestic sewage, and agricultural and surface runoff as the primary pollution sources, explaining 84.58%, 81.61% and 78.68% of the total variance in Groups A, B and C, respectively. These results demonstrate the effectiveness of integrated use of CA, DA and PCA for reservoir water quality evaluation and could assist managers in improving water resources management.
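
    The "percentage of total variance explained" reported for PCA in this abstract can be illustrated with a small sketch. It assumes only two standardized variables, so that the eigenvalues of the 2x2 covariance matrix have a closed form; the parameter values and function names are hypothetical, and the study itself used nine parameters.

```python
# Illustrative sketch: share of variance explained by each principal
# component for two standardized variables. Data are made up.
import math

def zscores(values):
    """Standardize to zero mean and unit sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

def pca_explained_variance_2d(xs, ys):
    """Return the variance ratios of the first and second components."""
    zx, zy = zscores(xs), zscores(ys)
    n = len(zx)
    sxx = sum(a * a for a in zx) / (n - 1)
    syy = sum(b * b for b in zy) / (n - 1)
    sxy = sum(a * b for a, b in zip(zx, zy)) / (n - 1)
    trace = sxx + syy
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are (trace +/- gap) / 2.
    gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy * sxy)
    return ((trace + gap) / 2) / trace, ((trace - gap) / 2) / trace

# Hypothetical monthly values of two water quality parameters.
total_phosphorus = [0.02, 0.03, 0.05, 0.04, 0.06]
chlorophyll_a = [2.1, 2.9, 5.2, 3.8, 6.1]

pc1, pc2 = pca_explained_variance_2d(total_phosphorus, chlorophyll_a)
print(pc1, pc2)
```

    With more than two variables the same ratios come from the eigenvalues of the full correlation matrix, which would normally be computed with a numerical linear algebra library rather than in closed form.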

  4. Use of multivariate statistical tool for data processing in the analysis of Cu, Cr, Fe, Pb, Mo and Mg in lubricating oil by LIBS

    International Nuclear Information System (INIS)

    Alves, Luana F.N.; Sarkis, Jorge E.S.; Bordon, Isabela C.A.C.

    2015-01-01

    Analysis of industrial lubricants is widely used for monitoring and predicting maintenance requirements in a broad range of mechanical systems. Laser-induced breakdown spectroscopy (LIBS) was used to evaluate the potential of the technique for the determination of metals in lubricating oils. Prior to quantitative analysis, the LIBS system was calibrated using standard samples containing the elements investigated (Cu, Cr, Fe, Pb, Mo and Mg). This study presents the usefulness of multivariate statistical techniques for the evaluation and interpretation of large, complex data sets in order to obtain more information about how the concentration of metals in lubricating oils is related to engine wear. (author)

  5. Applicability of statistical process control techniques

    NARCIS (Netherlands)

    Schippers, W.A.J.

    1998-01-01

    This paper concerns the application of Process Control Techniques (PCTs) for the improvement of the technical performance of discrete production processes. Successful applications of these techniques, such as Statistical Process Control Techniques (SPC), can be found in the literature. However, some

  6. Rapid thyroid dysfunction screening based on serum surface-enhanced Raman scattering and multivariate statistical analysis

    Science.gov (United States)

    Tian, Dayong; Lü, Guodong; Zhai, Zhengang; Du, Guoli; Mo, Jiaqing; Lü, Xiaoyi

    2018-01-01

    In this paper, serum surface-enhanced Raman scattering and multivariate statistical analysis are used to investigate a rapid screening technique for thyroid function diseases. At present, the detection of thyroid function has become increasingly important, and it is urgently necessary to develop a rapid and portable method for the detection of thyroid function. Our experimental results show that, by using the Silmeco-based enhanced Raman signal, the signal strength greatly increases and the characteristic peak appears obviously. It is also observed that the Raman spectra of normal and anomalous thyroid function human serum are significantly different. Principal component analysis (PCA) combined with linear discriminant analysis (LDA) was used to diagnose thyroid dysfunction, and the diagnostic accuracy was 87.4%. The use of serum surface-enhanced Raman scattering technology combined with PCA-LDA shows good diagnostic performance for the rapid detection of thyroid function. By means of Raman technology, it is expected that a portable device for the rapid detection of thyroid function will be developed.

  7. EXPLORATORY DATA ANALYSIS AND MULTIVARIATE STRATEGIES FOR REVEALING MULTIVARIATE STRUCTURES IN CLIMATE DATA

    Directory of Open Access Journals (Sweden)

    2016-12-01

    This paper concerns data analysis strategy in a complex, multidimensional, and dynamic domain. The focus is on the use of data mining techniques to explore the importance of multivariate structures, using climate variables that influence climate change. The techniques involved in a data mining exercise vary according to the data structures. The multivariate analysis strategy considered here involved choosing an appropriate tool to analyze a process. Factor analysis is introduced into the data mining technique in order to reveal the influence of the factors involved as well as to address multicollinearity among the variables. The temporal nature and multidimensionality of the target variables are revealed in the model using multidimensional regression estimates. A strategy of integrating several statistical techniques, using climate variables in Nigeria, was employed. An R² of 0.518 was obtained from the ordinary least squares regression analysis, and the test was not significant at the 5% level of significance. However, the factor analysis regression strategy gave a good fit, with an R² of 0.811, and the test was significant at the 5% level of significance. Based on this study, model building should go beyond the usual confirmatory data analysis (CDA); rather, it should be complemented with exploratory data analysis (EDA) in order to achieve the desired result.

  8. Classification of Malaysia aromatic rice using multivariate statistical analysis

    Energy Technology Data Exchange (ETDEWEB)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A. [School of Mechatronic Engineering, Universiti Malaysia Perlis, Kampus Pauh Putra, 02600 Arau, Perlis (Malaysia); Omar, O. [Malaysian Agriculture Research and Development Institute (MARDI), Persiaran MARDI-UPM, 43400 Serdang, Selangor (Malaysia)

    2015-05-15

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
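
    The combination of K-Nearest Neighbours with Leave-One-Out validation described in this abstract can be sketched in a few lines. The sketch assumes two-dimensional feature vectors (for example, two principal component scores of the sensor responses) and uses made-up samples and variety labels; it is not the authors' implementation.

```python
# Illustrative sketch: KNN classification evaluated by leave-one-out
# validation. Feature vectors and labels are made up.
import math
from collections import Counter

def knn_predict(train, query, k):
    """train: list of (features, label); majority vote among the k nearest."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def leave_one_out_error(samples, k=3):
    """Hold out each sample once, classify it from the rest, return error rate."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if knn_predict(train, features, k) != label:
            errors += 1
    return errors / len(samples)

# Hypothetical e-nose feature vectors for two rice varieties.
samples = [
    ((1.0, 1.1), "variety_A"), ((1.2, 0.9), "variety_A"),
    ((0.9, 1.0), "variety_A"), ((1.1, 1.2), "variety_A"),
    ((3.0, 3.1), "variety_B"), ((3.2, 2.9), "variety_B"),
    ((2.9, 3.0), "variety_B"), ((3.1, 3.2), "variety_B"),
]

loo_error = leave_one_out_error(samples, k=3)
print(loo_error)
```

    Leave-one-out validation is a natural fit for small sample sets such as this one, since every sample is used for testing exactly once while the remaining samples serve as training data.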

  9. Classification of Malaysia aromatic rice using multivariate statistical analysis

    Science.gov (United States)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-05-01

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.

  10. Classification of Malaysia aromatic rice using multivariate statistical analysis

    International Nuclear Information System (INIS)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-01-01

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. Consumers prefer these varieties for criteria such as shape, colour, distinctive aroma and flavour. Aromatic rice is priced higher than ordinary rice because it needs special growth conditions, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, become fatigued as the number of samples increases, and give inconsistent results. GC-MS analysis, on the other hand, requires detailed procedures, takes a long time and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties based on the odour of the samples. The samples were taken from the rice varieties. The instrument uses multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to recognize and classify unspecified samples. Visual inspection of the PCA and LDA plots shows that the instrument was able to separate the samples into distinct clusters. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of aromatic rice varieties.

  11. An Application of Multivariate Statistical Analysis for Query-Driven Visualization

    Energy Technology Data Exchange (ETDEWEB)

    Gosink, Luke J. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Garth, Christoph [Univ. of California, Davis, CA (United States); Anderson, John C. [Univ. of California, Davis, CA (United States); Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Joy, Kenneth I. [Univ. of California, Davis, CA (United States)

    2011-03-01

    Driven by the ability to generate ever-larger, increasingly complex data, there is an urgent need in the scientific community for scalable analysis methods that can rapidly identify salient trends in scientific data. Query-Driven Visualization (QDV) strategies are among the small subset of techniques that can address both large and highly complex datasets. This paper extends the utility of QDV strategies with a statistics-based framework that integrates non-parametric distribution estimation techniques with a new segmentation strategy to visually identify statistically significant trends and features within the solution space of a query. In this framework, query distribution estimates help users to interactively explore their query's solution and visually identify the regions where the combined behavior of constrained variables is most important, statistically, to their inquiry. Our new segmentation strategy extends the distribution estimation analysis by visually conveying the individual importance of each variable to these regions of high statistical significance. We demonstrate the analysis benefits these two strategies provide and show how they may be used to facilitate the refinement of constraints over variables expressed in a user's query. We apply our method to datasets from two different scientific domains to demonstrate its broad applicability.

  12. Review of the Statistical Techniques in Medical Sciences | Okeh ...

    African Journals Online (AJOL)

    ... medical researcher in selecting the appropriate statistical techniques. Of course, all statistical techniques have certain underlying assumptions, which must be checked before the technique is applied. Keywords: Variable, Prospective Studies, Retrospective Studies, Statistical significance. Bio-Research Vol. 6 (1) 2008: pp.

  13. Multivariate Statistical Analysis of Water Quality data in Indian River Lagoon, Florida

    Science.gov (United States)

    Sayemuzzaman, M.; Ye, M.

    2015-12-01

    The Indian River Lagoon, part of the longest barrier island complex in the United States, is a region of particular concern to environmental scientists because of the rapid rate of human development throughout the region and its geographical position between the colder temperate zone and the warmer sub-tropical zone. Surface water quality analysis in this region therefore continually yields new information. In the present study, multivariate statistical procedures were applied to analyze spatial and temporal water quality in the Indian River Lagoon over the period 1998-2013. Twelve parameters were analyzed at twelve key water monitoring stations in and beside the lagoon using monthly datasets (a total of 27,648 observations). The dataset was treated using cluster analysis (CA), principal component analysis (PCA) and non-parametric trend analysis. CA was used to cluster the twelve monitoring stations into four groups, with stations having similar surrounding characteristics falling in the same group. PCA was then applied to each group to find the important water quality parameters. Principal components (PCs) PC1 to PC5 were retained, based on explained cumulative variances of 75% to 85% in each cluster group. Nutrient species (phosphorus and nitrogen), salinity, specific conductivity and erosion factors (TSS, turbidity) were the major variables involved in the construction of the PCs. Statistically significant positive or negative trends and abrupt trend shifts were detected by applying the Mann-Kendall trend test and the Sequential Mann-Kendall (SQMK) test to each individual station for the important water quality parameters. Land use and land cover change patterns, local anthropogenic activities and extreme climate events such as drought might be associated with these trends. This study presents a multivariate statistical assessment in order to obtain better information about the quality of surface water. Thus, effective pollution control/management of the surface
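
    The Mann-Kendall trend test named in this abstract can be sketched as follows. This is the standard (non-sequential) test with a tie correction and a normal approximation, which is usually considered reasonable for series of about ten or more points. The function name and the sample series are assumptions, and nothing here reproduces the study's data.

```python
import math
from collections import Counter

def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test."""
    n = len(series)
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, corrected for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p

if __name__ == "__main__":
    # Hypothetical monthly values with an upward drift (invented data).
    values = [2.1, 2.3, 2.2, 2.6, 2.8, 2.7, 3.0, 3.1, 3.3, 3.2]
    print(mann_kendall(values))
```

    A positive Z indicates an increasing trend and a negative Z a decreasing one. The sequential variant (SQMK) used to locate abrupt shifts is not covered by this sketch.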

  14. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models

    NARCIS (Netherlands)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A.; van t Veld, Aart A.

    2012-01-01

    PURPOSE: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator

  15. Multivariate Statistical Analysis: a tool for groundwater quality assessment in the hidrogeologic region of the Ring of Cenotes, Yucatan, Mexico.

    Science.gov (United States)

    Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.

    2014-12-01

    The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted; protecting this resource is therefore important because it is the only source of potable water for the entire State. Through the assessment of groundwater quality we can gain knowledge about the main processes governing water chemistry, as well as spatial patterns that are important for establishing protection zones. In this work, multivariate statistical techniques are used to assess the groundwater quality of supply wells (30 to 40 meters deep) in the hydrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied to groundwater chemistry data from the study area. The principal component analysis shows that the main sources of variation in the data are sea water intrusion, the interaction of the water with the carbonate rocks of the system, and some pollution processes. The cluster analysis shows that the data can be divided into four clusters. The spatial distribution of the clusters seems to be random, but is consistent with sea water intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied to the groundwater quality assessment of this karstic aquifer.

  16. Dating and classification of Syrian excavated pottery from Tell Saka Site, by means of thermoluminescence analysis, and multivariate statistical methods, based on PIXE analysis

    International Nuclear Information System (INIS)

    Bakraji, E.H.; Ahmad, M.; Salman, N.; Haloum, D.; Boutros, N.; Abboud, R.

    2011-01-01

    Thermoluminescence (TL) dating and Proton Induced X-ray Emission (PIXE) techniques have been used to study archaeological pottery fragments from the Tell Saka site, located 25 km southeast of Damascus, Syria. Four samples were chosen randomly from the site, two from the third level and two from the fourth level, for dating by the TL technique; the results were in good agreement with the date assigned by archaeologists. Twenty-eight sherds were analyzed by PIXE, using a 3 MV tandem accelerator in Damascus, to identify and characterize the elemental composition of pottery excavated from the third and fourth levels. The analysis provided almost 20 elements (Na, Mg, Al, Si, P, S, K, Ca, Ti, Mn, Fe, Co, Ni, Cu, Zn, Rb, Sr, Y, Zr, Nb). However, only 14 elements (K, Ca, Ti, Mn, Fe, Co, Ni, Cu, Zn, Rb, Sr, Y, Zr, Nb) were chosen for statistical analysis and processed using two multivariate statistical methods, cluster analysis and factor analysis. The studied pottery was classified into two well-defined groups. (author)

  17. Projection operator techniques in nonequilibrium statistical mechanics

    International Nuclear Information System (INIS)

    Grabert, H.

    1982-01-01

    This book is an introduction to the application of the projection operator technique to the statistical mechanics of irreversible processes. After a general introduction to the projection operator technique and statistical thermodynamics the Fokker-Planck and the master equation approach are described together with the response theory. Then, as applications the damped harmonic oscillator, simple fluids, and the spin relaxation are considered. (HSI)

  18. Multivariate statistical analysis - an application to lunar materials

    International Nuclear Information System (INIS)

    Deb, M.

    1978-01-01

    The compositional characteristics of clinopyroxenes and spinels, two minerals considered to be very useful in deciphering lunar history, have been studied using the multivariate statistical method of principal component analysis. The mineral-chemical data used are from certain lunar rocks and fines collected by the Apollo 11, 12, 14 and 15 and Luna 16 and 20 missions, representing mainly the mare basalts and also non-mare basalts, breccia and rock fragments from the highland regions, in which a large number of these minerals have been analyzed. The correlations noted in the mineral compositions, indicating substitutional relationships, have been interpreted on the basis of available crystal-chemical and petrological information. Compositional trends for individual specimens have been delineated and compared by producing "principal latent vector diagrams". The percent variance of the principal components, denoted by the eigenvalues, has been evaluated in terms of the crystallization history of the samples. Some of the major petrogenetic implications of this study concern the role of early formed cumulate phases in the near-surface fractionation of mare basalts, mixing of mineral compositions in the highland regolith and the subsolidus reduction trends in lunar spinels. (auth.)
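
    The link between eigenvalues and the percent variance of principal components can be made concrete with a small sketch. Each component's share of the variance is its eigenvalue divided by the sum of all eigenvalues. The eigenvalues below are hypothetical, not values from the lunar data, and both function names are assumptions.

```python
def percent_variance(eigenvalues):
    """Percent of total variance carried by each component, largest first."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in sorted(eigenvalues, reverse=True)]

def components_needed(eigenvalues, threshold_percent):
    """Smallest number of leading components whose cumulative share
    of the variance reaches the given threshold (in percent)."""
    cumulative = 0.0
    for k, pct in enumerate(percent_variance(eigenvalues), start=1):
        cumulative += pct
        if cumulative >= threshold_percent:
            return k
    return len(eigenvalues)

if __name__ == "__main__":
    # Hypothetical eigenvalues of a covariance (or correlation) matrix.
    eigs = [4.0, 2.0, 1.0, 1.0]
    print(percent_variance(eigs))
    print(components_needed(eigs, 75.0))
```

    The same calculation underlies rules that retain components until a chosen cumulative variance is reached.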

  19. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    Science.gov (United States)

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  20. A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses

    OpenAIRE

    Buttigieg, Pier Luigi; Ramette, Alban Nicolas

    2014-01-01

    The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynami...

  1. Essentials of multivariate data analysis

    CERN Document Server

    Spencer, Neil H

    2013-01-01

    "… this text provides an overview at an introductory level of several methods in multivariate data analysis. It contains in-depth examples from one data set woven throughout the text, and a free [Excel] Add-In to perform the analyses in Excel, with step-by-step instructions provided for each technique. … could be used as a text (possibly supplemental) for courses in other fields where researchers wish to apply these methods without delving too deeply into the underlying statistics." -The American Statistician, February 2015

  2. Quantitative Evaluation of Hybrid Aspen Xylem and Immunolabeling Patterns Using Image Analysis and Multivariate Statistics

    Directory of Open Access Journals (Sweden)

    David Sandquist

    2015-06-01

    A new method is presented for quantitative evaluation of hybrid aspen genotype xylem morphology and immunolabeling micro-distribution. This method can be used as an aid in assessing differences in genotypes from classic tree breeding studies, as well as genetically engineered plants. The method is based on image analysis, multivariate statistical evaluation of light, and immunofluorescence microscopy images of wood xylem cross sections. The selected immunolabeling antibodies targeted five different epitopes present in aspen xylem cell walls. Twelve down-regulated hybrid aspen genotypes were included in the method development. The 12 knock-down genotypes were selected based on pre-screening by pyrolysis-IR of global chemical content. The multivariate statistical evaluations successfully identified comparative trends for modifications in the down-regulated genotypes compared to the unmodified control, even when no definitive conclusions could be drawn from individual studied variables alone. Of the 12 genotypes analyzed, three genotypes showed significant trends for modifications in both morphology and immunolabeling. Six genotypes showed significant trends for modifications in either morphology or immunocoverage. The remaining three genotypes did not show any significant trends for modification.

  3. Using support vector machines in the multivariate state estimation technique

    International Nuclear Information System (INIS)

    Zavaljevski, N.; Gross, K.C.

    1999-01-01

    One approach to validate nuclear power plant (NPP) signals makes use of pattern recognition techniques. This approach often assumes that there is a set of signal prototypes that are continuously compared with the actual sensor signals. These signal prototypes are often computed based on empirical models with little or no knowledge about physical processes. A common problem of all data-based models is their limited ability to make predictions on the basis of available training data. Another problem is related to suboptimal training algorithms. Both of these potential shortcomings with conventional approaches to signal validation and sensor operability validation are successfully resolved by adopting a recently proposed learning paradigm called the support vector machine (SVM). The work presented here is a novel application of SVM for data-based modeling of system state variables in an NPP, integrated with a nonlinear, nonparametric technique called the multivariate state estimation technique (MSET), an algorithm developed at Argonne National Laboratory for a wide range of nuclear plant applications

  4. Evaluation of strategies to promote learning using ICT: the case of a course on Topics of Multivariate Statistics

    Directory of Open Access Journals (Sweden)

    Mario Miguel Ojeda Ramírez

    2017-01-01

    Currently, some teachers implement different methods to promote education linked to reality and to provide more effective training and meaningful learning. Active methods aim to increase motivation and create scenarios in which student participation is central to achieving more meaningful learning. This paper reports on the implementation of a process of educational innovation in the course Topics of Multivariate Statistics, offered in the degree in Statistical Sciences and Techniques at the Universidad Veracruzana (Mexico). The strategies used, such as data collection, project design and development, and individual and group presentations, are described. The information and communication technologies (ICT) used are EMINUS, the distributed education platform of the Universidad Veracruzana, and file management with Dropbox, plus communication via WhatsApp. The R software was used for statistical analysis and for making presentations in academic forums. To explore students' perceptions, in-depth interviews were conducted and indicators for evaluating student satisfaction were defined; the results show positive evidence, indicating that students were satisfied with the way the course was designed and implemented. They also stated that they feel able to apply what they have learned. In their opinion, these strategies made them feel prepared for their professional life. Finally, some suggestions for improving the course in future editions are included.

  5. Multivariate statistical assessments of greenhouse-gas-induced climatic change and comparison with results from general circulation models

    International Nuclear Information System (INIS)

    Schoenwiese, C.D.

    1990-01-01

    Based on univariate correction and coherence analyses, including techniques moving in time, and taking account of the physical basis of the relationships, a simple multivariate concept is presented which correlates observational climatic time series simultaneously with solar, volcanic, ENSO (El Nino/Southern Oscillation) and anthropogenic greenhouse-gas forcing. The climatic elements considered are air temperature (near the ground and in the stratosphere), sea surface temperature, sea level and precipitation, and cover at least the period 1881-1980 (stratospheric temperature only since 1960). The climate signal assessments which may be hypothetically attributed to the observed increase in CO2 or equivalent CO2 (implying additional greenhouse gases) are compared with those resulting from GCM experiments. In the case of Northern Hemisphere air temperature, these comparisons are performed not only with respect to hemispheric and global means, but also with respect to regional and seasonal patterns. Autocorrelations and phase shifts of the climate response to natural and anthropogenic forcing complicate the statistical assessments.

  6. Quality characterization and pollution source identification of surface water using multivariate statistical techniques, Nalagarh Valley, Himachal Pradesh, India

    Science.gov (United States)

    Herojeet, Rajkumar; Rishi, Madhuri S.; Lata, Renu; Dolma, Konchok

    2017-09-01

    multivariate techniques for reliable quality characterization of surface water quality to develop effective pollution reduction strategies and maintain a fine balance between the industrialization and ecological integrity.

  7. Multivariate statistical approximation of the in situ gamma-ray spectrometry of the State of Zacatecas, Mexico

    International Nuclear Information System (INIS)

    Lopez I, J. F.; Rios M, C.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J.L.

    2017-09-01

    The evaluation of environmental radioactivity is a key point in the assessment of environmental quality. Through it, one can find possible radioactive contamination, locate possible Uranium and Thorium deposits and evaluate the primordial isotope concentration due to human activities. A radioactive map of the State of Zacatecas, Mexico, is under construction based on in situ gamma-ray spectrometry. The present work reports the results of the multivariate statistical approximation of the measured activity data. Based on Pearson correlation, the 228Ac and 208Tl activities are statistically significant, while the 214Bi and 214Pb activities are not statistically significant. This may be due to the presence or absence of secular equilibrium in the Thorium and Uranium series. (Author)
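
    As a rough illustration of a Pearson-correlation significance check, the sketch below computes the correlation coefficient r for paired measurements and the associated t statistic with n - 2 degrees of freedom. It assumes, without confirmation from the abstract, that significance was judged on correlations between paired activity measurements. The function names and the numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom.
    Undefined for |r| == 1 (division by zero)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

if __name__ == "__main__":
    # Hypothetical paired activity readings at five sites (invented data).
    a = [1.0, 2.0, 3.0, 4.0, 5.0]
    b = [2.0, 4.0, 5.0, 4.0, 5.0]
    r = pearson_r(a, b)
    print(r, t_statistic(r, len(a)))
```

    The t value would then be compared with a critical value from a Student t table at the chosen significance level.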

  8. Determination of geographic provenance of cotton fibres using multi-isotope profiles and multivariate statistical analysis

    Science.gov (United States)

    Daeid, N. Nic; Meier-Augenstein, W.; Kemp, H. F.

    2012-04-01

    The analysis of cotton fibres can be particularly challenging within a forensic science context where discrimination of one fibre from another is of importance. Normally cotton fibre analysis examines the morphological structure of the recovered material and compares this with that of a known fibre from a particular source of interest. However, the conventional microscopic and chemical analysis of fibres and any associated dyes is generally unsuccessful because of the similar morphology of the fibres. Analysis of the dyes which may have been applied to the cotton fibre can also be undertaken though this can be difficult and unproductive in terms of discriminating one fibre from another. In the study presented here we have explored the potential for Isotope Ratio Mass Spectrometry (IRMS) to be utilised as an additional tool for cotton fibre analysis in an attempt to reveal further discriminatory information. This work has concentrated on un-dyed cotton fibres of known origin in order to expose the potential of the analytical technique. We report the results of a pilot study aimed at testing the hypothesis that multi-element stable isotope analysis of cotton fibres in conjunction with multivariate statistical analysis of the resulting isotopic abundance data using well established chemometric techniques permits sample provenancing based on the determination of where the cotton was grown and as such will facilitate sample discrimination. To date there is no recorded literature of this type of application of IRMS to cotton samples, which may be of forensic science relevance.

  9. Integration of ecological indices in the multivariate evaluation of an urban inventory of street trees

    Science.gov (United States)

    J. Grabinsky; A. Aldama; A. Chacalo; H. J. Vazquez

    2000-01-01

    Inventory data of Mexico City's street trees were studied using classical statistical arboricultural and ecological statistical approaches. Multivariate techniques were applied to both. Results did not differ substantially and were complementary. It was possible to reduce inventory data and to group species, boroughs, blocks, and variables.

  10. The intervals method: a new approach to analyse finite element outputs using multivariate statistics

    Directory of Open Access Journals (Sweden)

    Jordi Marcé-Nogué

    2017-10-01

    Background: In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods: The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results: Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion: We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches.
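
    The variable-generation step described under Methods can be sketched as below: each finite element contributes its area to the stress interval its value falls in, and each interval is reported as a percentage of the total area. This is an illustrative reading of the abstract, not the authors' code. The function name, the handling of interval boundaries and the sample values are assumptions.

```python
from bisect import bisect_right

def interval_variables(stresses, areas, thresholds):
    """Percent of total area falling in each stress interval.

    `thresholds` must be sorted ascending: t1 < t2 < ... < tk.
    Assumed intervals: (-inf, t1), [t1, t2), ..., [tk, +inf),
    giving k + 1 variables.
    """
    if len(stresses) != len(areas):
        raise ValueError("stresses and areas must have the same length")
    total_area = sum(areas)
    area_per_interval = [0.0] * (len(thresholds) + 1)
    for stress, area in zip(stresses, areas):
        # bisect_right returns the index of the interval containing `stress`.
        area_per_interval[bisect_right(thresholds, stress)] += area
    return [100.0 * a / total_area for a in area_per_interval]

if __name__ == "__main__":
    # Hypothetical element stresses (MPa) and element areas (mm^2).
    stresses = [0.5, 1.5, 2.5, 3.5]
    areas = [1.0, 1.0, 2.0, 4.0]
    print(interval_variables(stresses, areas, thresholds=[1.0, 2.0, 3.0]))
```

    Weighting by element area rather than counting elements is one way the resulting variables could stay comparable across meshes of different density.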

  11. The intervals method: a new approach to analyse finite element outputs using multivariate statistics

    Science.gov (United States)

    De Esteban-Trivigno, Soledad; Püschel, Thomas A.; Fortuny, Josep

    2017-01-01

    Background: In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods: The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results: Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion: We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches. PMID:29043107

  12. Multivariate analysis methods in physics

    International Nuclear Information System (INIS)

    Wolter, M.

    2007-01-01

    A review of multivariate methods based on statistical training is given. Several multivariate methods useful in high-energy physics analysis are discussed. Selected examples from current research in particle physics are discussed, both from the on-line trigger selection and from the off-line analysis. Statistical training methods are also presented and some new applications are suggested.

  13. Introductory statistical inference

    CERN Document Server

    Mukhopadhyay, Nitis

    2014-01-01

    This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist

  14. Multivariate statistical approximation of the in situ gamma-ray spectrometry of the State of Zacatecas, Mexico

    Energy Technology Data Exchange (ETDEWEB)

    Lopez I, J. F.; Rios M, C.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J.L., E-mail: fernandolf498@gmail.com [Universidad Autonoma de Zacatecas, Unidad Academica de Estudios Nucleares, Cipres No. 10, Fracc. La Penuela, 98060 Zacatecas, Zac. (Mexico)

    2017-09-15

    The evaluation of environmental radioactivity is a key point in the assessment of environmental quality. Through it, one can find possible radioactive contamination, locate possible Uranium and Thorium deposits and evaluate the primordial isotope concentration due to human activities. A radioactive map of the State of Zacatecas, Mexico, is under construction based on in situ gamma-ray spectrometry. The present work reports the results of the multivariate statistical approximation of the measured activity data. Based on Pearson correlation, the 228Ac and 208Tl activities are statistically significant, while the 214Bi and 214Pb activities are not statistically significant. This may be due to the presence or absence of secular equilibrium in the Thorium and Uranium series. (Author)

  15. Tuberous root characteristics of sweet potato clones using multivariate techniques for selection of superior genotypes

    Directory of Open Access Journals (Sweden)

    Jackson da Silva

    2018-01-01

    The objective of this study was to evaluate the tuberous root characteristics of sweet potato clones using multivariate techniques for the selection of superior genotypes. The research was carried out in the experimental area of the Plant Genetic Breeding Sector of the Agrarian Sciences Center of the Federal University of Alagoas (SMGP/CECA/UFAL). Forty-four new clones originating from half-sibling and full-sibling progenies, in addition to the cultivar Sergipana Vermelha, were evaluated in rows 5 m long at a spacing of 1.0 m x 0.5 m, for a total area of 5 m²/clone. Harvest took place 120 days after planting the branches, and the following were evaluated: production of non-commercial tuberous roots (PRTNC), production of commercial tuberous roots (PRTC), production of tuberous roots (PTRT), total number of tuberous roots (NTRT), average weight of commercial tuberous roots (PMRTC), predominant color of the tuberous root skin (CPPERT) and predominant color of the tuberous root pulp (CPPORT). Descriptive statistics, correlation analysis and principal component analysis were used. Clones 23, 36, 17 and 37 presented interesting agronomic characteristics and are recommended for cultivation. In the principal component analysis, the variables PTRT and PRTC were of greatest importance, indicating that they discriminate the clones satisfactorily.

  16. Comparative urine analysis by liquid chromatography-mass spectrometry and multivariate statistics : Method development, evaluation, and application to proteinuria

    NARCIS (Netherlands)

    Kemperman, Ramses F. J.; Horvatovich, Peter L.; Hoekman, Berend; Reijmers, Theo H.; Muskiet, Frits A. J.; Bischoff, Rainer

    2007-01-01

    We describe a platform for the comparative profiling of urine using reversed-phase liquid chromatography-mass spectrometry (LC-MS) and multivariate statistical data analysis. Urinary compounds were separated by gradient elution and subsequently detected by electrospray Ion-Trap MS. The lower limit

  17. Multivariate Statistical Analysis of Orthogonal Mass Spectral Data for the Identification of Chemical Attribution Signatures of 3-Methylfentanyl

    Energy Technology Data Exchange (ETDEWEB)

    Mayer, B. P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Valdez, C. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); DeHope, A. J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Spackman, P. E. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sanner, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Martinez, H. P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Williams, A. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-11-28

    Critical to many modern forensic investigations is the chemical attribution of the origin of an illegal drug. This process greatly relies on identification of compounds indicative of its clandestine or commercial production. The results of these studies can yield detailed information on method of manufacture, sophistication of the synthesis operation, starting material source, and final product. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic 3-methylfentanyl, N-(3-methyl-1-phenethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods were studied in an effort to identify and classify route-specific signatures. These methods were chosen to minimize the use of scheduled precursors, complicated laboratory equipment, number of overall steps, and demanding reaction conditions. Using gas and liquid chromatographies combined with mass spectrometric methods (GC-QTOF and LC-QTOF) in conjunction with inductively coupled plasma mass spectrometry (ICP-MS), over 240 distinct compounds and elements were monitored. As seen in our previous work with CAS of fentanyl synthesis, the complexity of the resultant data matrix necessitated the use of multivariate statistical analysis. Using partial least squares discriminant analysis (PLS-DA), 62 statistically significant, route-specific CAS were identified. Statistical classification models using a variety of machine learning techniques were then developed with the ability to predict the method of 3-methylfentanyl synthesis from three blind crude samples generated by synthetic chemists without prior experience with these methods.

  18. Statistical evaluation of vibration analysis techniques

    Science.gov (United States)

    Milner, G. Martin; Miller, Patrice S.

    1987-01-01

    An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.
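
    The detection-performance measure named in this abstract, probability of detection versus probability of false alarm, can be illustrated with a minimal sketch. The detector scores and thresholds below are hypothetical and are not taken from the report; the function name `detection_rates` is likewise an assumption made for illustration.

```python
def detection_rates(scores_fault, scores_healthy, threshold):
    """Return (probability of detection, probability of false alarm).

    A case is flagged when its detector score exceeds the threshold.
    """
    p_detect = sum(s > threshold for s in scores_fault) / len(scores_fault)
    p_false = sum(s > threshold for s in scores_healthy) / len(scores_healthy)
    return p_detect, p_false


# Hypothetical detector outputs for faulty and healthy machinery runs.
fault = [0.9, 0.8, 0.75, 0.6, 0.4]
healthy = [0.1, 0.2, 0.3, 0.5, 0.65]

# Sweeping the threshold traces out the detection / false-alarm trade-off.
for thr in (0.25, 0.55, 0.7):
    pd_, pfa = detection_rates(fault, healthy, thr)
    print(f"threshold={thr:.2f}  Pd={pd_:.2f}  Pfa={pfa:.2f}")
```

    Lowering the threshold raises both rates together, which is why the two are usually reported as a curve rather than as a single pair of numbers.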

  19. PIXE-quantified AXSIA: Elemental mapping by multivariate spectral analysis

    International Nuclear Information System (INIS)

    Doyle, B.L.; Provencio, P.P.; Kotula, P.G.; Antolak, A.J.; Ryan, C.G.; Campbell, J.L.; Barrett, K.

    2006-01-01

    Automated, nonbiased, multivariate statistical analysis techniques are useful for converting very large amounts of data into a smaller, more manageable number of chemical components (spectra and images) that are needed to describe the measurement. We report the first use of the multivariate spectral analysis program AXSIA (Automated eXpert Spectral Image Analysis) developed at Sandia National Laboratories to quantitatively analyze micro-PIXE data maps. AXSIA implements a multivariate curve resolution technique that reduces the spectral image data sets into a limited number of physically realizable and easily interpretable components (including both spectra and images). We show that the principal component spectra can be further analyzed using conventional PIXE programs to convert the weighting images into quantitative concentration maps. A common elemental data set has been analyzed using three different PIXE analysis codes and the results compared to the cases when each of these codes is used to separately analyze the associated AXSIA principal component spectral data. We find that these comparisons are in good quantitative agreement with each other

  20. Southeast Atlantic Cloud Properties in a Multivariate Statistical Model - How Relevant is Air Mass History for Local Cloud Properties?

    Science.gov (United States)

    Fuchs, Julia; Cermak, Jan; Andersen, Hendrik

    2017-04-01

    This study aims at untangling the impacts of external dynamics and local conditions on cloud properties in the Southeast Atlantic (SEA) by combining satellite and reanalysis data using multivariate statistics. The understanding of clouds and their determinants at different scales is important for constraining the Earth's radiative budget, and thus prominent in climate-system research. In this study, SEA stratocumulus cloud properties are observed not only as the result of local environmental conditions but also as affected by external dynamics and spatial origins of air masses entering the study area. In order to assess to what extent cloud properties are impacted by aerosol concentration, air mass history, and meteorology, a multivariate approach is conducted using satellite observations of aerosol and cloud properties (MODIS, SEVIRI), information on aerosol species composition (MACC) and meteorological context (ERA-Interim reanalysis). To account for the often-neglected but important role of air mass origin, information on air mass history based on HYSPLIT modeling is included in the statistical model. This multivariate approach is intended to lead to a better understanding of the physical processes behind observed stratocumulus cloud properties in the SEA.

  1. The outlier sample effects on multivariate statistical data processing geochemical stream sediment survey (Moghangegh region, North West of Iran)

    International Nuclear Information System (INIS)

    Ghanbari, Y.; Habibnia, A.; Memar, A.

    2009-01-01

    In a geochemical stream sediment survey of the Moghangegh region in north-west Iran (1:50,000 sheet), 152 samples were collected. After analysis and data processing, the Yb, Sc, Ni, Li, Eu, Cd, Co and As contents of one sample were found to be far higher than those of the other samples. Once this sample had been identified as an outlier, its destructive effect on multivariate statistical data processing in geochemical exploration was investigated. Pearson and Spearman correlation coefficients and cluster analysis were used for the multivariate studies; scatter plots of some element pairs, together with the regression profiles, are given for both the 152-sample and 151-sample cases, and the results are compared. The investigation showed that, when an outlier sample is present, the apparent relations between elements fall into three cases: - a true relation between two elements, neither of which has an outlying value in the outlier sample; - a false relation between two elements, one of which has an outlying value in the outlier sample; - a completely false relation between two elements, both of which have outlying values in the outlier sample
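
    The effect described here, a single extreme sample creating an apparent relation between two elements, can be reproduced with a small sketch. The concentrations below are invented for illustration and are not the survey data; the helper names are assumptions as well.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical contents of two elements in eight ordinary samples.
elem_a = [1, 2, 3, 4, 5, 6, 7, 8]
elem_b = [5, 3, 6, 2, 7, 1, 8, 4]

# The same data with one sample that is extreme in both elements.
elem_a_out = elem_a + [100]
elem_b_out = elem_b + [100]

print("without outlier:", pearson(elem_a, elem_b), spearman(elem_a, elem_b))
print("with outlier:   ", pearson(elem_a_out, elem_b_out),
      spearman(elem_a_out, elem_b_out))
```

    In this toy data the Pearson coefficient jumps from weak to nearly perfect once the extreme sample is added, while the rank-based Spearman coefficient moves much less. That is one reason to compare the two coefficients with and without the suspect sample.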

  2. Application of instrumental neutron activation analysis and multivariate statistical methods to archaeological Syrian ceramics

    International Nuclear Information System (INIS)

    Bakraji, E. H.; Othman, I.; Sarhil, A.; Al-Somel, N.

    2002-01-01

    Instrumental neutron activation analysis (INAA) has been utilized in the analysis of thirty-seven archaeological ceramic fragment samples collected from the Tal Al-Wardiate site, Missiaf town, Hamma city, Syria. Thirty-six chemical elements were determined. The elemental concentrations were processed using two multivariate statistical methods, cluster analysis and factor analysis, in order to determine similarities and correlations between the various samples. Factor analysis confirms that the samples were correctly classified by cluster analysis. The results showed that the samples can be considered to have been manufactured from three different sources of raw material. (author)

  3. Provenance Study of Archaeological Ceramics from Syria Using XRF Multivariate Statistical Analysis and Thermoluminescence Dating

    OpenAIRE

    Bakraji, Elias Hanna; Abboud, Rana; Issa, Haissm

    2014-01-01

    Thermoluminescence (TL) dating and multivariate statistical methods based on radioisotope X-ray fluorescence analysis have been utilized to date and classify Syrian archaeological ceramic fragments from the Tel Jamous site. 54 samples were analyzed by radioisotope X-ray fluorescence; 51 of them come from the Tel Jamous archaeological site in the Sahel Akkar region, Syria, and fairly represent ceramics belonging to the Middle Bronze Age (2150 to 1600 B.C.), and the remaining three samples come from Mar-T...

  4. Multivariate Analysis Techniques for Optimal Vision System Design

    DEFF Research Database (Denmark)

    Sharifzadeh, Sara

    The present thesis considers optimization of the spectral vision systems used for quality inspection of food items. The relationship between food quality, vision based techniques and spectral signature are described. The vision instruments for food analysis as well as datasets of the food items...... used in this thesis are described. The methodological strategies are outlined including sparse regression and pre-processing based on feature selection and extraction methods, supervised versus unsupervised analysis and linear versus non-linear approaches. One supervised feature selection algorithm...... (SSPCA) and DCT based characterization of the spectral diffused reflectance images for wavelength selection and discrimination. These methods together with some other state-of-the-art statistical and mathematical analysis techniques are applied on datasets of different food items; meat, dairies, fruits...

  5. One Hundred Ways to be Non-Fickian - A Rigorous Multi-Variate Statistical Analysis of Pore-Scale Transport

    Science.gov (United States)

    Most, Sebastian; Nowak, Wolfgang; Bijeljic, Branko

    2015-04-01

    Fickian transport in groundwater flow is the exception rather than the rule. Transport in porous media is frequently simulated via particle methods (i.e. particle tracking random walk (PTRW) or continuous time random walk (CTRW)). These methods formulate transport as a stochastic process of particle position increments. At the pore scale, geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Hence, it is important to get a better understanding of the processes at pore scale. For our analysis we track the positions of 10,000 particles migrating through the pore space over time. The data we use come from micro CT scans of a homogeneous sandstone and encompass about 10 grain sizes. Based on those images we discretize the pore structure and simulate flow at the pore scale based on the Navier-Stokes equation. This flow field realistically describes flow inside the pore space and we do not need to add artificial dispersion during the transport simulation. Next, we use particle tracking random walk and simulate pore-scale transport. Finally, we use the obtained particle trajectories to do a multivariate statistical analysis of the particle motion at the pore scale. Our analysis is based on copulas. Every multivariate joint distribution is a combination of its univariate marginal distributions. The copula represents the dependence structure of those univariate marginals and is therefore useful to observe correlation and non-Gaussian interactions (i.e. non-Fickian transport). The first goal of this analysis is to better understand the validity regions of commonly made assumptions. We are investigating three different transport distances: 1) The distance where the statistical dependence between particle increments can be modelled as an order-one Markov process. This would be the Markovian distance for the process, where

  6. Improved detection of incipient anomalies via multivariate memory monitoring charts: Application to an air flow heating system

    KAUST Repository

    Harrou, Fouzi

    2016-08-11

    Detecting anomalies is important for the reliable operation of many engineering systems. Multivariate statistical monitoring charts are an efficient tool for checking the quality of a process by identifying abnormalities. Principal component analysis (PCA) has been shown to be effective in monitoring processes with highly correlated data. Traditional PCA-based methods, nevertheless, are often relatively inefficient at detecting incipient anomalies. Here, we propose a statistical approach that exploits the advantages of PCA and those of multivariate memory monitoring schemes, such as the multivariate cumulative sum (MCUSUM) and multivariate exponentially weighted moving average (MEWMA) schemes, to better detect incipient anomalies. Memory monitoring charts are sensitive to incipient anomalies in the process mean, which significantly improves the performance of the PCA method and extends its usefulness to a range of applications. The performance of the PCA-based MEWMA and MCUSUM control techniques is demonstrated and compared with that of traditional PCA-based monitoring methods. Using practical data gathered from a heating air-flow system, we demonstrate the greater sensitivity and efficiency of the developed method over the traditional PCA-based methods. Results indicate that the proposed techniques have potential for detecting incipient anomalies in multivariate data. © 2016 Elsevier Ltd
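
    As a rough illustration of why a memory chart responds to small sustained shifts, the sketch below computes the standard MEWMA statistic for a two-variable series. It is not the authors' implementation: it assumes the inputs are already centred (for example, two PCA scores with zero in-control mean), a known in-control covariance, and the asymptotic form of the MEWMA covariance. The smoothing constant and the data are hypothetical.

```python
def mewma_t2(observations, cov, lam=0.2):
    """T^2 statistics of a bivariate MEWMA chart.

    observations: sequence of (x1, x2) pairs, centred on the in-control mean.
    cov: 2x2 in-control covariance matrix as nested lists.
    lam: smoothing constant, 0 < lam <= 1 (smaller means longer memory).
    Uses the asymptotic covariance of the smoothed vector, lam/(2-lam) * cov.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    scale = (2 - lam) / lam  # inverse of the factor lam / (2 - lam)
    inv = [[d / det * scale, -b / det * scale],
           [-c / det * scale, a / det * scale]]
    z = [0.0, 0.0]
    stats = []
    for x in observations:
        # Exponentially weighted average of current and past observations.
        z = [lam * x[0] + (1 - lam) * z[0], lam * x[1] + (1 - lam) * z[1]]
        t2 = (z[0] * (inv[0][0] * z[0] + inv[0][1] * z[1])
              + z[1] * (inv[1][0] * z[0] + inv[1][1] * z[1]))
        stats.append(t2)
    return stats


# Hypothetical small, sustained shift of half a standard deviation.
identity = [[1.0, 0.0], [0.0, 1.0]]
shifted = [(0.5, 0.5)] * 10
print(mewma_t2(shifted, identity))
```

    Because each statistic pools the current and earlier observations, the value keeps growing while the shift persists. An alarm would be raised once it exceeds a control limit chosen for the desired in-control run length; that limit is not specified here.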

  7. A multivariate statistical study with a factor analysis of recent planktonic foraminiferal distribution in the Coromandel Coast of India

    Digital Repository Service at National Institute of Oceanography (India)

    Jayalakshmy, K.V.; Rao, K.K.

    A study of planktonic foraminiferal assemblages from 19 stations in the neritic and oceanic regions off the Coromandel Coast, Bay of Bengal has been made using a multivariate statistical method termed as factor analysis. On the basis of abundance...

  8. Evaluation of multivariate statistical analyses for monitoring and prediction of processes in an seawater reverse osmosis desalination plant

    Energy Technology Data Exchange (ETDEWEB)

    Kolluri, Srinivas Sahan; Esfahani, Iman Janghorban; Garikiparthy, Prithvi Sai Nadh; Yoo, Chang Kyoo [Kyung Hee University, Yongin (Korea, Republic of)

    2015-08-15

    Our aim was to analyze, monitor, and predict the outcomes of processes in a full-scale seawater reverse osmosis (SWRO) desalination plant using multivariate statistical techniques. Multivariate analysis of variance (MANOVA) was used to investigate the performance and efficiencies of two SWRO processes, namely, pore controllable fiber filter-reverse osmosis (PCF-SWRO) and sand filtration-ultra filtration-reverse osmosis (SF-UF-SWRO). Principal component analysis (PCA) was applied to monitor the two SWRO processes. PCA monitoring revealed that the SF-UF-SWRO process could be analyzed reliably with a low number of outliers and disturbances. Partial least squares (PLS) analysis was then conducted to predict which of the seven input parameters of feed flow rate, PCF/SF-UF filtrate flow rate, temperature of feed water, turbidity feed, pH, reverse osmosis (RO) flow rate, and pressure had a significant effect on the outcome variables of permeate flow rate and concentration. Root mean squared errors (RMSEs) of the PLS models for permeate flow rates were 31.5 and 28.6 for the PCF-SWRO process and SF-UF-SWRO process, respectively, while RMSEs of permeate concentrations were 350.44 and 289.4, respectively. These results indicate that the SF-UF-SWRO process can be modeled more accurately than the PCF-SWRO process, because the RMSE values of permeate flow rate and concentration obtained using a PLS regression model of the SF-UF-SWRO process were lower than those obtained for the PCF-SWRO process.
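
    The comparison in this abstract rests on the root mean squared error, RMSE = sqrt(mean((observed - predicted)^2)). A minimal sketch of that formula follows; the function name and the numbers are illustrative assumptions, not plant data.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed and model-predicted values of one output variable.
observed = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 113.0, 118.0, 133.0]
print(rmse(observed, predicted))
```

    RMSE carries the units of the variable being predicted, so the flow-rate and concentration values quoted above can each be compared between the two processes but not with one another.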

  9. Evaluation of multivariate statistical analyses for monitoring and prediction of processes in an seawater reverse osmosis desalination plant

    International Nuclear Information System (INIS)

    Kolluri, Srinivas Sahan; Esfahani, Iman Janghorban; Garikiparthy, Prithvi Sai Nadh; Yoo, Chang Kyoo

    2015-01-01

    Our aim was to analyze, monitor, and predict the outcomes of processes in a full-scale seawater reverse osmosis (SWRO) desalination plant using multivariate statistical techniques. Multivariate analysis of variance (MANOVA) was used to investigate the performance and efficiencies of two SWRO processes, namely, pore controllable fiber filter-reverse osmosis (PCF-SWRO) and sand filtration-ultra filtration-reverse osmosis (SF-UF-SWRO). Principal component analysis (PCA) was applied to monitor the two SWRO processes. PCA monitoring revealed that the SF-UF-SWRO process could be analyzed reliably with a low number of outliers and disturbances. Partial least squares (PLS) analysis was then conducted to predict which of the seven input parameters of feed flow rate, PCF/SF-UF filtrate flow rate, temperature of feed water, turbidity feed, pH, reverse osmosis (RO) flow rate, and pressure had a significant effect on the outcome variables of permeate flow rate and concentration. Root mean squared errors (RMSEs) of the PLS models for permeate flow rates were 31.5 and 28.6 for the PCF-SWRO process and SF-UF-SWRO process, respectively, while RMSEs of permeate concentrations were 350.44 and 289.4, respectively. These results indicate that the SF-UF-SWRO process can be modeled more accurately than the PCF-SWRO process, because the RMSE values of permeate flow rate and concentration obtained using a PLS regression model of the SF-UF-SWRO process were lower than those obtained for the PCF-SWRO process.

  10. Statistical Theory of the Vector Random Decrement Technique

    DEFF Research Database (Denmark)

    Asmussen, J. C.; Brincker, Rune; Ibrahim, S. R.

    1999-01-01

    decays. Due to the speed and/or accuracy of the Vector Random Decrement technique, it was introduced as an attractive alternative to the Random Decrement technique. In this paper, the theory of the Vector Random Decrement technique is extended by applying a statistical description of the stochastic...

  11. Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations

    Science.gov (United States)

    Boccippio, Dennis

    2004-01-01

    Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observations are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing/non-flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. A priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms); we explore the incremental benefit to such diagnosis provided by lightning observations.

  12. Data classification and MTBF prediction with a multivariate analysis approach

    International Nuclear Information System (INIS)

    Braglia, Marcello; Carmignani, Gionata; Frosolini, Marco; Zammori, Francesco

    2012-01-01

    The paper presents a multivariate statistical approach that supports the classification of mechanical components, subjected to specific operating conditions, in terms of the Mean Time Between Failure (MTBF). Assessing the influence of working conditions and/or environmental factors on the MTBF is a prerequisite for the development of an effective preventive maintenance plan. However, this task may be demanding and it is generally performed with ad-hoc experimental methods that lack statistical rigor. To solve this common problem, a step-by-step multivariate data classification technique is proposed. Specifically, a set of structured failure data is classified in a meaningful way by means of: (i) cluster analysis, (ii) multivariate analysis of variance, (iii) feature extraction and (iv) predictive discriminant analysis. This makes it possible not only to define the MTBF of the analyzed components, but also to identify the working parameters that explain most of the variability of the observed data. The approach is finally demonstrated on 126 centrifugal pumps installed in an oil refinery plant; the results obtained demonstrate the quality of the final discrimination, in terms of data classification and failure prediction.

  13. Fault detection of a spur gear using vibration signal with multivariable statistical parameters

    Directory of Open Access Journals (Sweden)

    Songpon Klinchaeam

    2014-10-01

    This paper presents a condition monitoring technique for spur gear fault detection using time-domain vibration signal analysis. Vibration signals were acquired from gearboxes with various simulated faults on the spur gear teeth. The signals were used to monitor four conditions of a spur gear: normal, scuffing defect, crack defect and broken tooth. Statistical parameters of the vibration signal were used to compare and evaluate each fault condition. The technique can be used to set alarm limits for the signal condition based on statistical parameters such as variance, kurtosis, rms and crest factor, which serve as decision boundaries. The results show that vibration signal analysis with a single statistical parameter cannot clearly predict spur gear faults, whereas using at least two statistical parameters clearly separates every fault case. The decision boundaries were set at 99.7% certainty (3σ) from 300 reference datasets; the testing conditions were detected with 99.7% (3σ) accuracy and an error of less than 0.3% using 50 testing datasets.
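
    The four time-domain parameters named in this abstract, and alarm limits placed three standard deviations from a reference mean (about 99.7% coverage under a normal assumption), can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code; the function names and the test signal are assumptions.

```python
import math
import statistics


def time_domain_features(signal):
    """Variance, kurtosis, rms and crest factor of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((s - mean) ** 2 for s in signal) / n
    rms = math.sqrt(sum(s * s for s in signal) / n)
    # Kurtosis as the normalised fourth central moment (3 for a Gaussian).
    kurtosis = (sum((s - mean) ** 4 for s in signal) / n) / variance ** 2
    # Crest factor: peak amplitude relative to the rms level.
    crest_factor = max(abs(s) for s in signal) / rms
    return {"variance": variance, "kurtosis": kurtosis,
            "rms": rms, "crest_factor": crest_factor}


def three_sigma_limits(reference_values):
    """Lower and upper alarm limits at mean -/+ 3 standard deviations."""
    mu = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)
    return mu - 3 * sd, mu + 3 * sd


# Hypothetical signal: one period of a unit-amplitude sine wave.
sine = [math.sin(2 * math.pi * k / 100) for k in range(100)]
print(time_domain_features(sine))
```

    In use, each feature would be computed for every reference recording of the healthy gear, `three_sigma_limits` applied per feature, and a new recording flagged when a feature falls outside its limits. Combining two features, as the abstract recommends, amounts to checking two such limits jointly.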

  14. A PERFORMANCE COMPARISON BETWEEN ARTIFICIAL NEURAL NETWORKS AND MULTIVARIATE STATISTICAL METHODS IN FORECASTING FINANCIAL STRENGTH RATING IN TURKISH BANKING SECTOR

    Directory of Open Access Journals (Sweden)

    MELEK ACAR BOYACIOĞLU

    2013-06-01

    Financial strength rating indicates the fundamental financial strength of a bank. Its aim is to measure a bank's fundamental financial strength excluding external factors, which can stem from the working environment or be linked to outside protective support mechanisms. The evaluation thus seeks a rating of the bank that is free from outside supportive factors. The financial fundamentals, franchise value, variety of assets and working environment of a bank are also evaluated in this context. In this study, a model has been developed to predict the financial strength rating of Turkish banks. The methodology is as follows: selecting the variables to be used in the model, creating a data set, choosing the techniques to be used, and evaluating the classification success of the techniques. It is concluded that the artificial neural network system performs better than multivariate statistical methods in classifying financial strength ratings in the training set. On the other hand, no meaningful difference was found in the validation set, in which the prediction performances of the employed techniques are tested.

  15. Performance of some supervised and unsupervised multivariate techniques for grouping authentic and unauthentic Viagra and Cialis

    Directory of Open Access Journals (Sweden)

    Michel J. Anzanello

    2014-09-01

    A typical application of multivariate techniques in forensic analysis consists of discriminating between authentic and unauthentic samples of seized drugs, in addition to finding similar properties in the unauthentic samples. In this paper, the performance of several methods belonging to two different classes of multivariate techniques, supervised and unsupervised, was compared. The supervised techniques (ST) are k-Nearest Neighbor (KNN), Support Vector Machine (SVM), Probabilistic Neural Networks (PNN) and Linear Discriminant Analysis (LDA); the unsupervised techniques (UT) are k-Means CA and Fuzzy C-Means (FCM). The methods are applied to Fourier transform infrared spectroscopy (FTIR) data from authentic and unauthentic Cialis and Viagra. The FTIR data are also transformed by Principal Components Analysis (PCA) and kernel functions aimed at improving the grouping performance. ST proved to be a more reasonable choice when the analysis is conducted on the original data, while UT led to better results when applied to transformed data.

  16. TECHNIQUE OF THE STATISTICAL ANALYSIS OF INVESTMENT APPEAL OF THE REGION

    Directory of Open Access Journals (Sweden)

    А. А. Vershinina

    2014-01-01

    This article presents a technique for the statistical analysis of the investment appeal of a region for foreign direct investment. The technique is defined, the stages of the analysis are described, and the mathematical and statistical tools are considered.

  17. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  18. Are conventional statistical techniques exhaustive for defining metal background concentrations in harbour sediments? A case study: The Coastal Area of Bari (Southeast Italy).

    Science.gov (United States)

    Mali, Matilda; Dell'Anna, Maria Michela; Mastrorilli, Piero; Damiani, Leonardo; Ungaro, Nicola; Belviso, Claudia; Fiore, Saverio

    2015-11-01

    Sediment contamination by metals poses significant risks to coastal ecosystems and is considered to be problematic for dredging operations. The determination of the background values of metal and metalloid distribution based on site-specific variability is fundamental in assessing pollution levels in harbour sediments. The novelty of the present work consists of addressing the scope and limitation of analysing port sediments through the use of conventional statistical techniques (such as: linear regression analysis, construction of cumulative frequency curves and the iterative 2σ technique), that are commonly employed for assessing Regional Geochemical Background (RGB) values in coastal sediments. This study ascertained that although the tout court use of such techniques in determining the RGB values in harbour sediments seems appropriate (the chemical-physical parameters of port sediments fit well with statistical equations), it should nevertheless be avoided because it may be misleading and can mask key aspects of the study area that can only be revealed by further investigations, such as mineralogical and multivariate statistical analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
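
    The abstract names the iterative 2σ technique without describing it. In its commonly described form, values outside mean ± 2σ are discarded and the statistics recomputed until no value is excluded; the surviving values are taken to represent the background. The sketch below follows that form, which is an assumption about the procedure rather than the authors' exact implementation, and uses invented concentrations.

```python
import statistics


def iterative_two_sigma(values):
    """Repeatedly drop values outside mean +/- 2 sigma until none remain.

    Returns the values that survive the trimming.
    """
    data = list(values)
    while len(data) > 2:
        mu = statistics.mean(data)
        sd = statistics.stdev(data)
        kept = [v for v in data if abs(v - mu) <= 2 * sd]
        if len(kept) == len(data):
            break  # every remaining value lies within the 2-sigma band
        data = kept
    return data


# Hypothetical metal concentrations with one contaminated sample.
concentrations = [10, 11, 12, 11, 10, 12, 11, 10, 12, 11, 60]
background = iterative_two_sigma(concentrations)
mu = statistics.mean(background)
sd = statistics.stdev(background)
print(f"background range: {mu - 2 * sd:.2f} to {mu + 2 * sd:.2f}")
```

    As the abstract cautions, a clean statistical fit of this kind does not show that the retained values are natural background; it only shows that they are mutually consistent.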

  19. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    Science.gov (United States)

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  20. A multivariate statistical study on a diversified data gathering system for nuclear power plants

    International Nuclear Information System (INIS)

    Samanta, P.K.; Teichmann, T.; Levine, M.M.; Kato, W.Y.

    1989-02-01

    In this report, multivariate statistical methods are presented and applied to demonstrate their use in analyzing nuclear power plant operational data. For analyses of nuclear power plant events, approaches are presented for detecting malfunctions and degradations within the course of the event. At the system level, approaches are investigated as a means of diagnosis of system level performance. This involves the detection of deviations from normal performance of the system. The input data analyzed are the measurable physical parameters, such as steam generator level, pressurizer water level, auxiliary feedwater flow, etc. The study provides the methodology and illustrative examples based on data gathered from simulation of nuclear power plant transients and computer simulation of a plant system performance (due to lack of easily accessible operational data). Such an approach, once fully developed, can be used to explore statistically the detection of failure trends and patterns and prevention of conditions with serious safety implications. 33 refs., 18 figs., 9 tabs

  1. Application of Multivariate Statistical Analysis in Evaluation of Surface River Water Quality of a Tropical River

    Directory of Open Access Journals (Sweden)

    Teck-Yee Ling

    2017-01-01

    The present study evaluated the spatial variations of surface water quality in a tropical river using multivariate statistical techniques, including cluster analysis (CA) and principal component analysis (PCA). Twenty physicochemical parameters were measured at 30 stations along the Batang Baram and its tributaries. The water quality of the Batang Baram was categorized as “slightly polluted”, with chemical oxygen demand and total suspended solids the most deteriorated parameters. The CA grouped the 30 stations into four clusters which shared similar characteristics within the same cluster, representing the upstream, middle, and downstream regions of the main river and the tributaries from the middle to downstream regions of the river. The PCA yielded six principal components that explained 83.6% of the data set variance. The first PC indicated that total suspended solids, turbidity, and hydrogen sulphide were the dominant polluting factors, which is attributed to logging activities, followed by five-day biochemical oxygen demand, total phosphorus, organic nitrogen, and nitrate-nitrogen in the second PC, which are related to discharges of domestic wastewater. The components also imply that logging is the major anthropogenic activity responsible for water quality variations in the Batang Baram when compared to domestic wastewater discharge.
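
    The "share of variance explained" reported for the retained principal components can be computed as sketched below. This is an illustration only: it assumes PCA on the correlation matrix (standardized variables), which the abstract does not state, and it uses simple power iteration with deflation, which may converge slowly when eigenvalues are close. Function names are hypothetical.

```python
import math


def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (one row per station)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def top_eigenvalues(matrix, k, iters=500):
    """Leading k eigenvalues of a symmetric PSD matrix (power iteration + deflation)."""
    p = len(matrix)
    m = [row[:] for row in matrix]
    values = []
    for _ in range(k):
        v = [float(i + 1) for i in range(p)]  # arbitrary non-zero start vector
        lam = 0.0
        for _step in range(iters):
            w = [sum(m[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
            lam = norm  # converges to the dominant eigenvalue
        values.append(lam)
        # deflate: remove the component just found
        for a in range(p):
            for b in range(p):
                m[a][b] -= lam * v[a] * v[b]
    return values


def explained_variance_ratio(rows, k):
    """Fraction of total variance carried by each of the first k components."""
    corr = correlation_matrix(rows)
    total = sum(corr[i][i] for i in range(len(corr)))
    return [lam / total for lam in top_eigenvalues(corr, k)]
```

    Summing the returned ratios over the retained components gives the cumulative figure that abstracts of this kind quote.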

  2. Multivariate statistical assessment of heavy metal pollution sources of groundwater around a lead and zinc plant

    Directory of Open Access Journals (Sweden)

    Zamani Abbas Ali

    2012-12-01

    The contamination of groundwater by heavy metal ions around a lead and zinc plant has been studied. As a case study, groundwater contamination in Bonab Industrial Estate (Zanjan-Iran) for iron, cobalt, nickel, copper, zinc, cadmium and lead content was investigated using differential pulse polarography (DPP). Although cobalt, copper and zinc were found in 47.8%, 100.0%, and 100.0% of the samples, respectively, none of the samples contained these metals above their maximum contaminant levels (MCLs). Cadmium was detected in 65.2% of the samples and 17.4% of them were polluted by this metal. All samples contained detectable levels of lead and iron, with 8.7% and 13.0% of the samples, respectively, exceeding their MCLs. Nickel was also found in 78.3% of the samples, out of which 8.7% were polluted. In general, the results revealed the contamination of groundwater sources in the studied zone. The higher health risks are related to lead, nickel, and cadmium ions. Multivariate statistical techniques were applied for interpreting the experimental data and giving a description for the sources. The data analysis showed correlations and similarities between the investigated heavy metals and helped to classify these ion groups. Cluster analysis identified five clusters among the studied heavy metals. Cluster 1 consisted of Pb and Cu, and cluster 3 included Cd and Fe; each of the elements Zn, Co and Ni formed a single-member group. The same results were obtained by factor analysis. Statistical investigations revealed that anthropogenic factors, notably the lead and zinc plant, and pedo-geochemical pollution sources are influencing water quality in the studied area.

  3. Multivariate statistical assessment of heavy metal pollution sources of groundwater around a lead and zinc plant.

    Science.gov (United States)

    Zamani, Abbas Ali; Yaftian, Mohammad Reza; Parizanganeh, Abdolhossein

    2012-12-17

    The contamination of groundwater by heavy metal ions around a lead and zinc plant has been studied. As a case study, groundwater contamination in Bonab Industrial Estate (Zanjan-Iran) for iron, cobalt, nickel, copper, zinc, cadmium and lead content was investigated using differential pulse polarography (DPP). Although cobalt, copper and zinc were found in 47.8%, 100.0%, and 100.0% of the samples, respectively, none of the samples contained these metals above their maximum contaminant levels (MCLs). Cadmium was detected in 65.2% of the samples and 17.4% of them were polluted by this metal. All samples contained detectable levels of lead and iron, with 8.7% and 13.0% of the samples, respectively, exceeding their MCLs. Nickel was also found in 78.3% of the samples, out of which 8.7% were polluted. In general, the results revealed the contamination of groundwater sources in the studied zone. The higher health risks are related to lead, nickel, and cadmium ions. Multivariate statistical techniques were applied for interpreting the experimental data and giving a description for the sources. The data analysis showed correlations and similarities between the investigated heavy metals and helped to classify these ion groups. Cluster analysis identified five clusters among the studied heavy metals. Cluster 1 consisted of Pb and Cu, and cluster 3 included Cd and Fe; each of the elements Zn, Co and Ni formed a single-member group. The same results were obtained by factor analysis. Statistical investigations revealed that anthropogenic factors, notably the lead and zinc plant, and pedo-geochemical pollution sources are influencing water quality in the studied area.

  4. The application of statistical and/or non-statistical sampling techniques by internal audit functions in the South African banking industry

    Directory of Open Access Journals (Sweden)

    D.P. van der Nest

    2015-03-01

    This article explores the use of audit sampling techniques by internal audit functions to test the effectiveness of controls in the banking sector. The article focuses specifically on the use of statistical and/or non-statistical sampling techniques by internal auditors. The focus of the research for this article was internal audit functions in the banking sector of South Africa. The results discussed in the article indicate that audit sampling is still used frequently as an audit evidence-gathering technique. Non-statistical sampling techniques are used more frequently than statistical sampling techniques for the evaluation of the sample. In addition, both techniques are regarded as important for the determination of the sample size and the selection of the sample items.

  5. Multivariate strategies in functional magnetic resonance imaging

    DEFF Research Database (Denmark)

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a 'mind reading' predictive multivariate fMRI model...

  6. Comparison of multivariate and univariate statistical process control and monitoring methods

    International Nuclear Information System (INIS)

    Leger, R.P.; Garland, WM.J.; Macgregor, J.F.

    1996-01-01

    Work in recent years has led to the development of multivariate process monitoring schemes which use Principal Component Analysis (PCA). This research compares the performance of a univariate scheme and a multivariate PCA scheme used for monitoring a simple process with 11 measured variables. The multivariate PCA scheme was able to adequately represent the process using two principal components. This resulted in a PCA monitoring scheme which used two charts, as opposed to 11 charts for the univariate scheme, and therefore had distinct advantages in terms of data representation, presentation, and fault diagnosis capabilities. (author)

  7. Study of Syrian archaeological pottery by the combined application of thermoluminescence (TL) dating, X-ray fluorescence analysis and statistical multivariate analysis

    International Nuclear Information System (INIS)

    Bakraji, E.H.

    2012-01-01

    The X-ray fluorescence method and the technique of thermoluminescence (TL) dating have been utilized for the study of archaeological pottery fragment samples, fairly representative of the Roman period between the 1st century B.C. and the 4th century A.D., from the Judaidet Yabous site, which is located north-west of Damascus city, Syria. Four samples were chosen randomly among the forty-six samples for dating using the thermoluminescence technique, and the results were in good agreement with the date assigned by archaeologists. The samples were irradiated for 1000 s live time twice, first using a Mo X-ray tube and second using a 109Cd radioactive source. Fifteen elements (K, Ca, Ti, Mn, Fe, Ni, Cu, Zn, Ga, Rb, Sr, Y, Zr, Nb, and Pb) were determined. The elemental concentrations have been processed using two multivariate statistical methods. The purpose of the study was to characterize, by means of elemental contents, the pottery paste from the Judaidet Yabous archaeological site and to provide new data to the Syrian databases for future studies. From an archaeological point of view, the results indicated that most of the pottery was locally produced. (author)

  8. A Framework for Establishing Standard Reference Scale of Texture by Multivariate Statistical Analysis Based on Instrumental Measurement and Sensory Evaluation.

    Science.gov (United States)

    Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye

    2016-01-13

    A framework for establishing a standard reference scale (texture) is proposed, using multivariate statistical analysis of instrumental measurement and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework method is verified by establishing a standard reference scale of a texture attribute (hardness) with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were determined to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for quantitative standard reference scale establishment on food texture characteristics.
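
    The agreement with Stevens's theory mentioned above can be illustrated with a log-log least-squares fit. This sketch assumes the power-law form of Stevens's law, sensation = k · intensity^a, which the abstract does not spell out; the function name and any data passed to it are hypothetical, and R² is computed on the log scale.

```python
import math


def fit_power_law(intensity, sensation):
    """Fit sensation = k * intensity**a by ordinary least squares on logs.

    Returns (k, a, r_squared), with r_squared measured on the log-log scale.
    """
    xs = [math.log(v) for v in intensity]
    ys = [math.log(v) for v in sensation]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                      # exponent of the power law
    log_k = mean_y - a * mean_x        # intercept on the log scale
    ss_res = sum((y - (log_k + a * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_k), a, 1.0 - ss_res / ss_tot
```

    With instrumental hardness as `intensity` and panel scores as `sensation`, an R² near 1 would indicate that the reference foods follow a power-law scale.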

  9. Multivariate spatial Gaussian mixture modeling for statistical clustering of hemodynamic parameters in functional MRI

    International Nuclear Information System (INIS)

    Fouque, A.L.; Ciuciu, Ph.; Risser, L.

    2009-01-01

    In this paper, a novel statistical parcellation of intra-subject functional MRI (fMRI) data is proposed. The key idea is to identify functionally homogeneous regions of interest from their hemodynamic parameters. To this end, a non-parametric voxel-based estimation of the hemodynamic response function is performed as a prerequisite. Then, the extracted hemodynamic features are entered as the input data of a Multivariate Spatial Gaussian Mixture Model (MSGMM) to be fitted. The goal of the spatial aspect is to favor the recovery of connected components in the mixture. Our statistical clustering approach is original in the sense that it extends existing work on univariate spatially regularized Gaussian mixtures. A specific Gibbs sampler is derived to account for different covariance structures in the feature space. On realistic artificial fMRI datasets, it is shown that our algorithm is helpful for identifying a parsimonious functional parcellation required in the context of joint detection estimation of brain activity. This allows us to overcome the classical assumption of spatial stationarity of the BOLD signal model. (authors)

  10. Genetic divergence of rubber tree estimated by multivariate techniques and microsatellite markers

    Directory of Open Access Journals (Sweden)

    Lígia Regina Lima Gouvêa

    2010-01-01

    Genetic diversity of 60 Hevea genotypes, consisting of Asiatic, Amazonian, African and IAC clones, and pertaining to the genetic breeding program of the Agronomic Institute (IAC), Brazil, was estimated. Analyses were based on phenotypic multivariate parameters and microsatellites. Five agronomic descriptors were employed in multivariate procedures, such as Standard Euclidean Distance, Tocher clustering and principal component analysis. Genetic variability among the genotypes was estimated with 68 selected polymorphic SSRs, by way of Modified Rogers Genetic Distance and UPGMA clustering. Structure software in a Bayesian approach was used in discriminating among groups. Genetic diversity was estimated through Nei's statistics. The genotypes were clustered into 12 groups according to the Tocher method, while the molecular analysis identified six groups. In the phenotypic and microsatellite analyses, the Amazonian and IAC genotypes were distributed in several groups, whereas the Asiatic were in only a few. Observed heterozygosity ranged from 0.05 to 0.96. Both high total diversity (HT' = 0.58) and high gene differentiation (Gst' = 0.61) were observed, indicating high genetic variation among the 60 genotypes, which may be useful for breeding programs. The analyzed agronomic parameters and SSR markers were effective in assessing genetic diversity among Hevea genotypes, besides proving to be useful for characterizing genetic variability.

  11. Research Update: Spatially resolved mapping of electronic structure on atomic level by multivariate statistical analysis

    International Nuclear Information System (INIS)

    Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.; Sales, Brian C.; Sefat, Athena S.

    2014-01-01

    Atomic level spatial variability of electronic structure in the Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1−xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.

  12. Multivariate analysis of the cleaning efficacy of different final irrigation techniques in the canal and isthmus of mandibular posterior teeth

    Directory of Open Access Journals (Sweden)

    Yeon-Jee Yoo

    2013-08-01

    Objectives: The aim of this study was to compare the cleaning efficacy of different final irrigation regimens in the canal and isthmus of mandibular molars, and to evaluate the influence of related variables on the cleaning efficacy of the irrigation systems. Materials and Methods: Mesial root canals from 60 mandibular molars were prepared and divided into 4 experimental groups according to the final irrigation technique: Group C, syringe irrigation; Group U, ultrasonic activation; Group SC, VPro StreamClean irrigation; Group EV, EndoVac irrigation. Cross-sections at the 1, 3 and 5 mm levels from the apex were examined to calculate the remaining debris area in the canal and isthmus spaces. Statistical analysis was performed using the Kruskal-Wallis test and the Mann-Whitney U test for comparison among groups, and multivariate linear analysis to identify the significant variables (regular replenishment of irrigant, vapor lock management, and ultrasonic activation of irrigant) affecting the cleaning efficacy of the experimental groups. Results: Groups SC and EV showed significantly higher canal cleanliness values than groups C and U at the 1 mm level (p < 0.05), and higher isthmus cleanliness values than group U at 3 mm and group C at all levels (p < 0.05). Multivariate linear regression analysis demonstrated that all variables had a statistically significant independent positive correlation at the 1 mm level of the canal and at all levels of the isthmus. Conclusions: Both the VPro StreamClean and EndoVac systems showed favorable results as final irrigation regimens for cleaning debris in complicated root canal systems with curved canals and/or isthmi. Debridement of the isthmi, rather than of the canals, depends significantly on these variables.
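
    The pairwise group comparisons above rest on the Mann-Whitney U statistic, which can be sketched directly from its counting definition. This is an illustration only: the function name is hypothetical, no p-value is computed, and any cleanliness values passed in would be made-up inputs rather than data from the study.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b, with ties counted as
    one half; U_b is the complementary count, so U_a + U_b = len(a) * len(b).
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b
```

    The smaller of the two values is the one usually compared against a critical value or converted to a p-value with a normal approximation.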

  13. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    Energy Technology Data Exchange (ETDEWEB)

    Balabin, Roman M., E-mail: balabin@org.chem.ethz.ch [Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich (Switzerland); Safieva, Ravilya Z. [Gubkin Russian State University of Oil and Gas, 119991 Moscow (Russian Federation); Lomakina, Ekaterina I. [Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119992 Moscow (Russian Federation)

    2010-06-25

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods for gasoline classification: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP). Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8000 cm⁻¹ NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. The artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.

  14. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Safieva, Ravilya Z.; Lomakina, Ekaterina I.

    2010-01-01

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods for gasoline classification: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP). Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8000 cm⁻¹ NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. The artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.
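
    K-nearest neighbor classification, one of the methods this abstract reports among the most effective, can be sketched in a few lines. The sketch assumes Euclidean distance between spectra and a majority vote with k = 3; neither choice is stated in the abstract, and the function name and any spectra or labels used with it are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_spectra, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training spectra."""
    # Sort training samples by Euclidean distance to the query spectrum
    ranked = sorted(zip(train_spectra, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

    For example, with absorbance vectors as `train_spectra` and refinery names as `train_labels`, `knn_predict(train_spectra, train_labels, new_spectrum)` returns the refinery most common among the three closest spectra.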

  15. Multivariate statistical analysis to investigate the subduction zone parameters favoring the occurrence of giant megathrust earthquakes

    Science.gov (United States)

    Brizzi, S.; Sandri, L.; Funiciello, F.; Corbi, F.; Piromallo, C.; Heuret, A.

    2018-03-01

    The observed maximum magnitude of subduction megathrust earthquakes is highly variable worldwide. One key question is which conditions, if any, favor the occurrence of giant earthquakes (Mw ≥ 8.5). Here we carry out a multivariate statistical study in order to investigate the factors affecting the maximum magnitude of subduction megathrust earthquakes. We find that the trench-parallel extent of subduction zones and the thickness of trench sediments provide the largest discriminating capability between subduction zones that have experienced giant earthquakes and those having significantly lower maximum magnitude. Monte Carlo simulations show that the observed spatial distribution of giant earthquakes cannot be explained by pure chance to a statistically significant level. We suggest that the combination of a long subduction zone with thick trench sediments likely promotes a great lateral rupture propagation, characteristic of almost all giant earthquakes.

  16. Multiparametric statistics

    CERN Document Server

    Serdobolskii, Vadim Ivanovich

    2007-01-01

    This monograph presents a mathematical theory of statistical models described by an essentially large number of unknown parameters, comparable with the sample size or even much larger. In this sense, the proposed theory can be called "essentially multiparametric". It is developed on the basis of the Kolmogorov asymptotic approach, in which the sample size increases along with the number of unknown parameters. This theory opens a way to the solution of central problems of multivariate statistics which up until now have not been solved. Traditional statistical methods based on the idea of infinite sampling often break down in the solution of real problems and, depending on the data, can be inefficient, unstable and even inapplicable. In this situation, practical statisticians are forced to use various heuristic methods in the hope that they will find a satisfactory solution. The mathematical theory developed in this book presents a regular technique for implementing new, more efficient versions of statistical procedures. ...

  17. Graph-theoretic measures of multivariate association and prediction

    International Nuclear Information System (INIS)

    Friedman, J.H.; Rafsky, L.C.

    1983-01-01

    Interpoint-distance-based graphs can be used to define measures of association that extend Kendall's notion of a generalized correlation coefficient. The authors present particular statistics that provide distribution-free tests of independence sensitive to alternatives involving non-monotonic relationships. Moreover, since ordering plays no essential role, the ideas are fully applicable in a multivariate setting. The authors also define an asymmetric coefficient measuring the extent to which (a vector) X can be used to make single-valued predictions of (a vector) Y. The authors discuss various techniques for proving that such statistics are asymptotically normal. As an example of the effectiveness of their approach, the authors present an application to the examination of residuals from multiple regression. 18 references, 2 figures, 1 table

  18. Batch-to-Batch Quality Consistency Evaluation of Botanical Drug Products Using Multivariate Statistical Analysis of the Chromatographic Fingerprint

    OpenAIRE

    Xiong, Haoshu; Yu, Lawrence X.; Qu, Haibin

    2013-01-01

    Botanical drug products have batch-to-batch quality variability due to botanical raw materials and the current manufacturing process. The rational evaluation and control of product quality consistency are essential to ensure efficacy and safety. Chromatographic fingerprinting is an important and widely used tool to characterize the chemical composition of botanical drug products. Multivariate statistical analysis has shown its efficacy and applicability in the quality evaluation of many ...

  19. Spatial and temporal variation of water quality of a segment of Marikina River using multivariate statistical methods.

    Science.gov (United States)

    Chounlamany, Vanseng; Tanchuling, Maria Antonia; Inoue, Takanobu

    2017-09-01

    Payatas landfill in Quezon City, Philippines, releases leachate to the Marikina River through a creek. Multivariate statistical techniques were applied to study temporal and spatial variations in water quality of a segment of the Marikina River. The data set included 12 physico-chemical parameters for five monitoring stations over a year. Cluster analysis grouped the monitoring stations into four clusters and identified January-May as dry season and June-September as wet season. Principal components analysis showed that three latent factors underlie the data set, explaining 83% of its total variance. The chemical oxygen demand, biochemical oxygen demand, total dissolved solids, Cl⁻ and PO₄³⁻ are influenced by anthropogenic impact/eutrophication pollution from point sources. Total suspended solids, turbidity and SO₄²⁻ are influenced by rain and soil erosion. The highest state of pollution is at the Payatas creek outfall from March to May, whereas at downstream stations it is in May. The current study indicates that the river monitoring requires only four stations, nine water quality parameters and testing over three specific months of the year. The findings of this study imply that Payatas landfill requires a proper leachate collection and treatment system to reduce its impact on the Marikina River.

  20. PIXE multivariate statistics and OSL investigation for the classification and dating of archaeological pottery excavated at Tell Al-Rawda site, Syria

    Energy Technology Data Exchange (ETDEWEB)

    Bakraji, E.H., E-mail: cscientificl@aec.org.sy [Archaeometry Laboratory, Chemistry Department, Atomic Energy Commission of Syria, P. O. Box 6091, Damascus (Syrian Arab Republic); Rihawy, M.S. [Archaeometry Laboratory, Chemistry Department, Atomic Energy Commission of Syria, P. O. Box 6091, Damascus (Syrian Arab Republic); Castel, C. [CNRS – Maison de l’Orient et de la Méditerranée, Laboratoire “Archéorient”, CNRS/Université Lumière-Lyon 2 (France); Abboud, R. [Archaeometry Laboratory, Chemistry Department, Atomic Energy Commission of Syria, P. O. Box 6091, Damascus (Syrian Arab Republic)

    2015-03-15

    Highlights: • PIXE and OSL methods were used to classify and date pottery from the Tell Al-Rawda site. • Three groups were classified using PIXE, which suggests different sources of the clay. • OSL was used for dating the site and the date found was consistent with typology. Abstract: The Particle Induced X-ray Emission (PIXE) technique has been utilised to study 48 Syrian ancient pottery fragments taken from excavations at the Tell Al-Rawda site. Eighteen elements (Mg, Al, Si, P, S, K, Ca, Ti, Mn, Fe, Ni, Zn, As, Br, Rb, Sr, Y, and Pb) were determined. The elemental concentrations have been processed using two multivariate statistical methods to classify the pottery, whereby one main group and two other small groups were defined. In addition, four samples from different places on the site were subjected to optically stimulated luminescence (OSL) dating. The average age obtained using a single aliquot regeneration (SAR) protocol was found to be 4350 ± 240 years.

  1. PIXE multivariate statistics and OSL investigation for the classification and dating of archaeological pottery excavated at Tell Al-Rawda site, Syria

    International Nuclear Information System (INIS)

    Bakraji, E.H.; Rihawy, M.S.; Castel, C.; Abboud, R.

    2015-01-01

    Highlights: • PIXE and OSL methods were used to classify and date pottery from the Tell Al-Rawda site. • Three groups were classified using PIXE, which suggests different sources of the clay. • OSL was used for dating the site and the date found was consistent with typology. Abstract: The Particle Induced X-ray Emission (PIXE) technique has been utilised to study 48 Syrian ancient pottery fragments taken from excavations at the Tell Al-Rawda site. Eighteen elements (Mg, Al, Si, P, S, K, Ca, Ti, Mn, Fe, Ni, Zn, As, Br, Rb, Sr, Y, and Pb) were determined. The elemental concentrations have been processed using two multivariate statistical methods to classify the pottery, whereby one main group and two other small groups were defined. In addition, four samples from different places on the site were subjected to optically stimulated luminescence (OSL) dating. The average age obtained using a single aliquot regeneration (SAR) protocol was found to be 4350 ± 240 years.

  2. Hydrochemical evolution and groundwater flow processes in the Galilee and Eromanga basins, Great Artesian Basin, Australia: a multivariate statistical approach.

    Science.gov (United States)

    Moya, Claudio E; Raiber, Matthias; Taulis, Mauricio; Cox, Malcolm E

    2015-03-01

    The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a 3D geological model (previously developed) to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area resulting in a dominance of the Na-Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na-HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers occurs elsewhere controlled by geological structures, including between GAB aquifers and coal bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous-Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is being proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes. 
The integrated approach presented in this study which combines different complementary multivariate statistical techniques with a detailed assessment of the

  3. Multivariate analysis of heavy metal contamination using river sediment cores of Nankan River, northern Taiwan

    Science.gov (United States)

    Lee, An-Sheng; Lu, Wei-Li; Huang, Jyh-Jaan; Chang, Queenie; Wei, Kuo-Yen; Lin, Chin-Jung; Liou, Sofia Ya Hsuan

    2016-04-01

    Owing to Taiwan's geology and climate, its rivers generally carry large loads of suspended particles. Once these particles settle, they become sediments that are good sorbents for heavy metals in the river system. Consequently, sediments in low-flow-energy regions, such as estuaries, can record the contamination footprint. Seven sediment cores were collected along the Nankan River, northern Taiwan, which is seriously contaminated by factory, household and agricultural inputs. Physico-chemical properties of these cores were derived from an Itrax-XRF Core Scanner and grain size analysis. To interpret these complex data matrices, multivariate statistical techniques (cluster analysis, factor analysis and discriminant analysis) were applied. The statistical analysis indicates four types of sediment. One of them represents a contamination event, showing high concentrations of Cu, Zn, Pb, Ni and Fe and low concentrations of Si and Zr. Furthermore, factor analysis revealed three possible contamination sources for this type of sediment. The combination of sediment analysis and multivariate statistical techniques provides new insights into the contamination depositional history of the Nankan River and could be similarly applied to other river systems to determine the scale of anthropogenic contamination.

  4. The statistical chopper in the time-of-flight technique

    International Nuclear Information System (INIS)

    Albuquerque Vieira, J. de.

    1975-12-01

    A detailed study of the 'statistical' chopper and of the method of analysis of the data obtained by this technique is made. The study includes the basic ideas behind correlation methods applied in time-of-flight techniques; comparisons with the conventional chopper made by an analysis of statistical errors; the development of a FORTRAN computer programme to analyse experimental results; the presentation of the related fields of work to demonstrate the potential of this method and suggestions for future study together with the criteria for a time-of-flight experiment using the method being studied.

  5. Application of combined multivariate techniques for the description of time-resolved powder X-ray diffraction data

    Czech Academy of Sciences Publication Activity Database

    Taris, A.; Grosso, M.; Brundu, M.; Guida, V.; Viani, Alberto

    2017-01-01

    Vol. 50, No. 2 (2017), pp. 451-461 ISSN 1600-5767 R&D Projects: GA MŠk(CZ) LO1219 Keywords: in situ X-ray powder diffraction * amorphous content * chemically bonded ceramics * statistical total correlation spectroscopy * multivariate curve resolution Subject RIV: JJ - Other Materials OECD field: Materials engineering Impact factor: 2.495, year: 2016 http://journals.iucr.org/j/issues/2017/02/00/ap5006/index.html

  6. Data base for the analysis of compositional characteristics of coal seams and macerals. Final report - Part 10. Variability in the inorganic content of United States' coals: a multivariate statistical study

    Energy Technology Data Exchange (ETDEWEB)

    Glick, D.C.; Davis, A.

    1984-07-01

    The multivariate statistical techniques of correlation coefficients, factor analysis, and cluster analysis, implemented by computer programs, can be used to process a large data set and produce a summary of relationships between variables and between samples. These techniques were used to find relationships for data on the inorganic constituents of US coals. Three hundred thirty-five whole-seam channel samples from six US coal provinces were analyzed for inorganic variables. After consideration of the attributes of data expressed on ash basis and whole-coal basis, it was decided to perform complete statistical analyses on both data sets. Thirty variables expressed on whole-coal basis and twenty-six variables expressed on ash basis were used. For each inorganic variable, a frequency distribution histogram and a set of summary statistics was produced. These were subdivided to reveal the manner in which concentrations of inorganic constituents vary between coal provinces and between coal regions. Data collected on 124 samples from three stratigraphic groups (Pottsville, Monongahela, Allegheny) in the Appalachian region were studied using analysis of variance to determine degree of variability between stratigraphic levels. Most variables showed differences in mean values between the three groups. 193 references, 71 figures, 54 tables.

  7. Statistic techniques of process control for MTR type

    International Nuclear Information System (INIS)

    Oliveira, F.S.; Ferrufino, F.B.J.; Santos, G.R.T.; Lima, R.M.

    2002-01-01

    This work aims at introducing some improvements in the fabrication of MTR type fuel plates by applying statistical techniques of process control. The work was divided into four single steps and their data were analyzed for: fabrication of U3O8 fuel plates; fabrication of U3Si2 fuel plates; rolling of small lots of fuel plates; applying statistical tools and standard specifications to perform a comparative study of these processes. (author)

  8. Multivariate analysis techniques

    Energy Technology Data Exchange (ETDEWEB)

    Bendavid, Josh [European Organization for Nuclear Research (CERN), Geneva (Switzerland); Fisher, Wade C. [Michigan State Univ., East Lansing, MI (United States); Junk, Thomas R. [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)

    2016-01-01

    The end products of experimental data analysis are designed to be simple and easy to understand: hypothesis tests and measurements of parameters. But, the experimental data themselves are voluminous and complex. Furthermore, in modern collider experiments, many petabytes of data must be processed in search of rare new processes which occur together with much more copious background processes that are of less interest to the task at hand. The systematic uncertainties on the background may be larger than the expected signal in many cases. The statistical power of an analysis and its sensitivity to systematic uncertainty can therefore usually both be improved by separating signal events from background events with higher efficiency and purity.

  9. Extracting bb Higgs Decay Signals using Multivariate Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Smith, W Clarke; /George Washington U. /SLAC

    2012-08-28

    For low-mass Higgs boson production at ATLAS at $\sqrt{s} = 7$ TeV, the hard subprocess $gg \to h^0 \to b\bar{b}$ dominates but is in turn drowned out by background. We seek to exploit the intrinsic few-MeV mass width of the Higgs boson to observe it above the background in $b\bar{b}$-dijet mass plots. The mass resolution of existing mass-reconstruction algorithms is insufficient for this purpose due to jet combinatorics, that is, the algorithms cannot identify every jet that results from $b\bar{b}$ Higgs decay. We combine these algorithms using the neural net (NN) and boosted regression tree (BDT) multivariate methods in an attempt to improve the mass resolution. Events involving $gg \to h^0 \to b\bar{b}$ are generated using Monte Carlo methods with Pythia and then the Toolkit for Multivariate Analysis (TMVA) is used to train and test NNs and BDTs. For a 120 GeV Standard Model Higgs boson, the $m_{h^0}$-reconstruction width is reduced from 8.6 to 6.5 GeV. Most importantly, however, the methods used here allow for more advanced $m_{h^0}$-reconstructions to be created in the future using multivariate methods.

  10. Multivariate moment closure techniques for stochastic kinetic models

    International Nuclear Information System (INIS)

    Lakatos, Eszter; Ale, Angelique; Kirk, Paul D. W.; Stumpf, Michael P. H.

    2015-01-01

    Stochastic effects dominate many chemical and biochemical processes. Their analysis, however, can be computationally prohibitively expensive and a range of approximation schemes have been proposed to lighten the computational burden. These, notably the increasingly popular linear noise approximation and the more general moment expansion methods, perform well for many dynamical regimes, especially linear systems. At higher levels of nonlinearity, it comes to an interplay between the nonlinearities and the stochastic dynamics, which is much harder to capture correctly by such approximations to the true stochastic processes. Moment-closure approaches promise to address this problem by capturing higher-order terms of the temporally evolving probability distribution. Here, we develop a set of multivariate moment-closures that allows us to describe the stochastic dynamics of nonlinear systems. Multivariate closure captures the way that correlations between different molecular species, induced by the reaction dynamics, interact with stochastic effects. We use multivariate Gaussian, gamma, and lognormal closure and illustrate their use in the context of two models that have proved challenging to the previous attempts at approximating stochastic dynamics: oscillations in p53 and Hes1. In addition, we consider a larger system, Erk-mediated mitogen-activated protein kinases signalling, where conventional stochastic simulation approaches incur unacceptably high computational costs

  11. Multivariate moment closure techniques for stochastic kinetic models

    Energy Technology Data Exchange (ETDEWEB)

    Lakatos, Eszter, E-mail: e.lakatos13@imperial.ac.uk; Ale, Angelique; Kirk, Paul D. W.; Stumpf, Michael P. H., E-mail: m.stumpf@imperial.ac.uk [Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London SW7 2AZ (United Kingdom)

    2015-09-07

    Stochastic effects dominate many chemical and biochemical processes. Their analysis, however, can be computationally prohibitively expensive and a range of approximation schemes have been proposed to lighten the computational burden. These, notably the increasingly popular linear noise approximation and the more general moment expansion methods, perform well for many dynamical regimes, especially linear systems. At higher levels of nonlinearity, it comes to an interplay between the nonlinearities and the stochastic dynamics, which is much harder to capture correctly by such approximations to the true stochastic processes. Moment-closure approaches promise to address this problem by capturing higher-order terms of the temporally evolving probability distribution. Here, we develop a set of multivariate moment-closures that allows us to describe the stochastic dynamics of nonlinear systems. Multivariate closure captures the way that correlations between different molecular species, induced by the reaction dynamics, interact with stochastic effects. We use multivariate Gaussian, gamma, and lognormal closure and illustrate their use in the context of two models that have proved challenging to the previous attempts at approximating stochastic dynamics: oscillations in p53 and Hes1. In addition, we consider a larger system, Erk-mediated mitogen-activated protein kinases signalling, where conventional stochastic simulation approaches incur unacceptably high computational costs.

  12. Use of multivariate statistics to identify unreliable data obtained using CASA.

    Science.gov (United States)

    Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón

    2013-06-01

    In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatments or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as those observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
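
The preprocessing described in this record (standardize the data, then run PCA before clustering) can be sketched as follows. This is only an illustration in pure Python, not the authors' code: the data rows are invented, the helper names `standardize`, `correlation_matrix` and `first_component` are hypothetical, and power iteration is used here simply because it needs no external library. The hierarchical clustering step is not shown.

```python
import math

def standardize(rows):
    """Rescale every column to zero mean and unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Covariance of standardized data, which is the correlation matrix."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(c, iters=500):
    """Power iteration: leading eigenvalue and unit eigenvector of c."""
    p = len(c)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(c[i][j] * v[j] for j in range(p))
                 for i in range(p))
    return eigval, v

# Invented measurements (rows = samples, columns = parameters).
data = [[52.0, 31.0, 4.1], [48.0, 29.5, 3.9], [61.0, 36.0, 4.8],
        [20.0, 12.0, 2.0], [57.0, 33.5, 4.4], [45.0, 27.0, 3.6]]

z = standardize(data)
corr = correlation_matrix(z)
eigval, loading = first_component(corr)
explained = eigval / len(corr)   # trace of a correlation matrix equals p
scores = [sum(r[j] * loading[j] for j in range(len(loading))) for r in z]
```

Samples whose score on the first component lies far from the rest (the fourth row in this invented data) are the kind of measurement the record flags for closer inspection.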

  13. Multivariate calibration applied to the quantitative analysis of infrared spectra

    Energy Technology Data Exchange (ETDEWEB)

    Haaland, D.M.

    1991-01-01

    Multivariate calibration methods are very useful for improving the precision, accuracy, and reliability of quantitative spectral analyses. Spectroscopists can more effectively use these sophisticated statistical tools if they have a qualitative understanding of the techniques involved. A qualitative picture of the factor analysis multivariate calibration methods of partial least squares (PLS) and principal component regression (PCR) is presented using infrared calibrations based upon spectra of phosphosilicate glass thin films on silicon wafers. Comparisons of the relative prediction abilities of four different multivariate calibration methods are given based on Monte Carlo simulations of spectral calibration and prediction data. The success of multivariate spectral calibrations is demonstrated for several quantitative infrared studies. The infrared absorption and emission spectra of thin-film dielectrics used in the manufacture of microelectronic devices demonstrate rapid, nondestructive at-line and in-situ analyses using PLS calibrations. Finally, the application of multivariate spectral calibrations to reagentless analysis of blood is presented. We have found that the determination of glucose in whole blood taken from diabetics can be precisely monitored from the PLS calibration of either mid- or near-infrared spectra of the blood. Progress toward the non-invasive determination of glucose levels in diabetics is an ultimate goal of this research. 13 refs., 4 figs.

  14. Multivariate Process Control with Autocorrelated Data

    DEFF Research Database (Denmark)

    Kulahci, Murat

    2011-01-01

    As sensor and computer technology continues to improve, it has become normal to be confronted with high-dimensional data sets. As in many areas of industrial statistics, this brings forth various challenges in statistical process control and monitoring. This new high dimensional data...... often exhibit not only cross-correlation among the quality characteristics of interest but also serial dependence as a consequence of high sampling frequency and system dynamics. In practice, the most common method of monitoring multivariate data is through what is called Hotelling’s T² statistic....... In this paper, we discuss the effect of autocorrelation (when it is ignored) on multivariate control charts based on these methods and provide some practical suggestions and remedies to overcome this problem....
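
For readers unfamiliar with the statistic named in this record, the sketch below computes Hotelling's T² for a single new observation, T² = (x - m)' S⁻¹ (x - m), where m and S are the mean vector and sample covariance of an in-control reference sample. It is a minimal pure-Python illustration, not the paper's procedure: the reference data are invented, the helper names are hypothetical, and no control limit is computed.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance(rows, mean):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(mean)
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def hotelling_t2(x, mean, cov):
    """T^2 = d' S^-1 d with d = x - mean, computed without forming S^-1."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))

# Invented in-control reference sample of two correlated quality characteristics.
reference = [[10.1, 5.0], [9.8, 4.9], [10.3, 5.2], [9.9, 5.0],
             [10.0, 5.1], [10.2, 5.1], [9.7, 4.8], [10.0, 4.9]]
m = mean_vector(reference)
s = covariance(reference, m)

t2_ok = hotelling_t2([10.1, 5.05], m, s)   # follows the usual correlation
t2_off = hotelling_t2([10.3, 4.8], m, s)   # breaks the correlation pattern
```

The second point is unremarkable on each variable separately but scores a much larger T² because it violates the correlation structure. Note that this sketch treats observations as independent, which is exactly the assumption the record says fails for autocorrelated data.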

  15. Multivariate stochastic simulation with subjective multivariate normal distributions

    Science.gov (United States)

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
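
One common way to draw correlated normal variables in a Monte Carlo simulation is to multiply independent standard normal draws by a Cholesky factor of the covariance matrix. The sketch below shows that approach in pure Python. It is an assumption that this resembles the report's technique (the abstract is truncated), and the mean vector, covariance matrix and function names are all invented for illustration.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L @ L.T == a, for symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw(mean, low, rng):
    """One correlated draw: mean + L @ z, with z independent standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

# Invented subjective inputs: standard deviations 5 and 2, correlation 0.6.
mean = [100.0, 20.0]
cov = [[25.0, 6.0], [6.0, 4.0]]

low = cholesky(cov)
rng = random.Random(42)          # fixed seed so the run is repeatable
samples = [draw(mean, low, rng) for _ in range(20000)]
```

Treating the two variables as independent would amount to zeroing the off-diagonal entries of `cov`, which is the simplification the record argues against.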

  16. [Retrospective statistical analysis of clinical factors of recurrence in chronic subdural hematoma: correlation between univariate and multivariate analysis].

    Science.gov (United States)

    Takayama, Motoharu; Terui, Keita; Oiwa, Yoshitsugu

    2012-10-01

    Chronic subdural hematoma is common in elderly individuals and surgical procedures are simple. The recurrence rate of chronic subdural hematoma, however, varies from 9.2 to 26.5% after surgery. The authors studied factors associated with recurrence using univariate and multivariate analyses in patients with chronic subdural hematoma. We retrospectively reviewed 239 consecutive cases of chronic subdural hematoma who received burr-hole surgery with irrigation and closed-system drainage. We analyzed the relationships between recurrence of chronic subdural hematoma and factors such as sex, age, laterality, bleeding tendency, other complicated diseases, density on CT, volume of the hematoma, residual air in the hematoma cavity, and use of artificial cerebrospinal fluid. Twenty-one patients (8.8%) experienced a recurrence of chronic subdural hematoma. Multiple logistic regression found that the recurrence rate was higher in patients with a large volume of residual air, and was lower in patients using artificial cerebrospinal fluid. No statistical differences were found in bleeding tendency. Techniques to reduce the air in the hematoma cavity are important for good outcome in surgery of chronic subdural hematoma. Also, the use of artificial cerebrospinal fluid reduces recurrence of chronic subdural hematoma. The surgical procedures can be the same for patients with bleeding tendencies.

  17. An overview of multivariate gamma distributions as seen from a (multivariate) matrix exponential perspective

    DEFF Research Database (Denmark)

    Bladt, Mogens; Nielsen, Bo Friis

    2012-01-01

    Laplace transform. In a longer perspective stochastic and statistical analysis for MVME will in particular apply to any of the previously defined distributions. Multivariate gamma distributions have been used in a variety of fields like hydrology, [11], [10], [6], space (wind modeling) [9] reliability [3......Numerous definitions of multivariate exponential and gamma distributions can be retrieved from the literature [4]. These distributions belong to the class of Multivariate Matrix-Exponential Distributions (MVME) whenever their joint Laplace transform is a rational function. The majority...... of these distributions further belongs to an important subclass of MVME distributions [5, 1] where the multivariate random vector can be interpreted as a number of simultaneously collected rewards during sojourns in the states of a Markov chain with one absorbing state, the rest of the states being transient. We...

  18. Application of multivariate statistical analysis in the pollution and health risk of traffic-related heavy metals.

    Science.gov (United States)

    Ebqa'ai, Mohammad; Ibrahim, Bashar

    2017-12-01

    This study aims to analyse the heavy metal pollutants in Jeddah, the second largest city in the Gulf Cooperation Council, with a population exceeding 3.5 million and many vehicles. Ninety-eight street dust samples were collected seasonally from the six major roads as well as the Jeddah Beach, and subsequently digested using a modified Leeds Public Analyst method. The heavy metals (Fe, Zn, Mn, Cu, Cd, and Pb) were extracted from the ash using methyl isobutyl ketone as the extraction solvent and eventually analysed by atomic absorption spectroscopy. Multivariate statistical techniques, principal component analysis (PCA), and hierarchical cluster analysis were applied to these data. Heavy metal concentrations were ranked according to the following descending order: Fe > Zn > Mn > Cu > Pb > Cd. In order to study the pollution and health risk from these heavy metals as well as to estimate their effect on the environment, pollution indices, integrated pollution index, enrichment factor, daily dose average, hazard quotient, and hazard index were all analysed. The PCA showed high levels of Zn, Fe, and Cd in Al Kurnish road, while these elements were consistently detected on King Abdulaziz and Al Madina roads. The study indicates that high levels of Zn and Pb pollution were recorded for major roads in Jeddah. Six out of seven roads had high pollution indices. This study is the first step towards further investigations into current health problems in Jeddah, such as anaemia and asthma.

  19. Mulch materials in processing tomato: a multivariate approach

    Directory of Open Access Journals (Sweden)

    Marta María Moreno

    2013-08-01

    Mulch materials of different origins have been introduced into the agricultural sector in recent years as alternatives to the standard polyethylene because of its environmental impact. This study aimed to evaluate the multivariate response of mulch materials over three consecutive years in a processing tomato (Solanum lycopersicon L.) crop in Central Spain. Two biodegradable plastic mulches (BD1, BD2), one oxo-biodegradable material (OB), two types of paper (PP1, PP2), and one barley straw cover (BS) were compared using two control treatments (standard black polyethylene [PE] and manual weed control [MW]). A total of 17 variables relating to yield, fruit quality, and weed control were investigated. Several multivariate statistical techniques were applied, including principal component analysis, cluster analysis, and discriminant analysis. A group of mulch materials comprised of OB and BD2 was found to be comparable to black polyethylene regarding all the variables considered. The weed control variables were found to be an important source of discrimination. The two paper mulches tested did not share the same treatment group membership in any case: PP2 presented a multivariate response more similar to the biodegradable plastics, while PP1 was more similar to BS and MW. Based on our multivariate approach, the materials OB and BD2 can be used as an effective, more environmentally friendly alternative to polyethylene mulches.

  20. Multivariate data analysis approach to understand magnetic properties of perovskite manganese oxides

    International Nuclear Information System (INIS)

    Imamura, N.; Mizoguchi, T.; Yamauchi, H.; Karppinen, M.

    2008-01-01

    Here we apply statistical multivariate data analysis techniques to obtain some insights into the complex structure-property relations in antiferromagnetic (AFM) and ferromagnetic (FM) manganese perovskite systems, AMnO3. The 131 samples included in the present analyses are described by 21 crystal-structure or crystal-chemical (CS/CC) parameters. Principal component analysis (PCA), carried out separately for the AFM and FM compounds, is used to model and evaluate the various relationships among the magnetic properties and the various CS/CC parameters. Moreover, for the AFM compounds, PLS (partial least squares projections to latent structures) analysis is performed so as to predict the magnitude of the Néel temperature on the basis of the CS/CC parameters. Finally, the so-called PLS-DA (PLS discriminant analysis) method is employed to find out the most influential/characteristic CS/CC parameters that differentiate the two classes of compounds from each other. - Graphical abstract: Statistical multivariate data analysis techniques are applied to detect structure-property relations in antiferromagnetic (AFM) and ferromagnetic (FM) manganese perovskites. For AFM compounds, partial least squares projections to latent structures analysis predicts the magnitude of the Néel temperature on the basis of structural parameters only. Moreover, AFM and FM compounds are well separated by means of the so-called partial least squares discriminant analysis method.

  1. Forensic classification of counterfeit banknote paper by X-ray fluorescence and multivariate statistical methods.

    Science.gov (United States)

    Guo, Hongling; Yin, Baohua; Zhang, Jie; Quan, Yangke; Shi, Gaojun

    2016-09-01

    Counterfeiting of banknotes is a crime and seriously harmful to the economy. Examination of the paper, ink and toners used to make counterfeit banknotes can provide useful information to classify and link different cases in which the suspects use the same raw materials. In this paper, 21 paper samples of counterfeit banknotes seized from 13 cases were analyzed by wavelength dispersive X-ray fluorescence. After measuring the elemental composition in paper semi-quantitatively, the normalized weight percentage data of 10 elements were processed by the multivariate statistical methods of cluster analysis and principal component analysis. All these paper samples were mainly classified into 3 groups. Nine separate cases were successfully linked. It is demonstrated that elemental composition measured by XRF is a useful way to compare and classify papers used in different cases. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  2. Multivariate performance reliability prediction in real-time

    International Nuclear Information System (INIS)

    Lu, S.; Lu, H.; Kolarik, W.J.

    2001-01-01

    This paper presents a technique for predicting system performance reliability in real-time considering multiple failure modes. The technique includes on-line multivariate monitoring and forecasting of selected performance measures and conditional performance reliability estimates. The performance measures across time are treated as a multivariate time series. A state-space approach is used to model the multivariate time series. Recursive forecasting is performed by adopting Kalman filtering. The predicted mean vectors and covariance matrix of performance measures are used for the assessment of system survival/reliability with respect to the conditional performance reliability. The technique and modeling protocol discussed in this paper provide a means to forecast and evaluate the performance of an individual system in a dynamic environment in real-time. The paper also presents an example to demonstrate the technique
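
The recursive forecasting idea in this record can be sketched with a Kalman filter. The example below is a deliberately simplified, univariate stand-in for the multivariate state-space model the paper describes: it assumes a random-walk (local level) state, and the noise variances `q` and `r`, the failure `limit`, the measurements and all function names are hypothetical. The "conditional reliability" here is taken to be the probability that the next measurement stays below the limit under the Gaussian one-step forecast.

```python
import math

def kalman_step(x, p, y, q, r):
    """One predict/update cycle of a scalar local-level Kalman filter."""
    x_pred, p_pred = x, p + q              # predict: random-walk state
    k = p_pred / (p_pred + r)              # Kalman gain
    return x_pred + k * (y - x_pred), (1.0 - k) * p_pred

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def reliability(x, p, q, r, limit):
    """P(next measurement < limit); forecast variance is p + q + r."""
    return normal_cdf((limit - x) / math.sqrt(p + q + r))

# Invented degradation signal drifting toward a failure limit of 1.5.
measurements = [1.00, 1.04, 1.10, 1.13, 1.21, 1.26]
q, r, limit = 0.01, 0.04, 1.5

x, p = measurements[0], 1.0                # vague initial state variance
history = []
for y in measurements[1:]:
    x, p = kalman_step(x, p, y, q, r)
    history.append(reliability(x, p, q, r, limit))
```

Each new measurement updates the state estimate and its variance, and the reliability estimate is recomputed from the forecast distribution, which is the real-time loop the record outlines for a vector of performance measures.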

  3. Lasso and probabilistic inequalities for multivariate point processes

    DEFF Research Database (Denmark)

    Hansen, Niels Richard; Reynaud-Bouret, Patricia; Rivoirard, Vincent

    2015-01-01

    Due to its low computational cost, Lasso is an attractive regularization method for high-dimensional statistical settings. In this paper, we consider multivariate counting processes depending on an unknown function parameter to be estimated by linear combinations of a fixed dictionary. To select...... for multivariate Hawkes processes are proven, which allows us to check these assumptions by considering general dictionaries based on histograms, Fourier or wavelet bases. Motivated by problems of neuronal activity inference, we finally carry out a simulation study for multivariate Hawkes processes and compare our...... methodology with the adaptive Lasso procedure proposed by Zou in (J. Amer. Statist. Assoc. 101 (2006) 1418–1429). We observe an excellent behavior of our procedure. We rely on theoretical aspects for the essential question of tuning our methodology. Unlike adaptive Lasso of (J. Amer. Statist. Assoc. 101 (2006...

  4. Advanced structural equation modeling issues and techniques

    CERN Document Server

    Marcoulides, George A

    2013-01-01

    By focusing primarily on the application of structural equation modeling (SEM) techniques in example cases and situations, this book provides an understanding and working knowledge of advanced SEM techniques with a minimum of mathematical derivations. The book was written for a broad audience crossing many disciplines and assumes an understanding of graduate level multivariate statistics, including an introduction to SEM.

  5. A standards-based method for compositional analysis by energy dispersive X-ray spectrometry using multivariate statistical analysis: application to multicomponent alloys.

    Science.gov (United States)

    Rathi, Monika; Ahrenkiel, S P; Carapella, J J; Wanlass, M W

    2013-02-01

    Given an unknown multicomponent alloy, and a set of standard compounds or alloys of known composition, can one improve upon popular standards-based methods for energy dispersive X-ray (EDX) spectrometry to quantify the elemental composition of the unknown specimen? A method is presented here for determining elemental composition of alloys using transmission electron microscopy-based EDX with appropriate standards. The method begins with a discrete set of related reference standards of known composition, applies multivariate statistical analysis to those spectra, and evaluates the compositions with a linear matrix algebra method to relate the spectra to elemental composition. By using associated standards, only limited assumptions about the physical origins of the EDX spectra are needed. Spectral absorption corrections can be performed by providing an estimate of the foil thickness of one or more reference standards. The technique was applied to III-V multicomponent alloy thin films: composition and foil thickness were determined for various III-V alloys. The results were then validated by comparing with X-ray diffraction and photoluminescence analysis, demonstrating accuracy of approximately 1% in atomic fraction.

  6. Lightweight and Statistical Techniques for Petascale Debugging

    Energy Technology Data Exchange (ETDEWEB)

    Miller, Barton

    2014-06-30

    This project investigated novel techniques for debugging scientific applications on petascale architectures. In particular, we developed lightweight tools that narrow the problem space when bugs are encountered. We also developed techniques that either limit the number of tasks and the code regions to which a developer must apply a traditional debugger or that apply statistical techniques to provide direct suggestions of the location and type of error. We extended previous work on the Stack Trace Analysis Tool (STAT), which has already demonstrated scalability to over one hundred thousand MPI tasks. We also extended statistical techniques developed to isolate programming errors in widely used sequential or threaded applications in the Cooperative Bug Isolation (CBI) project to large scale parallel applications. Overall, our research substantially improved productivity on petascale platforms through a tool set for debugging that complements existing commercial tools. Previously, Office of Science application developers relied either on primitive manual debugging techniques based on printf or on tools, such as TotalView, that do not scale beyond a few thousand processors. However, bugs often arise at scale and substantial effort and computation cycles are wasted in either reproducing the problem in a smaller run that can be analyzed with the traditional tools or in repeated runs at scale that use the primitive techniques. New techniques that work at scale and automate the process of identifying the root cause of errors were needed. These techniques significantly reduced the time spent debugging petascale applications, thus leading to a greater overall amount of time for application scientists to pursue the scientific objectives for which the systems are purchased.
We developed a new paradigm for debugging at scale: techniques that reduced the debugging scenario to a scale suitable for traditional debuggers, e.g., by narrowing the search for the root-cause analysis

  7. A multivariate statistical methodology for detection of degradation and failure trends using nuclear power plant operational data

    International Nuclear Information System (INIS)

    Samanta, P.K.; Teichmann, T.

    1990-01-01

    In this paper, a multivariate statistical method is presented and demonstrated as a means for analyzing nuclear power plant transients (or events) and safety system performance for detection of malfunctions and degradations within the course of the event based on operational data. The study provides the methodology and illustrative examples based on data gathered from simulation of nuclear power plant transients (due to lack of easily accessible operational data). Such an approach, once fully developed, can be used to detect failure trends and patterns and so can lead to prevention of conditions with serious safety implications

  8. Multivariate Regression Analysis and Slaughter Livestock

    Science.gov (United States)

    (AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  9. An Automated Energy Detection Algorithm Based on Morphological and Statistical Processing Techniques

    Science.gov (United States)

    2018-01-09

    3. Statistical Processing 3.1 Statistical Analysis: Statistical analysis is the mathematical science...quantitative terms. In commercial prognostics and diagnostic vibrational monitoring applications, statistical techniques that are mainly used for alarm...Balakrishnan N, editors. Handbook of statistics. Amsterdam (Netherlands): Elsevier Science; 1998. p 555–602; (Order statistics and their applications

  10. Analysis and assessment on heavy metal sources in the coastal soils developed from alluvial deposits using multivariate statistical methods.

    Science.gov (United States)

    Li, Jinling; He, Ming; Han, Wei; Gu, Yifan

    2009-05-30

    An investigation of the sources of heavy metals (Cu, Zn, Ni, Pb, Cr, and Cd) in the coastal soils of Shanghai, China, was conducted using multivariate statistical methods (principal component analysis, clustering analysis, and correlation analysis). All the results of the multivariate analysis showed that: (i) Cu, Ni, Pb, and Cd had anthropogenic sources (e.g., overuse of chemical fertilizers and pesticides, industrial and municipal discharges, animal wastes, sewage irrigation, etc.); (ii) Zn and Cr were associated with parent materials and therefore had natural sources (e.g., the weathering process of parent materials and subsequent pedogenesis due to the alluvial deposits). The effect of heavy metals in the soils was strongly influenced by soil formation, atmospheric deposition, and human activities. These findings provide essential information on the possible sources of heavy metals, which should contribute to the monitoring and assessment of agricultural soils worldwide.

  11. I - Multivariate Classification and Machine Learning in HEP

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Traditional multivariate methods for classification (Stochastic Gradient Boosted Decision Trees and Multi-Layer Perceptrons) are explained in theory and practise using examples from HEP. General aspects of multivariate classification are discussed, in particular different regularisation techniques. Afterwards, data-driven techniques are introduced and compared to MC-based methods.

  12. Multivariate statistical analysis of stream sediments for mineral resources from the Craig NTMS Quadrangle, Colorado

    International Nuclear Information System (INIS)

    Beyth, M.; McInteer, C.; Broxton, D.E.; Bolivar, S.L.; Luke, M.E.

    1980-06-01

    Multivariate statistical analyses were carried out on Hydrogeochemical and Stream Sediment Reconnaissance data from the Craig quadrangle, Colorado, to support the National Uranium Resource Evaluation and to evaluate strategic or other important commercial mineral resources. A few areas for favorable uranium mineralization are suggested for parts of the Wyoming Basin, Park Range, and Gore Range. Six potential source rocks for uranium are postulated based on factor score mapping. Vanadium in stream sediments is suggested as a pathfinder for carnotite-type mineralization. A probable northwest trend of lead-zinc-copper mineralization associated with Tertiary intrusions is suggested. A few locations are mapped where copper is associated with cobalt. Concentrations of placer sands containing rare earth elements, probably of commercial value, are indicated for parts of the Sand Wash Basin

  13. A Cyber-Attack Detection Model Based on Multivariate Analyses

    Science.gov (United States)

    Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

    In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequences via quantification method IV, and collect similar audit event sequences into the same groups by cluster analysis. Simulation experiments show that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.

  14. Sparse multivariate measures of similarity between intra-modal neuroimaging datasets

    Directory of Open Access Journals (Sweden)

    Maria J. Rosa

    2015-10-01

    An increasing number of neuroimaging studies are now based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labelling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.

  15. Towards the disease biomarker in an individual patient using statistical health monitoring

    NARCIS (Netherlands)

    Engel, J.; Blanchet, L.M.; Engelke, U.F.; Wevers, R.A.; Buydens, L.M.

    2014-01-01

    In metabolomics, identification of complex diseases is often based on application of (multivariate) statistical techniques to the data. Commonly, each disease requires its own specific diagnostic model, separating healthy and diseased individuals, which is not very practical in a diagnostic setting.

  16. The classification of secondary colorectal liver cancer in human biopsy samples using angular dispersive x-ray diffraction and multivariate analysis

    International Nuclear Information System (INIS)

    Theodorakou, Chrysoula; Farquharson, Michael J

    2009-01-01

    The motivation behind this study is to assess whether angular dispersive x-ray diffraction (ADXRD) data, processed using multivariate analysis techniques, can be used for classifying secondary colorectal liver cancer tissue and normal surrounding liver tissue in human liver biopsy samples. The ADXRD profiles from a total of 60 samples of normal liver tissue and colorectal liver metastases were measured using a synchrotron radiation source. The data were analysed for 56 samples using nonlinear peak-fitting software. Four peaks were fitted to all of the ADXRD profiles, and the amplitude, area, amplitude and area ratios for three of the four peaks were calculated and used for the statistical and multivariate analysis. The statistical analysis showed that there are significant differences between all the peak-fitting parameters and ratios between the normal and the diseased tissue groups. The technique of soft independent modelling of class analogy (SIMCA) was used to classify normal liver tissue and colorectal liver metastases resulting in 67% of the normal tissue samples and 60% of the secondary colorectal liver tissue samples being classified correctly. This study has shown that the ADXRD data of normal and secondary colorectal liver cancer are statistically different and x-ray diffraction data analysed using multivariate analysis have the potential to be used as a method of tissue classification.

  17. Air Quality Pattern Assessment in Malaysia Using Multivariate Techniques

    International Nuclear Information System (INIS)

    Hamza Ahmad Isiyaka; Azman Azid

    2015-01-01

    This study aims to investigate the spatial characteristics in the pattern of air quality monitoring sites, identify the most discriminating parameters contributing to air pollution, and predict the level of the air pollution index (API) in Malaysia using multivariate techniques. Five parameters observed for five years (2000-2004) were used. Hierarchical agglomerative cluster analysis classified the five air quality monitoring sites into two independent groups based on the characteristics of activities in the monitoring stations. Discriminant analysis in standard, backward stepwise and forward stepwise modes gave a correct assignation of more than 87% in the confusion matrix. This result indicates that only three parameters (PM10, SO2 and NO2), with p < 0.0001, are the most discriminating parameters in air pollution. The major possible sources of air pollution were identified using principal component analysis, which accounted for more than 58% and 60% of the total variance. Based on the findings, anthropogenic activities (vehicular emission, industrial activities, construction sites, bush burning) have a strong influence on the sources of air pollution. Furthermore, an artificial neural network (ANN) was used to predict the level of the air pollution index, with R^2 = 0.8493 and RMSE = 5.9184. This indicates that the ANN can explain more than 84% of the variance in the API. (author)
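
    The two fit statistics quoted in this abstract, R^2 and RMSE, can be illustrated with a minimal sketch. It assumes the common definition R^2 = 1 - SS_res / SS_tot; the abstract does not state which definition the authors used, and the observed and predicted values below are hypothetical, not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical API values, for illustration only (not data from the study).
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]

fit_r2 = r_squared(observed, predicted)
fit_rmse = rmse(observed, predicted)
```

    Under this definition, an R^2 of 0.8493 means the model accounts for about 85% of the variance of the observed index, while the RMSE is expressed in the units of the index itself.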

  18. Multivariable Techniques for High-Speed Research Flight Control Systems

    Science.gov (United States)

    Newman, Brett A.

    1999-01-01

    This report describes the activities and findings of work conducted under contract with NASA Langley Research Center. Subject matter is the investigation of suitable multivariable flight control design methodologies and solutions for large, flexible high-speed vehicles. Specifically, methodologies are to address the inner control loops used for stabilization and augmentation of a highly coupled airframe system possibly involving rigid-body motion, structural vibrations, unsteady aerodynamics, and actuator dynamics. Design and analysis techniques considered in this body of work are both conventional-based and contemporary-based, and the vehicle of interest is the High-Speed Civil Transport (HSCT). Major findings include: (1) control architectures based on aft tail only are not well suited for highly flexible, high-speed vehicles, (2) theoretical underpinnings of the Wykes structural mode control logic are based on several assumptions concerning vehicle dynamic characteristics, and if these are not satisfied, the control logic can break down, leading to mode destabilization, (3) two-loop control architectures that utilize small forward vanes with the aft tail provide highly attractive and feasible solutions to the longitudinal axis control challenges, and (4) closed-loop simulation sizing analyses indicate the baseline vane model utilized in this report is most likely oversized for normal loading conditions.

  19. Multivariate analysis of data in sensory science

    CERN Document Server

    Naes, T; Risvik, E

    1996-01-01

    The state-of-the-art of multivariate analysis in sensory science is described in this volume. Both methods for aggregated and individual sensory profiles are discussed. Processes and results are presented in such a way that they can be understood not only by statisticians but also by experienced sensory panel leaders and users of sensory analysis. The techniques presented are focused on examples and interpretation rather than on the technical aspects, with an emphasis on new and important methods which are possibly not so well known to scientists in the field. Important features of the book are discussions on the relationship among the methods with a strong accent on the connection between problems and methods. All procedures presented are described in relation to sensory data and not as completely general statistical techniques. Sensory scientists, applied statisticians, chemometricians, those working in consumer science, food scientists and agronomers will find this book of value.

  20. HORIZONTAL BRANCH MORPHOLOGY OF GLOBULAR CLUSTERS: A MULTIVARIATE STATISTICAL ANALYSIS

    International Nuclear Information System (INIS)

    Jogesh Babu, G.; Chattopadhyay, Tanuka; Chattopadhyay, Asis Kumar; Mondal, Saptarshi

    2009-01-01

    The proper interpretation of horizontal branch (HB) morphology is crucial to the understanding of the formation history of stellar populations. In the present study, a multivariate analysis (principal component analysis) is used to select an appropriate HB morphology parameter, which, in our case, is the logarithm of the effective temperature extent of the HB (log T_eff,HB). This parameter is then expressed in terms of the most significant observed independent parameters of Galactic globular clusters (GGCs), separately for coherent groups obtained in a previous work, through a stepwise multiple regression technique. It is found that metallicity ([Fe/H]), central surface brightness (μ_v), and core radius (r_c) are the significant parameters explaining most of the variation in HB morphology (multiple R^2 ∼ 0.86) for GGCs belonging to the bulge/disk, while metallicity ([Fe/H]) and absolute magnitude (M_v) are responsible for GGCs belonging to the inner halo (multiple R^2 ∼ 0.52). The robustness is tested by taking 1000 bootstrap samples. A cluster analysis is performed for the red giant branch (RGB) stars of the GGCs belonging to the Galactic inner halo (Cluster 2). A multi-episodic star formation is preferred for RGB stars of GGCs belonging to this group. It supports the asymptotic giant branch (AGB) model in three episodes instead of two as suggested by Carretta et al. for halo GGCs, while the AGB model is suggested to be revisited for bulge/disk GGCs.
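
    The bootstrap robustness check mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' procedure: it resamples cases with replacement and refits an ordinary least-squares line with a single predictor, whereas the study used stepwise multiple regression. The data, the function names and the percentile interval are assumptions made for the example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_slopes(xs, ys, n_boot=1000, seed=0):
    """Refit the line on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: the slope is undefined
        slopes.append(fit_line(bx, by)[0])
    return slopes

# Synthetic data (hypothetical): y = 2x + 1 with a small alternating offset.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]

slopes = sorted(bootstrap_slopes(xs, ys))
# Percentile interval covering roughly the central 95% of bootstrap slopes.
lo = slopes[int(0.025 * len(slopes))]
hi = slopes[int(0.975 * len(slopes)) - 1]
```

    A narrow interval that excludes zero suggests the fitted coefficient is stable under resampling; a wide one suggests it depends heavily on a few objects in the sample.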

  1. [Monitoring method of extraction process for Schisandrae Chinensis Fructus based on near infrared spectroscopy and multivariate statistical process control].

    Science.gov (United States)

    Xu, Min; Zhang, Lei; Yue, Hong-Shui; Pang, Hong-Wei; Ye, Zheng-Liang; Ding, Li

    2017-10-01

    An on-line monitoring method was established for the extraction process of Schisandrae Chinensis Fructus, the formula medicinal material of Yiqi Fumai lyophilized injection, by combining near infrared spectroscopy with multi-variable data analysis technology. The multivariate statistical process control (MSPC) model was established based on 5 normal batches in production, and 2 test batches were monitored by PC scores, DModX and Hotelling T^2 control charts. The results showed that the MSPC model had a good monitoring ability for the extraction process. The application of the MSPC model to the actual production process could effectively achieve on-line monitoring for the extraction process of Schisandrae Chinensis Fructus, and could reflect the change of material properties in the production process in real time. This established process monitoring method could provide a reference for the application of process analysis technology in the process quality control of traditional Chinese medicine injections. Copyright© by the Chinese Pharmaceutical Association.

  2. Exact null distributions of quadratic distribution-free statistics for two-way classification

    NARCIS (Netherlands)

    Wiel, van de M.A.

    2004-01-01

    We present new techniques for computing exact distributions of 'Friedman-type' statistics. Representing the null distribution by a generating function allows for the use of general, not necessarily integer-valued rank scores. Moreover, we use symmetry properties of the multivariate

  3. Probability, statistics, and associated computing techniques

    International Nuclear Information System (INIS)

    James, F.

    1983-01-01

    This chapter attempts to explore the extent to which it is possible for the experimental physicist to find optimal statistical techniques to provide a unique and unambiguous quantitative measure of the significance of raw data. Discusses statistics as the inverse of probability; normal theory of parameter estimation; normal theory (Gaussian measurements); the universality of the Gaussian distribution; real-life resolution functions; combination and propagation of uncertainties; the sum or difference of 2 variables; local theory, or the propagation of small errors; error on the ratio of 2 discrete variables; the propagation of large errors; confidence intervals; classical theory; Bayesian theory; use of the likelihood function; the second derivative of the log-likelihood function; multiparameter confidence intervals; the method of MINOS; least squares; the Gauss-Markov theorem; maximum likelihood for uniform error distribution; the Chebyshev fit; the parameter uncertainties; the efficiency of the Chebyshev estimator; error symmetrization; robustness vs. efficiency; testing of hypotheses (e.g., the Neyman-Pearson test); goodness-of-fit; distribution-free tests; comparing two one-dimensional distributions; comparing multidimensional distributions; and permutation tests for comparing two point sets

  4. Combining heuristic and statistical techniques in landslide hazard assessments

    Science.gov (United States)

    Cepeda, Jose; Schwendtner, Barbara; Quan, Byron; Nadim, Farrokh; Diaz, Manuel; Molina, Giovanni

    2014-05-01

    As a contribution to the Global Assessment Report 2013 - GAR2013, coordinated by the United Nations International Strategy for Disaster Reduction - UNISDR, a drill-down exercise for landslide hazard assessment was carried out by entering the results of both heuristic and statistical techniques into a new but simple combination rule. The data available for this evaluation included landslide inventories, both historical and event-based. In addition to the application of a heuristic method used in the previous editions of GAR, the availability of inventories motivated the use of statistical methods. The heuristic technique is largely based on the Mora & Vahrson method, which estimates hazard as the product of susceptibility and triggering factors, where classes are weighted based on expert judgment and experience. Two statistical methods were also applied: the landslide index method, which estimates weights of the classes for the susceptibility and triggering factors based on the evidence provided by the density of landslides in each class of the factors; and the weights of evidence method, which extends the previous technique to include both positive and negative evidence of landslide occurrence in the estimation of weights for the classes. One key aspect during the hazard evaluation was the decision on the methodology to be chosen for the final assessment. Instead of opting for a single methodology, it was decided to combine the results of the three implemented techniques using a combination rule based on a normalization of the results of each method. The hazard evaluation was performed for both earthquake- and rainfall-induced landslides. The country chosen for the drill-down exercise was El Salvador. The results indicate that highest hazard levels are concentrated along the central volcanic chain and at the centre of the northern mountains.
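
    The landslide index weighting and the normalization-based combination described above can be sketched as follows. Two assumptions are made here: the class weight is taken as the natural logarithm of the landslide density in a class divided by the overall density, which is one common formulation of the landslide index method, and the combination rule is a plain average of min-max normalized scores. The abstract does not give the actual rule, and all areas and scores are hypothetical.

```python
import math

def landslide_index_weights(landslide_area, class_area):
    """Weight per class: ln(landslide density in class / overall density)."""
    overall = sum(landslide_area.values()) / sum(class_area.values())
    weights = {}
    for name, area in class_area.items():
        density = landslide_area[name] / area
        # ln(0) is undefined, so classes without landslides get None here.
        weights[name] = math.log(density / overall) if density > 0 else None
    return weights

def min_max(values):
    """Rescale a list of scores to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def combine(*method_scores):
    """Average the normalized scores of several methods, cell by cell."""
    normalized = [min_max(scores) for scores in method_scores]
    return [sum(column) / len(column) for column in zip(*normalized)]

# Hypothetical slope classes with areas in km^2 (not data from the study).
class_area = {"gentle": 60.0, "moderate": 30.0, "steep": 10.0}
landslide_area = {"gentle": 1.0, "moderate": 3.0, "steep": 6.0}
weights = landslide_index_weights(landslide_area, class_area)

# Hypothetical hazard scores from two methods for the same three map cells.
combined = combine([0.0, 5.0, 10.0], [1.0, 2.0, 3.0])
```

    Positive weights mark classes where landslides are denser than average, negative weights mark classes where they are sparser. Normalizing before averaging keeps a method with a larger numeric range from dominating the combined score.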

  5. Visual Analysis of North Atlantic Hurricane Trends Using Parallel Coordinates and Statistical Techniques

    Science.gov (United States)

    2008-07-07

    analyzing multivariate data sets. The system was developed using the Java Development Kit (JDK) version 1.5; and it yields interactive performance on a... script and captures output from MATLAB’s “regress” and “stepwisefit” utilities that perform simple and stepwise regression, respectively. The MATLAB...Statistical Association, vol. 85, no. 411, pp. 664–675, 1990. [9] H. Hauser, F. Ledermann, and H. Doleisch, “Angular brushing of extended parallel coordinates

  6. MULTIVARIATE TECHNIQUES APPLIED TO EVALUATION OF LIGNOCELLULOSIC RESIDUES FOR BIOENERGY PRODUCTION

    Directory of Open Access Journals (Sweden)

    Thiago de Paula Protásio

    2013-12-01

    http://dx.doi.org/10.5902/1980509812361 The evaluation of lignocellulosic wastes for bioenergy production requires considering several characteristics and properties that may be correlated. This calls for multivariate analysis techniques that allow the evaluation of relevant energetic factors. This work aimed to apply cluster analysis and principal components analysis for the selection and evaluation of lignocellulosic wastes for bioenergy production. Eight types of residual biomass were used, for which the elemental components (C, H, O, N, S) content, lignin, total extractives and ash contents, basic density and higher and lower heating values were determined. Both multivariate techniques applied for evaluation and selection of lignocellulosic wastes were efficient, and similarities were observed between the biomass groups formed by them. Through the interpretation of the first principal component obtained, it was possible to create a global development index for evaluating the viability of energetic uses of biomass. The interpretation of the second principal component allowed a contrast of nitrogen and sulfur contents with oxygen content.

  7. Statistical and Machine-Learning Data Mining Techniques for Better Predictive Modeling and Analysis of Big Data

    CERN Document Server

    Ratner, Bruce

    2011-01-01

    The second edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The first edition, titled Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, contained 17 chapters of innovative and practical statistical data mining techniques. In this second edition, renamed to reflect the increased coverage of machine-learning data mining techniques, the author has

  8. Using the expected detection delay to assess the performance of different multivariate statistical process monitoring methods for multiplicative and drift faults.

    Science.gov (United States)

    Zhang, Kai; Shardt, Yuri A W; Chen, Zhiwen; Peng, Kaixiang

    2017-03-01

    The use of the expected detection delay (EDD) index to measure the performance of multivariate statistical process monitoring (MSPM) methods for constant additive faults has recently been developed. This paper, based on a statistical investigation of the T^2- and Q-test statistics, extends the EDD index to the multiplicative and drift fault cases. The index is also used to assess the performance of common MSPM methods that adopt these two test statistics. Based on how they use the measurement space, these methods can be divided into two groups: those that consider the complete measurement space, for example, principal component analysis-based methods, and those that consider only some subspace that reflects changes in key performance indicators, such as partial least squares-based methods. Furthermore, a generic form for their use of the T^2- and Q-test statistics is given. With the extended EDD index, the performance of these methods in detecting drift and multiplicative faults is assessed using both numerical simulations and the Tennessee Eastman process. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  9. The application of statistical techniques to nuclear materials accountancy

    International Nuclear Information System (INIS)

    Annibal, P.S.; Roberts, P.D.

    1990-02-01

    Over the past decade much theoretical research has been carried out on the development of statistical methods for nuclear materials accountancy. In practice plant operation may differ substantially from the idealized models often cited. This paper demonstrates the importance of taking account of plant operation in applying the statistical techniques, to improve the accuracy of the estimates and the knowledge of the errors. The benefits are quantified either by theoretical calculation or by simulation. Two different aspects are considered; firstly, the use of redundant measurements to reduce the error on the estimate of the mass of heavy metal in an accountancy tank is investigated. Secondly, a means of improving the knowledge of the 'Material Unaccounted For' (the difference between the inventory calculated from input/output data, and the measured inventory), using information about the plant measurement system, is developed and compared with existing general techniques. (author)

  10. Air Quality Forecasting through Different Statistical and Artificial Intelligence Techniques

    Science.gov (United States)

    Mishra, D.; Goyal, P.

    2014-12-01

    Urban air pollution forecasting has emerged as an acute problem in recent years because of severe environmental degradation due to the increase in harmful air pollutants in the ambient atmosphere. In this study, different types of statistical as well as artificial intelligence techniques are used for forecasting and analysis of air pollution over the Delhi urban area. These techniques are principal component analysis (PCA), multiple linear regression (MLR) and artificial neural network (ANN), and the forecasts are in good agreement with the concentrations observed by the Central Pollution Control Board (CPCB) at different locations in Delhi. However, such methods have disadvantages: they provide limited accuracy because they are unable to predict the extreme points, i.e., the pollution maximum and minimum cut-offs cannot be determined using such an approach. Such methods are also an inefficient approach to better output forecasting. With the advancement in technology and research, an alternative to the above traditional methods has been proposed: coupling statistical techniques with artificial intelligence (AI) for forecasting purposes. The coupling of PCA, ANN and fuzzy logic is used for forecasting air pollutants over the Delhi urban area. The statistical measures, e.g., correlation coefficient (R), normalized mean square error (NMSE), fractional bias (FB) and index of agreement (IOA), of the proposed model are in better agreement than those of all the other models. Hence, the coupling of statistical and artificial intelligence techniques can be used for forecasting air pollutants over urban areas.

  11. Graphics for the multivariate two-sample problem

    International Nuclear Information System (INIS)

    Friedman, J.H.; Rafsky, L.C.

    1981-01-01

    Some graphical methods for comparing multivariate samples are presented. These methods are based on minimal spanning tree techniques developed for multivariate two-sample tests. The utility of these methods is illustrated through examples using both real and artificial data
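
    The minimal spanning tree idea behind these two-sample methods can be illustrated with a short sketch. It pools the two samples, builds the Euclidean minimal spanning tree with Prim's algorithm, and counts the tree edges that join points from different samples; few such edges suggest that the samples occupy different regions. The function names and data are hypothetical, and the sketch stops at the edge count without computing a reference distribution or significance level.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def cross_sample_edges(sample_a, sample_b):
    """Number of tree edges that link a point of sample A to one of sample B."""
    points = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    return sum(1 for i, j in mst_edges(points) if labels[i] != labels[j])

# Hypothetical, well-separated samples in two dimensions.
sample_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sample_b = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
separated_count = cross_sample_edges(sample_a, sample_b)
```

    Thoroughly mixed samples produce many cross-sample edges, since neighbouring points in the tree then tend to come from different samples.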

  12. Multivariate alteration detection (MAD) in multispectral, bi-temporal image data: A new approach to change detection studies

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Conradsen, Knut

    This paper introduces a new orthogonal transformation, the multivariate alteration detection (MAD) transformation, based on an established multivariate statistical technique, canonical correlation analysis. The theory for canonical correlation analysis is sketched and a result necessary......

  13. Multivariate Bonferroni-type inequalities theory and applications

    CERN Document Server

    Chen, John

    2014-01-01

    Multivariate Bonferroni-Type Inequalities: Theory and Applications presents a systematic account of research discoveries on multivariate Bonferroni-type inequalities published in the past decade. The emergence of new bounding approaches pushes the conventional definitions of optimal inequalities and demands new insights into linear and Fréchet optimality. The book explores these advances in bounding techniques with corresponding innovative applications. It presents the method of linear programming for multivariate bounds, multivariate hybrid bounds, sub-Markovian bounds, and bounds using Hamil

  14. Multivariate statistical process control of a continuous pharmaceutical twin-screw granulation and fluid bed drying process.

    Science.gov (United States)

    Silva, A F; Sarraguça, M C; Fonteyne, M; Vercruysse, J; De Leersnyder, F; Vanhoorne, V; Bostijn, N; Verstraeten, M; Vervaet, C; Remon, J P; De Beer, T; Lopes, J A

    2017-08-07

    A multivariate statistical process control (MSPC) strategy was developed for the monitoring of the ConsiGma™-25 continuous tablet manufacturing line. Thirty-five logged variables encompassing three major units, being a twin screw high shear granulator, a fluid bed dryer and a product control unit, were used to monitor the process. The MSPC strategy was based on principal component analysis of data acquired under normal operating conditions using a series of four process runs. Runs with imposed disturbances in the dryer air flow and temperature, in the granulator barrel temperature, speed and liquid mass flow and in the powder dosing unit mass flow were utilized to evaluate the model's monitoring performance. The impact of the imposed deviations on the process continuity was also evaluated using Hotelling's T^2 and Q residuals statistics control charts. The influence of the individual process variables was assessed by analyzing contribution plots at specific time points. Results show that the imposed disturbances were all detected in both control charts. Overall, the MSPC strategy was successfully developed and applied. Additionally, deviations not associated with the imposed changes were detected, mainly in the granulator barrel temperature control. Copyright © 2017 Elsevier B.V. All rights reserved.
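
    Hotelling's T^2, the statistic behind one of the control charts above, can be illustrated with a minimal sketch. It is a deliberate simplification: the statistic is computed directly on two raw variables, whereas the study computed it on principal component scores of thirty-five variables, and no control limit is derived. The reference data and function names are hypothetical.

```python
def mean_and_cov(rows):
    """Sample mean and 2x2 sample covariance of (x, y) rows."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def hotelling_t2(obs, mean, cov):
    """T^2 = (x - mean)^T S^-1 (x - mean), using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = obs[0] - mean[0], obs[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical reference data standing in for normal operating conditions.
noc = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
mean, cov = mean_and_cov(noc)

# T^2 of a new observation measured against the reference data.
t2_new = hotelling_t2((2.0, 0.0), mean, cov)
```

    In a control chart, each new observation's T^2 is plotted against a limit estimated from the normal operating data, and points above the limit are flagged for inspection.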

  15. Development of infill drilling recovery models for carbonates reservoirs using neural networks and multivariate statistical as a novel method

    International Nuclear Information System (INIS)

    Soto, R; Wu, Ch. H; Bubela, A M

    1999-01-01

    This work introduces a novel methodology to improve reservoir characterization models. In this methodology we integrated multivariate statistical analyses and neural network models for forecasting the infill drilling ultimate oil recovery from reservoirs in San Andres and Clearfork carbonate formations in west Texas. Development of the oil recovery forecast models helps us to understand the relative importance of dominant reservoir characteristics and operational variables, reproduce recoveries for units included in the database, forecast recoveries for possible new units in similar geological settings, and make operational (infill drilling) decisions. The variety of applications demands the creation of multiple recovery forecast models. We have developed intelligent software (Soto, 1998), oilfield intelligence (OI), as an engineering tool to improve the characterization of oil and gas reservoirs. OI integrates neural networks and multivariate statistical analysis. It is composed of five main subsystems: data input, preprocessing, architecture design, graphic design, and inference engine modules. One of the challenges in this research was to identify the dominant and the optimum number of independent variables. The variables include porosity, permeability, water saturation, depth, area, net thickness, gross thickness, formation volume factor, pressure, viscosity, API gravity, number of wells in initial water flooding, number of wells for primary recovery, number of infill wells over the initial water flooding, PRUR, IWUR, and IDUR. Multivariate principal component analysis is used to identify the dominant and the optimum number of independent variables. We compared the results from neural network models with the non-parametric approach. The advantage of the non-parametric regression is that it is easy to use. The disadvantage is that it retains a large variance of forecast results for a particular data set. We also used neural network concepts to develop recovery

  16. Multivariate survival analysis and competing risks

    CERN Document Server

    Crowder, Martin J

    2012-01-01

    Multivariate Survival Analysis and Competing Risks introduces univariate survival analysis and extends it to the multivariate case. It covers competing risks and counting processes and provides many real-world examples, exercises, and R code. The text discusses survival data, survival distributions, frailty models, parametric methods, multivariate data and distributions, copulas, continuous failure, parametric likelihood inference, and non- and semi-parametric methods. There are many books covering survival analysis, but very few that cover the multivariate case in any depth. Written for a graduate-level audience in statistics/biostatistics, this book includes practical exercises and R code for the examples. The author is renowned for his clear writing style, and this book continues that trend. It is an excellent reference for graduate students and researchers looking for grounding in this burgeoning field of research.

  17. The value of multivariate model sophistication

    DEFF Research Database (Denmark)

    Rombouts, Jeroen; Stentoft, Lars; Violante, Francesco

    2014-01-01

    We assess the predictive accuracies of a large number of multivariate volatility models in terms of pricing options on the Dow Jones Industrial Average. We measure the value of model sophistication in terms of dollar losses by considering a set of 444 multivariate models that differ in their spec... In addition to investigating the value of model sophistication in terms of dollar losses directly, we also use the model confidence set approach to statistically infer the set of models that delivers the best pricing performances.

  18. Damage detection of engine bladed-disks using multivariate statistical analysis

    Science.gov (United States)

    Fang, X.; Tang, J.

    2006-03-01

    The timely detection of damage in aero-engine bladed-disks is an extremely important and challenging research topic. Bladed-disks have high modal density and, in particular, their vibration responses are subject to significant uncertainties due to manufacturing tolerance (blade-to-blade difference or mistuning), operating condition change and sensor noise. In this study, we present a new methodology for the on-line damage detection of engine bladed-disks using their vibratory responses during spin-up or spin-down operations, which can be measured by the blade-tip-timing sensing technique. We apply a principal component analysis (PCA)-based approach for data compression, feature extraction, and denoising. The non-model-based damage detection is achieved by analyzing the change between response features of the healthy structure and of the damaged one. We facilitate such comparison by incorporating Hotelling's T² statistic, which yields damage declaration with a given confidence level. The effectiveness of the method is demonstrated by case studies.

  19. Geochemistry of natural and anthropogenic fall-out (aerosol and precipitation) collected from the NW Mediterranean: two different multivariate statistical approaches

    International Nuclear Information System (INIS)

    Molinaroli, E.; Pistolato, M.; Rampazzo, G.; Guerzoni, S.

    1999-01-01

    The chemical characteristics of the mineral fractions of aerosol and precipitation collected in Sardinia (NW Mediterranean) are highlighted by means of two multivariate statistical approaches. Two different combinations of classification and statistical methods for geochemical data are presented. It is shown that the application of cluster analysis subsequent to Q-Factor analysis better distinguishes among Saharan dust, background pollution (Europe-Mediterranean) and local aerosol from various source regions (Sardinia). Conversely, the application of simple cluster analysis was able to distinguish only between aerosols and precipitation particles, without assigning the sources (local or distant) to the aerosol. This method also highlighted the fact that crust-enriched precipitation is similar to desert-derived aerosol. Major elements (Al, Na) and trace metal (Pb) turn out to be the most discriminating elements of the analysed data set. Independent use of mineralogical, granulometric and meteorological data confirmed the results derived from the statistical methods employed. (Copyright (c) 1999 Elsevier Science B.V., Amsterdam. All rights reserved.)

  20. A comparison of linear and nonlinear statistical techniques in performance attribution.

    Science.gov (United States)

    Chan, N H; Genovese, C R

    2001-01-01

    Performance attribution is usually conducted under the linear framework of multifactor models. Although commonly used by practitioners in finance, linear multifactor models are known to be less than satisfactory in many situations. After a brief survey of nonlinear methods, nonlinear statistical techniques are applied to performance attribution of a portfolio constructed from a fixed universe of stocks using factors derived from some commonly used cross-sectional linear multifactor models. By rebalancing this portfolio monthly, the cumulative returns for procedures based on the standard linear multifactor model and three nonlinear techniques (model selection, additive models, and neural networks) are calculated and compared. It is found that the first two nonlinear techniques, especially in combination, outperform the standard linear model. The results in the neural-network case are inconclusive because of the great variety of possible models. Although these methods are more complicated and may require some tuning, toolboxes are developed and suggestions on calibration are proposed. This paper demonstrates the usefulness of modern nonlinear statistical techniques in performance attribution.

  1. Statistical optimisation techniques in fatigue signal editing problem

    International Nuclear Information System (INIS)

    Nopiah, Z. M.; Osman, M. H.; Baharin, N.; Abdullah, S.

    2015-01-01

    Success in fatigue signal editing is determined by the level of length reduction without compromising statistical constraints. A great reduction rate can be achieved by removing small amplitude cycles from the recorded signal. The long recorded signal sometimes renders the cycle-to-cycle editing process daunting. This has encouraged researchers to focus on the segment-based approach. This paper discusses joint application of the Running Damage Extraction (RDE) technique and single constrained Genetic Algorithm (GA) in fatigue signal editing optimisation. In the first section, the RDE technique is used to restructure and summarise the fatigue strain. This technique combines the overlapping window and fatigue strain-life models. It is designed to identify and isolate the fatigue events that exist in the variable amplitude strain data into different segments whereby the retention of statistical parameters and the vibration energy are considered. In the second section, the fatigue data editing problem is formulated as a constrained single optimisation problem that can be solved using the GA method. The GA produces the shortest edited fatigue signal by selecting appropriate segments from a pool of labelling segments. Challenges arise due to constraints on the segment selection by deviation level over three signal properties, namely cumulative fatigue damage, root mean square and kurtosis values. Experimental results over several case studies show that the idea of solving fatigue signal editing within a framework of optimisation is effective and automatic, and that the GA is robust for constrained segment selection.

  2. Statistical optimisation techniques in fatigue signal editing problem

    Energy Technology Data Exchange (ETDEWEB)

    Nopiah, Z. M.; Osman, M. H. [Fundamental Engineering Studies Unit Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM (Malaysia); Baharin, N.; Abdullah, S. [Department of Mechanical and Materials Engineering Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM (Malaysia)

    2015-02-03

    Success in fatigue signal editing is determined by the level of length reduction without compromising statistical constraints. A great reduction rate can be achieved by removing small amplitude cycles from the recorded signal. The long recorded signal sometimes renders the cycle-to-cycle editing process daunting. This has encouraged researchers to focus on the segment-based approach. This paper discusses joint application of the Running Damage Extraction (RDE) technique and single constrained Genetic Algorithm (GA) in fatigue signal editing optimisation. In the first section, the RDE technique is used to restructure and summarise the fatigue strain. This technique combines the overlapping window and fatigue strain-life models. It is designed to identify and isolate the fatigue events that exist in the variable amplitude strain data into different segments whereby the retention of statistical parameters and the vibration energy are considered. In the second section, the fatigue data editing problem is formulated as a constrained single optimisation problem that can be solved using the GA method. The GA produces the shortest edited fatigue signal by selecting appropriate segments from a pool of labelling segments. Challenges arise due to constraints on the segment selection by deviation level over three signal properties, namely cumulative fatigue damage, root mean square and kurtosis values. Experimental results over several case studies show that the idea of solving fatigue signal editing within a framework of optimisation is effective and automatic, and that the GA is robust for constrained segment selection.

  3. The interprocess NIR sampling as an alternative approach to multivariate statistical process control for identifying sources of product-quality variability.

    Science.gov (United States)

    Marković, Snežana; Kerč, Janez; Horvat, Matej

    2017-03-01

    We present a new approach to identifying sources of variability within a manufacturing process by NIR measurements of samples of intermediate material after each consecutive unit operation (interprocess NIR sampling technique). In addition, we summarize the development of a multivariate statistical process control (MSPC) model for the production of an enteric-coated pellet product of the proton-pump inhibitor class. By developing provisional NIR calibration models, the identification of critical process points yields comparable results to the established MSPC modeling procedure. Both approaches are shown to lead to the same conclusion, identifying parameters of extrusion/spheronization and characteristics of lactose that have the greatest influence on the end-product's enteric coating performance. The proposed approach enables quicker and easier identification of variability sources during the manufacturing process, especially in cases when historical process data are not readily available. In the presented case the changes of lactose characteristics influence the performance of the extrusion/spheronization process step. The pellet cores produced by using one (considered as less suitable) lactose source were on average larger and more fragile, leading to consequent breakage of the cores during subsequent fluid bed operations. These results were confirmed by additional experimental analyses illuminating the underlying mechanism of fracture of oblong pellets during the pellet coating process, leading to compromised film coating.

  4. Confidence limits for contribution plots in multivariate statistical process control using bootstrap estimates.

    Science.gov (United States)

    Babamoradi, Hamid; van den Berg, Frans; Rinnan, Åsmund

    2016-02-18

    In Multivariate Statistical Process Control, when a fault is expected or detected in the process, contribution plots are essential for operators and optimization engineers in identifying those process variables that were affected by or might be the cause of the fault. The traditional way of interpreting a contribution plot is to examine the largest contributing process variables as the most probable faulty ones. This might result in false readings purely due to the differences in natural variation, measurement uncertainties, etc. It is more reasonable to compare variable contributions for new process runs with historical results achieved under Normal Operating Conditions, where confidence limits for contribution plots estimated from training data are used to judge new production runs. Asymptotic methods cannot provide confidence limits for contribution plots, leaving re-sampling methods as the only option. We suggest bootstrap re-sampling to build confidence limits for all contribution plots in online PCA-based MSPC. The new strategy to estimate CLs is compared to the previously reported CLs for contribution plots. An industrial batch process dataset was used to illustrate the concepts. Copyright © 2016 Elsevier B.V. All rights reserved.
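    The idea of judging a new run's contribution against a resampling-based limit can be sketched as follows. This is a generic percentile bootstrap under stated assumptions, not the procedure of the paper: the contribution values are hypothetical, a single variable is treated in isolation, and the limit is taken as the mean of the bootstrapped 95th percentiles.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def bootstrap_limit(noc_contributions, q=95.0, n_boot=1000, seed=0):
    """Bootstrap estimate of the q-th percentile limit for one variable.

    Resamples the NOC contributions with replacement, computes the q-th
    percentile of each resample, and returns the mean of those percentiles.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(noc_contributions)
    estimates = []
    for _ in range(n_boot):
        resample = [noc_contributions[rng.randrange(n)] for _ in range(n)]
        estimates.append(percentile(resample, q))
    return sum(estimates) / n_boot


# Hypothetical contributions of one process variable over 12 NOC runs.
noc = [0.2, 0.5, 0.1, 0.4, 0.3, 0.6, 0.2, 0.7, 0.3, 0.5, 0.4, 0.9]
limit = bootstrap_limit(noc)

# A new run is flagged for this variable only if it exceeds the limit.
new_run_contribution = 1.4
flagged = new_run_contribution > limit
```

    Comparing each variable with its own limit, rather than picking the largest bar in the plot, is what guards against variables that simply have large natural variation.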

  5. An Introduction to Applied Multivariate Analysis

    CERN Document Server

    Raykov, Tenko

    2008-01-01

    Focuses on the core multivariate statistics topics that are of fundamental relevance for understanding the subject. This book emphasizes the topics that are critical to those in the behavioral, social, and educational sciences.

  6. Determination of the archaeological origin of ceramic fragments characterized by neutron activation analysis, by means of the application of multivariable statistical analysis techniques

    International Nuclear Information System (INIS)

    Almazan T, M. G.; Jimenez R, M.; Monroy G, F.; Tenorio, D.; Rodriguez G, N. L.

    2009-01-01

    The elemental composition of archaeological ceramic fragments obtained during the explorations in San Miguel Ixtapan, Mexico State, was determined by the neutron activation analysis technique. The samples were irradiated in the TRIGA Mark III research reactor with a neutron flux of 1·10¹³ n·cm⁻²·s⁻¹. The irradiation time was 2 hours. Before acquisition of the gamma-ray spectrum, the samples were allowed to decay for 12 to 14 days. The analyzed elements were: Nd, Ce, Lu, Eu, Yb, Pa(Th), Tb, La, Cr, Hf, Sc, Co, Fe, Cs, Rb. The statistical treatment of the data, consisting of cluster analysis and principal component analysis, made it possible to identify three different origins of the archaeological ceramics, designated as local, foreign and regional. (Author)

  7. Line identification studies using traditional techniques and wavelength coincidence statistics

    International Nuclear Information System (INIS)

    Cowley, C.R.; Adelman, S.J.

    1990-01-01

    Traditional line identification techniques result in the assignment of individual lines to an atomic or ionic species. These methods may be supplemented by wavelength coincidence statistics (WCS). The strengths and weaknesses of these methods are discussed using spectra of a number of normal and peculiar B and A stars that have been studied independently by both methods. The present results support the overall findings of some earlier studies. WCS would be most useful in a first survey, before traditional methods have been applied. WCS can quickly make a global search for all species and in this way may enable identification of an unexpected spectrum that could easily be omitted entirely from a traditional study. This is illustrated by O I. WCS is subject to the well-known weaknesses of any statistical technique; for example, a predictable number of spurious results is to be expected. The danger of small number statistics is illustrated. WCS is at its best relative to traditional methods in finding a line-rich atomic species that is only weakly present in a complicated stellar spectrum.

  8. Compositional differences among Chinese soy sauce types studied by (13)C NMR spectroscopy coupled with multivariate statistical analysis.

    Science.gov (United States)

    Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili

    2016-09-01

    Soy sauce, a well-known seasoning all over the world, especially in Asia, is available on the global market in a wide range of types based on its purpose and processing methods. Its composition varies with the fermentation processes and the addition of additives, preservatives and flavor enhancers. Comprehensive (1)H NMR based study of the metabonomic variations of soy sauce, to differentiate among the types available on the global market, has been limited by the complexity of the mixture. In the present study, (13)C NMR spectroscopy coupled with multivariate statistical data analysis, namely principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using (13)C NMR spectroscopy, whereas they were otherwise difficult to assign and quantify because of serious signal overlaps in (1)H NMR spectra. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be linked directly to the addition of caramel. Similarly, the significantly higher level of glutamate in super light as compared to super dark and mushroom flavored soy sauce may come from the addition of monosodium glutamate. The study highlights the potential of (13)C NMR based metabonomics coupled with multivariate statistical data analysis in differentiating between the types of soy sauce on the basis of level of additives, raw materials and fermentation procedures. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Water Quality Assessment and Pollution Source Identification of the Eastern Poyang Lake Basin Using Multivariate Statistical Methods

    Directory of Open Access Journals (Sweden)

    Weili Duan

    2016-01-01

    Multivariate statistical methods, including cluster analysis (CA), discriminant analysis (DA) and principal component analysis/factor analysis (PCA/FA), were applied to explore surface water quality datasets covering 14 parameters at 28 sites of the Eastern Poyang Lake Basin, Jiangxi Province of China, from January 2012 to April 2015, to characterize spatiotemporal variation in pollution and to identify potential pollution sources. The 28 sampling stations were divided into two periods (wet season and dry season) and two regions (low pollution and high pollution), respectively, using the hierarchical CA method. Four parameters (temperature, pH, ammonia-nitrogen (NH4-N), and total nitrogen (TN)) were identified using DA to distinguish temporal groups with close to 97.86% correct assignations. Again using DA, five parameters (pH, chemical oxygen demand (COD), TN, fluoride (F), and sulphide (S)) led to 93.75% correct assignations for distinguishing spatial groups. Five potential pollution sources, including nutrient pollution, oxygen-consuming organic pollution, fluorine chemical pollution, heavy metal pollution and natural pollution, were identified using PCA/FA techniques for both the low pollution region and the high pollution region. Heavy metals (copper (Cu), chromium (Cr) and zinc (Zn)), fluoride and sulfide are of particular concern in the study region because of the many open-pit copper mines, such as the Dexing Copper Mine. Results obtained from this study offer a reasonable classification scheme for low-cost monitoring networks. The results also inform understanding of spatio-temporal variation in water quality as these topics relate to water resources management.

  10. Detecting relationships between the interannual variability in climate records and ecological time series using a multivariate statistical approach - four case studies for the North Sea region

    Energy Technology Data Exchange (ETDEWEB)

    Heyen, H. [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Gewaesserphysik

    1998-12-31

    A multivariate statistical approach is presented that allows a systematic search for relationships between the interannual variability in climate records and ecological time series. Statistical models are built between climatological predictor fields and the variables of interest. Relationships are sought on different temporal scales and for different seasons and time lags. The possibilities and limitations of this approach are discussed in four case studies dealing with salinity in the German Bight, abundance of zooplankton at Helgoland Roads, macrofauna communities off Norderney and the arrival of migratory birds on Helgoland. (orig.) [Deutsch, translated] A multivariate statistical model is presented that allows a systematic search for potential relationships between variability in climate and ecological time series. Four application examples are used to examine the influence of climate on salinity in the German Bight, zooplankton off Helgoland, macrofauna off Norderney, and the arrival of migratory birds on Helgoland. (orig.)

  11. Multivariate Birkhoff interpolation

    CERN Document Server

    Lorentz, Rudolph A

    1992-01-01

    The subject of this book is Lagrange, Hermite and Birkhoff (lacunary Hermite) interpolation by multivariate algebraic polynomials. It unifies and extends a new algorithmic approach to this subject which was introduced and developed by G.G. Lorentz and the author. One particularly interesting feature of this algorithmic approach is that it obviates the necessity of finding a formula for the Vandermonde determinant of a multivariate interpolation in order to determine its regularity (which formulas are practically unknown anyways) by determining the regularity through simple geometric manipulations in the Euclidean space. Although interpolation is a classical problem, it is surprising how little is known about its basic properties in the multivariate case. The book therefore starts by exploring its fundamental properties and its limitations. The main part of the book is devoted to a complete and detailed elaboration of the new technique. A chapter with an extensive selection of finite elements follows as well a...

  12. Statistical and Economic Techniques for Site-specific Nematode Management.

    Science.gov (United States)

    Liu, Zheng; Griffin, Terry; Kirkpatrick, Terrence L

    2014-03-01

    Recent advances in precision agriculture technologies and spatial statistics allow realistic, site-specific estimation of nematode damage to field crops and provide a platform for the site-specific delivery of nematicides within individual fields. This paper reviews the spatial statistical techniques that model correlations among neighboring observations and develops a spatial economic analysis to determine the potential of site-specific nematicide application. The spatial econometric methodology applied in the context of site-specific crop yield response contributes to closing the gap between data analysis and realistic site-specific nematicide recommendations and helps to provide a practical method of site-specifically controlling nematodes.

  13. On Multivariate Methods in Robust Econometrics

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2012-01-01

    Vol. 21, No. 1 (2012), pp. 69-82 ISSN 1210-0455 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords: least weighted squares * heteroscedasticity * multivariate statistics * model selection * diagnostics * computational aspects Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.561, year: 2012 http://www.vse.cz/pep/abstrakt.php?IDcl=411

  14. Designing a risk-based surveillance program for Mycobacterium avium ssp. paratuberculosis in Norwegian dairy herds using multivariate statistical process control analysis.

    Science.gov (United States)

    Whist, A C; Liland, K H; Jonsson, M E; Sæbø, S; Sviland, S; Østerås, O; Norström, M; Hopp, P

    2014-11-01

    Surveillance programs for animal diseases are critical to early disease detection and risk estimation and to documenting a population's disease status at a given time. The aim of this study was to describe a risk-based surveillance program for detecting Mycobacterium avium ssp. paratuberculosis (MAP) infection in Norwegian dairy cattle. The included risk factors for detecting MAP were purchase of cattle, combined cattle and goat farming, and location of the cattle farm in counties containing goats with MAP. The risk indicators included production data [culling of animals >3 yr of age, carcass conformation of animals >3 yr of age, milk production decrease in older lactating cows (lactations 3, 4, and 5)], and clinical data (diarrhea, enteritis, or both, in animals >3 yr of age). Except for combined cattle and goat farming and cattle farm location, all data were collected at the cow level and summarized at the herd level. Predefined risk factors and risk indicators were extracted from different national databases and combined in a multivariate statistical process control to obtain a risk assessment for each herd. The ordinary Hotelling's T² statistic was applied as a multivariate, standardized measure of difference between the current observed state and the average state of the risk factors for a given herd. To make the analysis more robust and adapt it to the slowly developing nature of MAP, monthly risk calculations were based on data accumulated during a 24-mo period. Monitoring of these variables was performed to identify outliers that may indicate deviance in one or more of the underlying processes. The highest-ranked herds were scattered all over Norway and clustered in high-density dairy cattle farm areas. The resulting rankings of herds are being used in the national surveillance program for MAP in 2014 to increase the sensitivity of the ongoing surveillance program in which 5 fecal samples for bacteriological examination are collected from 25 dairy herds
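    As a rough illustration of how Hotelling's T² can rank units by their distance from an average state, the sketch below scores two herds against a reference sample. The two indicators, all numbers and the herd labels are hypothetical, and restricting the example to two variables (so the covariance matrix can be inverted in closed form) is a simplification, not the study's design.

```python
def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_2x2(rows, mean):
    """Sample covariance of two variables, returned as (sxx, sxy, syy)."""
    n = len(rows)
    dx = [r[0] - mean[0] for r in rows]
    dy = [r[1] - mean[1] for r in rows]
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    return sxx, sxy, syy


def hotelling_t2(x, mean, cov):
    """T² = (x - mean)' S^-1 (x - mean), using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    a, b = x[0] - mean[0], x[1] - mean[1]
    return (syy * a * a - 2.0 * sxy * a * b + sxx * b * b) / det


# Hypothetical reference observations: (culling rate, milk yield decrease).
reference = [(0.10, 1.0), (0.12, 1.2), (0.09, 0.8), (0.11, 1.1),
             (0.10, 0.9), (0.13, 1.3), (0.08, 0.9), (0.11, 1.0)]
mean = mean_vector(reference)
cov = covariance_2x2(reference, mean)

# Hypothetical current states of two herds; a higher T² means a higher rank.
herds = {"A": (0.11, 1.0), "B": (0.20, 2.5)}
scores = {name: hotelling_t2(x, mean, cov) for name, x in herds.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

    Because T² accounts for the covariance between indicators, a herd can rank high either by being extreme on one indicator or by showing an unusual combination of otherwise moderate values.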

  15. Multivariate Analysis and Prediction of Dioxin-Furan ...

    Science.gov (United States)

    Peer Review Draft of Regional Methods Initiative Final Report. Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin-contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin-contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of FAMEs in these freshwater fish covaries with and is predictive of the WHO TE

  16. Multivariable control in nuclear power stations

    International Nuclear Information System (INIS)

    Parent, M.; McMorran, P.D.

    1982-11-01

    Multivariable methods have the potential to improve the control of large systems such as nuclear power stations. Linear-quadratic optimal control is a multivariable method based on the minimization of a cost function. A related technique leads to the Kalman filter for estimation of plant state from noisy measurements. A design program for optimal control and Kalman filtering has been developed as part of a computer-aided design package for multivariable control systems. The method is demonstrated on a model of a nuclear steam generator, and simulated results are presented.
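    The cost-minimization idea behind linear-quadratic control can be shown on the smallest possible case. The sketch below assumes a hypothetical scalar discrete-time plant, not the steam generator model of the report, and obtains the feedback gain by iterating the Riccati recursion to a fixed point. The steady-state Kalman filter gain follows from a Riccati equation of the same form, which is not shown here.

```python
def scalar_lqr(a, b, q, r, iterations=1000):
    """Steady-state LQR for x[k+1] = a*x[k] + b*u[k], cost sum(q*x^2 + r*u^2).

    Iterates the discrete-time Riccati recursion and returns (P, K),
    where u[k] = -K * x[k] is the resulting state feedback.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


def simulate(a, b, k, x0, steps):
    """Closed-loop state trajectory under the feedback u = -k * x."""
    xs = [x0]
    for _ in range(steps):
        xs.append((a - b * k) * xs[-1])
    return xs


# Hypothetical open-loop unstable plant (a > 1) with equal state and input weights.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
P, K = scalar_lqr(a, b, q, r)
trajectory = simulate(a, b, K, x0=1.0, steps=20)
```

    With these values the closed-loop pole a - b*K lies inside the unit circle, so the state decays even though the open-loop plant is unstable. Raising r relative to q penalizes control effort more heavily and yields a smaller gain.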

  17. Statistical methods of evaluating and comparing imaging techniques

    International Nuclear Information System (INIS)

    Freedman, L.S.

    1987-01-01

    Over the past 20 years several new methods of generating images of internal organs and the anatomy of the body have been developed and used to enhance the accuracy of diagnosis and treatment. These include ultrasonic scanning, radioisotope scanning, computerised X-ray tomography (CT) and magnetic resonance imaging (MRI). The new techniques have made a considerable impact on radiological practice in hospital departments, not least on the investigational process for patients suspected or known to have malignant disease. As a consequence of the increased range of imaging techniques now available, there has developed a need to evaluate and compare their usefulness. Over the past 10 years formal studies of the application of imaging technology have been conducted and many reports have appeared in the literature. These studies cover a range of clinical situations. Likewise, the methodologies employed for evaluating and comparing the techniques in question have differed widely. While not attempting an exhaustive review of the clinical studies which have been reported, this paper aims to examine the statistical designs and analyses which have been used. First a brief review of the different types of study is given. Examples of each type are then chosen to illustrate statistical issues related to their design and analysis. In the final sections it is argued that a form of classification for these different types of study might be helpful in clarifying relationships between them and bringing a perspective to the field. A classification based upon a limited analogy with clinical trials is suggested

  18. Statistics of extremes theory and applications

    CERN Document Server

    Beirlant, Jan; Segers, Johan; Teugels, Jozef; De Waal, Daniel; Ferro, Chris

    2006-01-01

    Research in the statistical analysis of extreme values has flourished over the past decade: new probability models, inference and data analysis techniques have been introduced; and new application areas have been explored. Statistics of Extremes comprehensively covers a wide range of models and application areas, including risk and insurance: a major area of interest and relevance to extreme value theory. Case studies are introduced providing a good balance of theory and application of each model discussed, incorporating many illustrated examples and plots of data. The last part of the book covers some interesting advanced topics, including time series, regression, multivariate and Bayesian modelling of extremes, the use of which has huge potential.

  19. Combining Statistical Methodologies in Water Quality Monitoring in a Hydrological Basin - Space and Time Approaches

    OpenAIRE

    Costa, Marco; A. Manuela Gonçalves

    2012-01-01

    This work discusses statistical approaches that combine multivariate statistical techniques and time series analysis in order to describe and model spatial patterns and temporal evolution, based on hydrological series of water quality variables recorded in time and space. These approaches are illustrated with a data set collected in the River Ave hydrological basin, located in the northwest region of Portugal.

  20. The analysis of multivariate group differences using common principal components

    NARCIS (Netherlands)

    Bechger, T.M.; Blanca, M.J.; Maris, G.

    2014-01-01

    Although it is simple to determine whether multivariate group differences are statistically significant or not, such differences are often difficult to interpret. This article is about common principal components analysis as a tool for the exploratory investigation of multivariate group differences

  1. Multivariate Non-Symmetric Stochastic Models for Spatial Dependence Models

    Science.gov (United States)

    Haslauer, C. P.; Bárdossy, A.

    2017-12-01

    A copula-based multivariate framework allows more flexibility in describing different kinds of dependence than is possible with models relying on the confining assumption of symmetric Gaussian models: different quantiles can be modelled with a different degree of dependence, and it will be demonstrated how this can be expected given process understanding. Maximum-likelihood-based multivariate quantitative parameter estimation yields stable and reliable results; not only are improved results obtained in cross-validation-based measures of uncertainty, but also a more realistic spatial structure of uncertainty compared to second-order models of dependence. As much information as is available is included in the parameter estimation: incorporation of censored measurements (e.g., below detection limit, or ones that are above the sensitive range of the measurement device) yields more realistic spatial models; the proportion of true zeros can be jointly estimated with and distinguished from censored measurements, which allows estimates of the age of a contaminant in the system; and secondary information (categorical and on the rational scale) has been used to improve the estimation of the primary variable. These copula-based multivariate statistical techniques are demonstrated based on hydraulic conductivity observations at the Borden (Canada) site, the MADE site (USA), and a large regional groundwater quality data set in south-west Germany. Fields of spatially distributed K were simulated with identical marginal simulation and identical second-order spatial moments, yet showed substantially differing solute transport characteristics when numerical tracer tests were performed. A statistical methodology is shown that allows the delineation of a boundary layer separating homogeneous parts of a spatial data set. The effects of this boundary layer (macro structure) and the spatial dependence of K (micro structure) on solute transport behaviour are shown.

  2. Application of multivariate statistical methods to classify archaeological pottery from Tel-Alramad site, Syria, based on x-ray fluorescence analysis

    International Nuclear Information System (INIS)

    Bakraji, E. H.

    2007-01-01

    Radioisotopic x-ray fluorescence (XRF) analysis has been utilized to determine the elemental composition of 55 archaeological pottery samples by the determination of 17 chemical elements. Fifty-four of them came from the Tel-Alramad Site in Katana town, near Damascus city, Syria, and one sample came from Brazil. The XRF results have been processed using two multivariate statistical methods, cluster and factor analysis, in order to determine similarities and correlation between the selected samples based on their elemental composition. The methodology successfully separates the samples where four distinct chemical groups were identified. (author)

  3. IR spectroscopy together with multivariate data analysis as a process analytical tool for in-line monitoring of crystallization process and solid-state analysis of crystalline product

    DEFF Research Database (Denmark)

    Pöllänen, Kati; Häkkinen, Antti; Reinikainen, Satu-Pia

    2005-01-01

    Crystalline product should exist in optimal polymorphic form. A robust and reliable method for polymorph characterization is of great importance. In this work, infra red (IR) spectroscopy is applied for monitoring of the crystallization process in situ. The results show that attenuated total reflection Fourier transform infra red (ATR-FTIR) spectroscopy provides valuable information on the process, which can be utilized for more controlled crystallization processes. Diffuse reflectance Fourier transform infra red (DRIFT-IR) is applied for polymorphic characterization of crystalline product using X-ray powder diffraction (XRPD) as a reference technique. In order to fully utilize DRIFT, the application of multivariate techniques is needed, e.g., multivariate statistical process control (MSPC), principal component analysis (PCA) and partial least squares (PLS). The results demonstrate that multivariate...

  4. Power Estimation in Multivariate Analysis of Variance

    Directory of Open Access Journals (Sweden)

    Jean François Allaire

    2007-09-01

    Power is often overlooked in designing multivariate studies for the simple reason that it is believed to be too complicated. In this paper, it is shown that power estimation in multivariate analysis of variance (MANOVA) can be approximated using an F distribution for the three popular statistics (Hotelling-Lawley trace, Pillai-Bartlett trace, Wilks' likelihood ratio). Consequently, the same procedure as in any statistical test can be used: computation of the critical F value, computation of the noncentrality parameter (as a function of the effect size) and finally estimation of power using a noncentral F distribution. Various numerical examples are provided which help to understand and to apply the method. Problems related to post hoc power estimation are discussed.
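
    The three-step procedure in this abstract (critical F value, noncentrality parameter, power from a noncentral F distribution) can be sketched with the standard library alone. This is an illustrative sketch, not the paper's code. The degrees of freedom are taken as given, because the abstract does not reproduce the statistic-specific formulas that map each MANOVA statistic onto an F approximation. The helper noncentrality_from_f2 uses Cohen's convention lambda = f^2 * (df1 + df2 + 1), which is an assumption here, and all function names are hypothetical.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def ncf_cdf(f, df1, df2, lam):
    """CDF of the noncentral F distribution as a Poisson mixture of betas."""
    x = df1 * f / (df1 * f + df2)
    if lam <= 0.0:
        return betainc(df1 / 2.0, df2 / 2.0, x)   # central F
    half = lam / 2.0
    total, mass, j = 0.0, 0.0, 0
    while mass < 1.0 - 1e-12 and j < 2000:
        w = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        total += w * betainc(df1 / 2.0 + j, df2 / 2.0, x)
        mass += w
        j += 1
    return total


def critical_f(df1, df2, alpha=0.05):
    """Step 1: critical value of the central F distribution, by bisection."""
    lo, hi = 0.0, 1.0
    while ncf_cdf(hi, df1, df2, 0.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ncf_cdf(mid, df1, df2, 0.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def noncentrality_from_f2(f2, df1, df2):
    """Step 2 (assumed convention): lambda = f^2 * (df1 + df2 + 1)."""
    return f2 * (df1 + df2 + 1)


def f_test_power(df1, df2, lam, alpha=0.05):
    """Step 3: probability that a noncentral F exceeds the critical value."""
    return 1.0 - ncf_cdf(critical_f(df1, df2, alpha), df1, df2, lam)
```

    With a noncentrality parameter of zero the computed power reduces to the significance level, which is a convenient sanity check on any implementation of this procedure.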

  5. Elemental characterization of herbal medicines used in Ghana by instrumental neutron activation analysis and atomic absorption spectrometry and multivariate statistical analysis

    International Nuclear Information System (INIS)

    Ayivor, J.E.; Nyarko, B.J.B.; Dampare, S.B.; Okine, L.K.

    2010-01-01

    k0 instrumental neutron activation analysis and atomic absorption spectrometry were applied to determine multiple elements in thirteen Ghanaian herbal medicines used for the management of various diseases. Concentrations of Al, Cu, Mg, Mn and Na were determined. As, Br, K, Cl, and Na were determined by short and medium irradiations at a thermal neutron flux of 5x10ncm -2 s -1 . Fe, Cr, Pb, Co, Ni, Sn, Ca, Ba, Li and Sb were determined using atomic absorption spectrometry. Ba, Cu, Li and V were present at trace levels whereas Al, Cl, Na and Ca were present at major levels. K, Br, Mg, Mn, Co, Ni, Fe and Sb were also present at minor levels. The precision and accuracy of the method using real samples and standard reference materials were within ±10% of the reported value. Multivariate analytical techniques, such as cluster analysis and principal component analysis (PCA)/factor analysis (FA), have been applied to evaluate the chemical variations in the herbal medicine dataset. All 13 samples may be grouped into two statistically significant clusters, reflecting the different chemical compositions. The concentrations of elements were within the recommended daily allowances or maximum permissible levels, posing no adverse effects on human health.

  6. Correlating phospholipid fatty acids (PLFA) in a landfill leachate polluted aquifer with biogeochemical factors by multivariate statistical methods

    DEFF Research Database (Denmark)

    Ludvigsen, Liselotte; Albrechtsen, Hans-Jørgen; Rootzén, Helle

    1997-01-01

    Different multivariate statistical analyses were applied to phospholipid fatty acids representing the biomass composition and to different biogeochemical parameters measured in 37 samples from a landfill contaminated aquifer at Grindsted Landfill (Denmark). Principal component analysis...... and correspondence analysis were used to identify groups of samples showing similar patterns with respect to biogeochemical variables and phospholipid fatty acid composition. The principal component analysis revealed that for the biogeochemical parameters the first principal component was linked to the pollution...... was used to allocate samples of phospholipid fatty acids into predefined classes. A large percentage of samples was classified correctly when discriminating samples into groups of dissolved organic carbon and specific conductivity, indicating that the biomass is highly influenced by the pollution...

  7. Categorical and nonparametric data analysis choosing the best statistical technique

    CERN Document Server

    Nussbaum, E Michael

    2014-01-01

    Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain

  8. Attitudes toward Advanced and Multivariate Statistics When Using Computers.

    Science.gov (United States)

    Kennedy, Robert L.; McCallister, Corliss Jean

    This study investigated the attitudes toward statistics of graduate students who studied advanced statistics in a course in which the focus of instruction was the use of a computer program in class. The use of the program made it possible to provide an individualized, self-paced, student-centered, and activity-based course. The three sections…

  9. Characterization of groundwater quality using water evaluation indices, multivariate statistics and geostatistics in central Bangladesh

    Directory of Open Access Journals (Sweden)

    Md. Bodrud-Doza

    2016-04-01

    This study investigates the groundwater quality in the Faridpur district of central Bangladesh based on 60 preselected sample points. Water evaluation indices and a number of statistical approaches, such as multivariate statistics and geostatistics, are applied to characterize water quality, which is a major factor controlling groundwater quality in terms of drinking purposes. The study reveals that EC, TDS, Ca2+, total As and Fe values of groundwater samples exceeded Bangladesh and international standards. The ground water quality index (GWQI) showed that about 47% of the samples were of good quality for drinking purposes. The heavy metal pollution index (HPI), degree of contamination (Cd) and heavy metal evaluation index (HEI) reveal that most of the samples belong to a low level of pollution. However, Cd provides a better alternative than the other indices. Principal component analysis (PCA) suggests that groundwater quality is mainly related to geogenic (rock–water interaction) and anthropogenic sources (agrogenic and domestic sewage) in the study area. The findings of cluster analysis (CA) and the correlation matrix (CM) are also consistent with the PCA results. The spatial distributions of groundwater quality parameters are determined by geostatistical modeling. The exponential semivariogram model is validated as the best-fitted model for most of the index values. It is expected that the outcomes of the study will provide insights for decision makers taking proper measures for groundwater quality management in central Bangladesh.
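
    The exponential semivariogram model mentioned in this abstract can be sketched as follows. The parameterisation gamma(h) = nugget + partial_sill * (1 - exp(-h / range_param)) is one common convention (others scale the exponent by 3 to obtain a "practical range"), so it is an assumption here, as are the function names and the sample values; none of them come from the study.

```python
import math


def exponential_semivariogram(h, nugget, partial_sill, range_param):
    """Exponential model: rises from the nugget towards nugget + partial_sill."""
    if h <= 0.0:
        return 0.0          # the semivariance at zero separation is zero
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


def empirical_semivariance(values, coords, lag, tolerance):
    """Half the mean squared difference over pairs separated by about `lag`."""
    total, count = 0.0, 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if abs(math.dist(coords[i], coords[j]) - lag) <= tolerance:
                total += (values[i] - values[j]) ** 2
                count += 1
    return total / (2.0 * count) if count else float("nan")


# Hypothetical index values at four sampling points on a unit grid.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [10.0, 12.0, 11.0, 15.0]
gamma_lag1 = empirical_semivariance(values, coords, lag=1.0, tolerance=0.1)
```

    Fitting the model then amounts to choosing the nugget, partial sill and range so that the model curve tracks the empirical semivariances across lags, for example by least squares.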

  10. PyMVPA: A python toolbox for multivariate pattern analysis of fMRI data.

    Science.gov (United States)

    Hanke, Michael; Halchenko, Yaroslav O; Sederberg, Per B; Hanson, Stephen José; Haxby, James V; Pollmann, Stefan

    2009-01-01

    Decoding patterns of neural activity onto cognitive states is one of the central goals of functional brain imaging. Standard univariate fMRI analysis methods, which correlate cognitive and perceptual function with the blood oxygenation-level dependent (BOLD) signal, have proven successful in identifying anatomical regions based on signal increases during cognitive and perceptual tasks. Recently, researchers have begun to explore new multivariate techniques that have proven to be more flexible, more reliable, and more sensitive than standard univariate analysis. Drawing on the field of statistical learning theory, these new classifier-based analysis techniques possess explanatory power that could provide new insights into the functional properties of the brain. However, unlike the wealth of software packages for univariate analyses, there are few packages that facilitate multivariate pattern classification analyses of fMRI data. Here we introduce a Python-based, cross-platform, and open-source software toolbox, called PyMVPA, for the application of classifier-based analysis techniques to fMRI datasets. PyMVPA makes use of Python's ability to access libraries written in a large variety of programming languages and computing environments to interface with the wealth of existing machine learning packages. We present the framework in this paper and provide illustrative examples on its usage, features, and programmability.

  11. Multivariate analysis of structure and contribution per shares made by potential risk factors at malignant neoplasms in trachea, bronchial tubes and lung

    Directory of Open Access Journals (Sweden)

    G.T. Aydinov

    2017-03-01

    The article gives the results of a multivariate analysis of the structure and share contributions of potential risk factors for malignant neoplasms of the trachea, bronchial tubes and lung. The authors used specialized databases comprising personified records on oncologic diseases in Taganrog, Rostov region, over 1986-2015 (30,684 registered cases of malignant neoplasms, including 3,480 cases of trachea, bronchial tubes and lung cancer). In the analytical research we applied both multivariate statistical techniques (factor analysis and hierarchical cluster correlation analysis) and conventional techniques of epidemiologic analysis, including etiologic fraction (EF) calculation, as well as an original technique for assessing actual (epidemiologic) risk. Average long-term morbidity with trachea, bronchial tubes and lung cancer over 2011-2015 amounts to 46.64 per 100,000. Over the last 15 years a stable decreasing trend has formed, the average annual growth being −1.22 %. This localization holds 3rd place in the oncologic morbidity structure, its specific weight being 10.02 %. We determined the etiological fraction (EF) for smoking as a priority risk factor causing trachea, bronchial tubes and lung cancer; this fraction amounts to 76.19 % for people aged 40 and older, and to 81.99 % for those aged 60 and older. Application of multivariate statistical techniques (factor analysis and cluster correlation analysis) in this research enabled us to simplify the factor structure; namely, to highlight, interpret, quantitatively estimate the self-descriptiveness of, and rank four group (latent) potential risk factors causing lung cancer.
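
    The etiologic fraction is commonly computed from the relative risk (RR) as EF = (RR - 1) / RR * 100 %. The abstract does not state which formula or which relative risks the authors used, so the sketch below assumes this standard form, and the relative risk of 4.2 is a purely illustrative input.

```python
def etiologic_fraction(relative_risk):
    """Share of cases among the exposed attributable to the exposure, in percent.

    Assumes the standard form EF = (RR - 1) / RR * 100.
    """
    if relative_risk <= 0.0:
        raise ValueError("relative risk must be positive")
    return (relative_risk - 1.0) / relative_risk * 100.0


# Illustrative value only; the abstract does not report relative risks.
ef_example = etiologic_fraction(4.2)
```

    Under this formula an EF above 50 % corresponds to a relative risk above 2, which gives a quick way to read the magnitude of fractions reported in this form.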

  12. Multivariable Feedback Control of Nuclear Reactors

    Directory of Open Access Journals (Sweden)

    Rune Moen

    1982-07-01

    Multivariable feedback control has been adapted for optimal control of the spatial power distribution in nuclear reactor cores. Two design techniques, based on the theory of automatic control, were developed: the State Variable Feedback (SVF) is an application of the linear optimal control theory, and the Multivariable Frequency Response (MFR) is based on a generalization of the traditional frequency response approach to control system design.

  13. Introduction to multivariate discrimination

    Science.gov (United States)

    Kégl, Balázs

    2013-07-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practicing experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  14. Introduction to multivariate discrimination

    International Nuclear Information System (INIS)

    Kegl, B.

    2013-01-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practicing experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyper-parameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  15. An assessment on the use of bivariate, multivariate and soft computing techniques for collapse susceptibility in GIS environ

    Science.gov (United States)

    Yilmaz, Işik; Marschalko, Marian; Bednarik, Martin

    2013-04-01

    The paper presented herein compares and discusses the use of bivariate, multivariate and soft computing techniques for collapse susceptibility modelling. Conditional probability (CP), logistic regression (LR) and artificial neural network (ANN) models, representing the bivariate, multivariate and soft computing techniques respectively, were used in GIS-based collapse susceptibility mapping in an area of the Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) as a measure of vegetation cover, and distance from roads and settlements, were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the models, and they were then compared by means of their validations. Although the Area Under Curve (AUC) values obtained from the three models showed that the map obtained from the soft computing (ANN) model appears more accurate than those from the other models, the accuracies of all three models can be considered relatively similar. The results also showed that conditional probability is an essential method in the preparation of collapse susceptibility maps and is highly compatible with GIS operating features.
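
    The AUC comparison described in this abstract rests on a quantity that can be computed directly from model scores: the probability that a randomly chosen collapse location receives a higher susceptibility score than a randomly chosen non-collapse location. The sketch below uses that rank-based definition. The function name and the scores are hypothetical and are not data from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Area under the ROC curve as the fraction of correctly ordered pairs.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical susceptibility scores at collapse and non-collapse cells.
collapse_cells = [0.9, 0.8, 0.4]
stable_cells = [0.5, 0.3, 0.2]
auc_value = auc_from_scores(collapse_cells, stable_cells)
```

    An AUC of 0.5 corresponds to a map that ranks locations no better than chance and 1.0 to a perfect ranking, so maps from different models can be compared on the same validation set.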

  16. Validated univariate and multivariate spectrophotometric methods for the determination of pharmaceuticals mixture in complex wastewater

    Science.gov (United States)

    Riad, Safaa M.; Salem, Hesham; Elbalkiny, Heba T.; Khattab, Fatma I.

    2015-04-01

    Five accurate, precise, and sensitive univariate and multivariate spectrophotometric methods were developed for the simultaneous determination of a ternary mixture containing Trimethoprim (TMP), Sulphamethoxazole (SMZ) and Oxytetracycline (OTC) in wastewater samples collected from different sites, either production wastewater or livestock wastewater, after their solid phase extraction using OASIS HLB cartridges. In the univariate methods, OTC was determined at its λmax of 355.7 nm (0D), while TMP and SMZ were determined by three different univariate methods. Method (A) is based on the successive spectrophotometric resolution technique (SSRT). The technique starts with the ratio subtraction method followed by the ratio difference method for determination of TMP and SMZ. Method (B) is the successive derivative ratio technique (SDR). Method (C) is mean centering of the ratio spectra (MCR). The developed multivariate methods are principal component regression (PCR) and partial least squares (PLS). The specificity of the developed methods is investigated by analyzing laboratory-prepared mixtures containing different ratios of the three drugs. The obtained results are statistically compared with those obtained by the official methods, showing no significant difference with respect to accuracy and precision at p = 0.05.

  17. Identification of Chemical Attribution Signatures of Fentanyl Syntheses Using Multivariate Statistical Analysis of Orthogonal Analytical Data

    Energy Technology Data Exchange (ETDEWEB)

    Mayer, B. P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mew, D. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); DeHope, A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Spackman, P. E. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Williams, A. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-09-24

    Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. The results of these studies can yield detailed information on method of manufacture, starting material source, and final product - all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenylethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (GC-MS and LC-MS/MS-TOF) in conjunction with inductively coupled plasma mass spectrometry (ICP-MS). The complexity of the resultant data matrix urged the use of multivariate statistical analysis. Using partial least squares discriminant analysis (PLS-DA), 87 route-specific CAS were classified and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited and later extracted from two operationally relevant surfaces: stainless steel and vinyl tile. This work provides the most detailed fentanyl CAS investigation to date by using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.

  18. Discrimination of irradiated MOX fuel from UOX fuel by multivariate statistical analysis of simulated activities of gamma-emitting isotopes

    Science.gov (United States)

    Åberg Lindell, M.; Andersson, P.; Grape, S.; Hellesen, C.; Håkansson, A.; Thulin, M.

    2018-03-01

    This paper investigates how concentrations of certain fission products and their related gamma-ray emissions can be used to discriminate between uranium oxide (UOX) and mixed oxide (MOX) type fuel. Discrimination of irradiated MOX fuel from irradiated UOX fuel is important in nuclear facilities and for transport of nuclear fuel, for purposes of both criticality safety and nuclear safeguards. Although facility operators keep records on the identity and properties of each fuel, tools for nuclear safeguards inspectors that enable independent verification of the fuel are critical in the recovery of continuity of knowledge, should it be lost. A discrimination methodology for classification of UOX and MOX fuel, based on passive gamma-ray spectroscopy data and multivariate analysis methods, is presented. Nuclear fuels and their gamma-ray emissions were simulated in the Monte Carlo code Serpent, and the resulting data was used as input to train seven different multivariate classification techniques. The trained classifiers were subsequently implemented and evaluated with respect to their capabilities to correctly predict the classes of unknown fuel items. The best results concerning successful discrimination of UOX and MOX-fuel were acquired when using non-linear classification techniques, such as the k nearest neighbors method and the Gaussian kernel support vector machine. For fuel with cooling times up to 20 years, when it is considered that gamma-rays from the isotope 134Cs can still be efficiently measured, success rates of 100% were obtained. A sensitivity analysis indicated that these methods were also robust.
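
    One of the non-linear classifiers named in this abstract, the k nearest neighbors method, can be sketched in a few lines. The two features, their values and the labels below are invented for illustration and stand in for the simulated gamma-ray activities; they are not Serpent output, and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already standardised two-feature training set.
train_features = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),   # labelled UOX
    (0.80, 0.90), (0.85, 0.95), (0.78, 0.88),   # labelled MOX
]
train_labels = ["UOX", "UOX", "UOX", "MOX", "MOX", "MOX"]

predicted = knn_predict(train_features, train_labels, (0.11, 0.21), k=3)
```

    Because the vote depends on Euclidean distance, features measured on very different scales would normally be standardised first; the invented values above are assumed to be on a common scale already.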

  19. Multivariant analyses of trace element patterns for environmental tracking

    International Nuclear Information System (INIS)

    Jervis, R.E.; Ko, M.M.C.; Junliang Tian; Puling Liu

    1993-01-01

    Nuclear-based analytical techniques (INAA, PIXE and photon activation) permit simultaneous multielemental determination of concentrations in environmental materials. The data are often found to be sufficiently precise and free of uncontrolled, random errors among the various elements that the data sets can yield valuable information on elemental communality through multivariate statistical 'factor' analysis. Characteristic factor patterns obtained in this way can provide clues to the likely sources in the environment of various components. Recent studies in three different environmental situations (solid waste incinerators, Chinese soils, and the iron and steel industry), involving measurements of 30-35 elements, have yielded distinct elemental patterns, or environmental signatures, with factor loading coefficients mostly in the range 0.7-0.96. (author) 10 refs.; 2 figs.; 9 tabs

  20. Statistical techniques to extract information during SMAP soil moisture assimilation

    Science.gov (United States)

    Kolassa, J.; Reichle, R. H.; Liu, Q.; Alemohammad, S. H.; Gentine, P.

    2017-12-01

    Statistical techniques permit the retrieval of soil moisture estimates in a model climatology while retaining the spatial and temporal signatures of the satellite observations. As a consequence, the need for bias correction prior to an assimilation of these estimates is reduced, which could result in a more effective use of the independent information provided by the satellite observations. In this study, a statistical neural network (NN) retrieval algorithm is calibrated using SMAP brightness temperature observations and modeled soil moisture estimates (similar to those used to calibrate the SMAP Level 4 DA system). Daily values of surface soil moisture are estimated using the NN and then assimilated into the NASA Catchment model. The skill of the assimilation estimates is assessed based on a comprehensive comparison to in situ measurements from the SMAP core and sparse network sites as well as the International Soil Moisture Network. The NN retrieval assimilation is found to significantly improve the model skill, particularly in areas where the model does not represent processes related to agricultural practices. Additionally, the NN method is compared to assimilation experiments using traditional bias correction techniques. The NN retrieval assimilation is found to more effectively use the independent information provided by SMAP resulting in larger model skill improvements than assimilation experiments using traditional bias correction techniques.

  1. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

    Science.gov (United States)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

    2012-03-15

    To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright © 2012 Elsevier Inc. All rights reserved.
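    The variable selection that makes LASSO models easy to interpret comes from its L1 penalty, which sets weak coefficients exactly to zero. The sketch below illustrates that mechanism with cyclic coordinate descent on a least-squares problem. It is a simplification and not the authors' procedure: NTCP models are typically logistic, and the function names, the toy data and the penalty value `lam` are all assumptions made for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows; every column is assumed to be non-zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared length of column j
            for i in range(n):
                # Prediction from every predictor except j.
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Hypothetical toy data: y depends strongly on column 0 and weakly on column 1.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2.1, -1.9, 1.9, -2.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)
```

    In this toy example the strong predictor keeps a shrunken coefficient while the weak one is removed from the model entirely, which is the behaviour that distinguishes LASSO from stepwise selection.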

  2. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

    Energy Technology Data Exchange (ETDEWEB)

    Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands)]; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van 't [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands)]

    2012-03-15

    Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.

  3. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

    International Nuclear Information System (INIS)

    Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van’t

    2012-01-01

    Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.

  4. Comparing Relationships among Yield and Its Related Traits in Mycorrhizal and Nonmycorrhizal Inoculated Wheat Cultivars under Different Water Regimes Using Multivariate Statistics

    Directory of Open Access Journals (Sweden)

    Armin Saed-Moucheshi

    2013-01-01

    Multivariate statistical techniques were used to compare the relationship between yield and its related traits under noninoculated and inoculated cultivars with mycorrhizal fungus (Glomus intraradices); each one consisted of three wheat cultivars and four water regimes. Results showed that, under inoculation conditions, spike weight per plant and total chlorophyll content of the flag leaf were the most important variables contributing to wheat grain yield variation, while, under noninoculated conditions, in addition to the two mentioned traits, grain weight per spike and leaf area were also important variables accounting for wheat grain yield variation. Therefore, spike weight per plant and chlorophyll content of the flag leaf can be used as selection criteria in breeding programs for both inoculated and noninoculated wheat cultivars under different water regimes, and grain weight per spike and leaf area can also be considered for noninoculated conditions. Furthermore, inoculation of wheat cultivars gave higher values in most of the measured traits, and the results indicated that inoculation treatment could change the relationship among morphological traits of wheat cultivars under drought stress. Also, it seems that stepwise regression as a selection method, together with principal component and factor analysis, is a stronger approach to apply in breeding programs for screening important traits.

  5. A course in statistics with R

    CERN Document Server

    Tattar, Prabhanjan N; Manjunath, B G

    2016-01-01

    Integrates the theory and applications of statistics using R A Course in Statistics with R has been written to bridge the gap between theory and applications and explain how mathematical expressions are converted into R programs. The book has been primarily designed as a useful companion for a Masters student during each semester of the course, but will also help applied statisticians in revisiting the underpinnings of the subject. With this dual goal in mind, the book begins with R basics and quickly covers visualization and exploratory analysis. Probability and statistical inference, inclusive of classical, nonparametric, and Bayesian schools, is developed with definitions, motivations, mathematical expression and R programs in a way which will help the reader to understand the mathematical development as well as R implementation. Linear regression models, experimental designs, multivariate analysis, and categorical data analysis are treated in a way which makes effective use of visualization techniques and...

  6. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2017-01-01

    The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...

  7. Human Exposure Risk Assessment Due to Heavy Metals in Groundwater by Pollution Index and Multivariate Statistical Methods: A Case Study from South Africa

    OpenAIRE

    Vetrimurugan Elumalai; K. Brindha; Elango Lakshmanan

    2017-01-01

    Heavy metals in surface and groundwater were analysed and their sources were identified using multivariate statistical tools for two towns in South Africa. Human exposure risk through the drinking water pathway was also assessed. Electrical conductivity values showed that groundwater is desirable to permissible for drinking except for six locations. Concentration of aluminium, lead and nickel were above the permissible limit for drinking at all locations. Boron, cadmium, iron and manganese ex...

  8. Multivariate analysis of remote LIBS spectra using partial least squares, principal component analysis, and related techniques

    Energy Technology Data Exchange (ETDEWEB)

    Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Sklute, Elizabeth [MT HOLYOKE COLLEGE; Dyare, Melinda D [MT HOLYOKE COLLEGE

    2008-01-01

    Quantitative analysis with LIBS traditionally employs calibration curves that are complicated by the chemical matrix effects. These chemical matrix effects influence the LIBS plasma and the ratio of elemental composition to elemental emission line intensity. Consequently, LIBS calibration typically requires a priori knowledge of the unknown, in order for a series of calibration standards similar to the unknown to be employed. In this paper, three new Multivariate Analysis (MVA) techniques are employed to analyze the LIBS spectra of 18 disparate igneous and highly-metamorphosed rock samples. Partial Least Squares (PLS) analysis is used to generate a calibration model from which unknown samples can be analyzed. Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are employed to generate a model and predict the rock type of the samples. These MVA techniques appear to exploit the matrix effects associated with the chemistries of these 18 samples.

  9. Multivariate statistical process control in product quality review assessment - A case study.

    Science.gov (United States)

    Kharbach, M; Cherrah, Y; Vander Heyden, Y; Bouklouze, A

    2017-11-01

    According to the Food and Drug Administration and the European Good Manufacturing Practices (GMP) guidelines, Annual Product Review (APR) is a mandatory requirement in GMP. It consists of evaluating a large collection of qualitative or quantitative data in order to verify the consistency of an existing process. According to the Code of Federal Regulation Part 11 (21 CFR 211.180), all finished products should be reviewed annually for the quality standards to determine the need of any change in specification or manufacturing of drug products. Conventional Statistical Process Control (SPC) evaluates the pharmaceutical production process by examining only the effect of a single factor at a time using a Shewhart chart. It neglects the interaction between the variables. In order to overcome this issue, Multivariate Statistical Process Control (MSPC) can be used. Our case study concerns an APR assessment, where 164 historical batches containing six active ingredients, manufactured in Morocco, were collected during one year. Each batch has been checked by assaying the six active ingredients by High Performance Liquid Chromatography according to European Pharmacopoeia monographs. The data matrix was evaluated both by SPC and MSPC. The SPC indicated that all batches are under control, while the MSPC, based on Principal Component Analysis (PCA), for the data being either autoscaled or robust scaled, showed four and seven batches, respectively, outside the Hotelling T² 95% ellipse. Also, an improvement of the capability of the process is observed without the most extreme batches. The MSPC can be used for monitoring subtle changes in the manufacturing process during an APR assessment. Copyright © 2017 Académie Nationale de Pharmacie. Published by Elsevier Masson SAS. All rights reserved.
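    The Hotelling T² statistic mentioned above measures how far one observation lies from the process mean once the covariance between variables is taken into account. The following is a minimal two-variable sketch, not the PCA-based six-variable analysis of the study; the function names and the reference data are hypothetical.

```python
def mean_and_cov_2d(rows):
    """Sample mean and covariance (n - 1 denominator) of two-variable rows."""
    n = len(rows)
    m0 = sum(r[0] for r in rows) / n
    m1 = sum(r[1] for r in rows) / n
    s00 = sum((r[0] - m0) ** 2 for r in rows) / (n - 1)
    s11 = sum((r[1] - m1) ** 2 for r in rows) / (n - 1)
    s01 = sum((r[0] - m0) * (r[1] - m1) for r in rows) / (n - 1)
    return (m0, m1), (s00, s01, s11)


def hotelling_t2(x, mean, cov):
    """T^2 = (x - mean)^T S^{-1} (x - mean) for a 2x2 covariance S."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    s00, s01, s11 = cov
    det = s00 * s11 - s01 ** 2  # assumed non-zero (non-singular S)
    # Closed-form inverse of a symmetric 2x2 matrix applied to (d0, d1).
    return (d0 * d0 * s11 - 2.0 * d0 * d1 * s01 + d1 * d1 * s00) / det


# Hypothetical reference batches (assay results for two ingredients).
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
centre, cov = mean_and_cov_2d(reference)
t2_centre = hotelling_t2((1.0, 1.0), centre, cov)
t2_shifted = hotelling_t2((3.0, 1.0), centre, cov)
```

    A batch would be flagged when its T² exceeds the chosen control limit (the 95% ellipse in the abstract); computing that limit requires an F-distribution quantile and is left out of this sketch.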

  10. A Comparison of Pseudo-Maximum Likelihood and Asymptotically Distribution-Free Dynamic Factor Analysis Parameter Estimation in Fitting Covariance-Structure Models to Block-Toeplitz Representing Single-Subject Multivariate Time-Series

    NARCIS (Netherlands)

    Molenaar, P.C.M.; Nesselroade, J.R.

    1998-01-01

    The study of intraindividual variability pervades empirical inquiry in virtually all subdisciplines of psychology. The statistical analysis of multivariate time-series data - a central product of intraindividual investigations - requires special modeling techniques. The dynamic factor model (DFM),

  11. The assessment of processes controlling the spatial distribution of hydrogeochemical groundwater types in Mali using multivariate statistics

    Science.gov (United States)

    Keita, Souleymane; Zhonghua, Tang

    2017-10-01

    Sustainable management of groundwater resources is a major issue for developing countries, especially in Mali. The multiple uses of groundwater have led countries to promote sound management policies for sustainable use of the groundwater resources. For this reason, each country needs data enabling it to monitor and predict the changes of the resources. Also, given the importance of groundwater quality changes, often marked by the recurrence of droughts, the potential impacts of regional and geological setting of groundwater resources require careful study. Unfortunately, recent decades have seen a considerable reduction of national capacities to ensure the hydrogeological monitoring and production of quality data for decision making. The purpose of this work is to use the groundwater data and translate them into useful information that can improve water resources management capacity in Mali. In this paper, we used groundwater analytical data from accredited laboratories in Mali to carry out a national scale assessment of the groundwater types and their distribution. We adapted multivariate statistical methods to classify 2035 groundwater samples into seven main groundwater types and built a national scale map from the results. We used a two-level K-means clustering technique to examine the hydro-geochemical records as percentages of the total concentrations of major ions, namely sodium (Na), magnesium (Mg), calcium (Ca), chloride (Cl), bicarbonate (HCO3), and sulphate (SO4). The first step of clustering formed 20 groups, and these groups were then re-clustered to produce the final seven groundwater types. The results were verified and confirmed using Principal Component Analysis (PCA) and RockWare (Aq.QA) software. We found that HCO3 was the most dominant anion throughout the country and that Cl and SO4 were only important in some local zones. The dominant cations were Na and Mg. Also, major ion ratios changed with geographical location and geological, and climatic
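    The two-level clustering described in this abstract (ion concentrations converted to percentages, a first clustering into many groups, then a re-clustering of those groups) can be sketched as follows. This is an illustrative sketch only: the deterministic initialisation, the function names, the two-ion toy samples and the group counts (4 then 2 instead of 20 then 7) are assumptions, not the authors' implementation.

```python
def to_percent(sample):
    """Express each ion concentration as a percentage of the sample total."""
    total = sum(sample)
    return [100.0 * v / total for v in sample]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def two_level_kmeans(points, k_fine, k_coarse):
    """Cluster the points, then cluster the resulting centroids."""
    fine_labels, fine_centroids = kmeans(points, k_fine)
    coarse_labels, _ = kmeans(fine_centroids, k_coarse)
    return [coarse_labels[f] for f in fine_labels]


# Hypothetical (Na, Ca) concentrations for six samples.
raw = [(9, 1), (1, 9), (8, 2), (2, 8), (18, 2), (2, 18)]
points = [to_percent(s) for s in raw]
water_types = two_level_kmeans(points, k_fine=4, k_coarse=2)
```

    Working with percentages means that samples with different total mineralisation but the same ionic proportions, such as (9, 1) and (18, 2), fall into the same water type.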

  12. Curve fitting and modeling with splines using statistical variable selection techniques

    Science.gov (United States)

    Smith, P. L.

    1982-01-01

    The successful application of statistical variable selection techniques to fit splines is demonstrated. Major emphasis is given to knot selection, but order determination is also discussed. Two FORTRAN backward elimination programs, using the B-spline basis, were developed. The program for knot elimination is compared in detail with two other spline-fitting methods and several statistical software packages. An example is also given for the two-variable case using a tensor product basis, with a theoretical discussion of the difficulties of their use.

  13. Simplicial band depth for multivariate functional data

    KAUST Repository

    López-Pintado, Sara; Sun, Ying; Lin, Juan K.; Genton, Marc G.

    2014-01-01

    sample of curves. Based on these depths, a sample of multivariate curves can be ordered from the center outward and order statistics can be defined. Properties of the proposed depths, such as invariance and consistency, can be established. A simulation

  14. Statistical precision of delayed-neutron nondestructive assay techniques

    International Nuclear Information System (INIS)

    Bayne, C.K.; McNeany, S.R.

    1979-02-01

    A theoretical analysis of the statistical precision of delayed-neutron nondestructive assay instruments is presented. Such instruments measure the fissile content of nuclear fuel samples by neutron irradiation and delayed-neutron detection. The precision of these techniques is limited by the statistical nature of the nuclear decay process, but the precision can be optimized by proper selection of system operating parameters. Our method is a three-part analysis. We first present differential-difference equations describing the fundamental physics of the measurements. We then derive and present complete analytical solutions to these equations. Final equations governing the expected number and variance of delayed-neutron counts were computer programmed to calculate the relative statistical precision of specific system operating parameters. Our results show that Poisson statistics do not govern the number of counts accumulated in multiple irradiation-count cycles and that, in general, maximum count precision does not correspond with maximum count as first expected. Covariance between the counts of individual cycles must be considered in determining the optimum number of irradiation-count cycles and the optimum irradiation-to-count time ratio. For the assay system in use at ORNL, covariance effects are small, but for systems with short irradiation-to-count transition times, covariance effects force the optimum number of irradiation-count cycles to be half those giving maximum count. We conclude that the equations governing the expected value and variance of delayed-neutron counts have been derived in closed form. These have been computerized and can be used to select optimum operating parameters for delayed-neutron assay devices.
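    The role of covariance between cycles follows from the general identity for the variance of a sum. Writing N_i for the count in cycle i of m cycles (notation assumed here, not taken from the report), the total count has

```latex
\operatorname{Var}\!\left(\sum_{i=1}^{m} N_i\right)
  = \sum_{i=1}^{m} \operatorname{Var}(N_i)
  + 2 \sum_{1 \le i < j \le m} \operatorname{Cov}(N_i, N_j),
\qquad
\text{relative precision}
  = \frac{\sqrt{\operatorname{Var}\!\left(\sum_{i} N_i\right)}}
         {\mathbb{E}\!\left[\sum_{i} N_i\right]} .
```

    For a purely Poisson total the variance would equal the expected value. Non-zero covariance terms break that equality, which is consistent with the abstract's finding that the cycle count giving the maximum total is not necessarily the one giving the best precision.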

  15. Matrix-based introduction to multivariate data analysis

    CERN Document Server

    Adachi, Kohei

    2016-01-01

    This book enables readers who may not be familiar with matrices to understand a variety of multivariate analysis procedures in matrix forms. Another feature of the book is that it emphasizes what model underlies a procedure and what objective function is optimized for fitting the model to data. The author believes that the matrix-based learning of such models and objective functions is the fastest way to comprehend multivariate data analysis. The text is arranged so that readers can intuitively capture the purposes for which multivariate analysis procedures are utilized: plain explanations of the purposes with numerical examples precede mathematical descriptions in almost every chapter. This volume is appropriate for undergraduate students who already have studied introductory statistics. Graduate students and researchers who are not familiar with matrix-intensive formulations of multivariate data analysis will also find the book useful, as it is based on modern matrix formulations with a special emphasis on ...

  16. Fractional and multivariable calculus model building and optimization problems

    CERN Document Server

    Mathai, A M

    2017-01-01

    This textbook presents a rigorous approach to multivariable calculus in the context of model building and optimization problems. This comprehensive overview is based on lectures given at five SERC Schools from 2008 to 2012 and covers a broad range of topics that will enable readers to understand and create deterministic and nondeterministic models. Researchers, advanced undergraduate, and graduate students in mathematics, statistics, physics, engineering, and biological sciences will find this book to be a valuable resource for finding appropriate models to describe real-life situations. The first chapter begins with an introduction to fractional calculus moving on to discuss fractional integrals, fractional derivatives, fractional differential equations and their solutions. Multivariable calculus is covered in the second chapter and introduces the fundamentals of multivariable calculus (multivariable functions, limits and continuity, differentiability, directional derivatives and expansions of multivariable ...

  17. The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments

    International Nuclear Information System (INIS)

    Pham, Binh T.; Einerson, Jeffrey J.

    2010-01-01

    This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
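    Of the three techniques named above, control charting is the simplest to illustrate. The sketch below computes Shewhart-style limits from a baseline period and flags readings outside them. The three-sigma rule, the function names and the temperature values are assumptions for illustration; they are not taken from NDMAS.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Lower and upper control limits: baseline mean +/- n_sigma sample std devs."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - n_sigma * spread, centre + n_sigma * spread


def out_of_control(readings, limits):
    """Indices of readings that fall outside the control limits."""
    lower, upper = limits
    return [i for i, r in enumerate(readings) if r < lower or r > upper]


# Hypothetical thermocouple readings in degrees Celsius.
baseline = [1000.0, 1002.0, 998.0, 1001.0, 999.0]
limits = control_limits(baseline)
flagged = out_of_control([1001.0, 990.0, 1003.0], limits)
```

    A sustained run of flagged readings from one thermocouple, with its neighbours staying inside their limits, is the kind of pattern that could point to instrument drift or failure instead of a real temperature change.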

  18. The statistical analysis techniques to support the NGNP fuel performance experiments

    Energy Technology Data Exchange (ETDEWEB)

    Pham, Binh T., E-mail: Binh.Pham@inl.gov; Einerson, Jeffrey J.

    2013-10-15

    This paper describes the development and application of statistical analysis techniques to support the Advanced Gas Reactor (AGR) experimental program on Next Generation Nuclear Plant (NGNP) fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel temperature) is regulated by the He–Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the NGNP Data Management and Analysis System for automated processing and qualification of the AGR measured data. The neutronic and thermal code simulation results are used for comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the fuel temperature within a given range.

  19. The association of 83 plasma proteins with CHD mortality, BMI, HDL-, and total-cholesterol in men: Applying multivariate statistics to identify proteins with prognostic value and biological relevance

    NARCIS (Netherlands)

    Geert Heidema, A.; Thissen, U.; Boer, J.M.A.; Bouwman, F.G.; Feskens, E.J.M.; Mariman, E.C.M.

    2009-01-01

    In this study, we applied the multivariate statistical tool Partial Least Squares (PLS) to analyze the relative importance of 83 plasma proteins in relation to coronary heart disease (CHD) mortality and the intermediate end points body mass index, HDL-cholesterol and total cholesterol. From a Dutch

  20. Shannon Entropy and Mutual Information for Multivariate Skew-Elliptical Distributions

    KAUST Repository

    Arellano-Valle, Reinaldo B.

    2012-02-27

    The entropy and mutual information index are important concepts developed by Shannon in the context of information theory. They have been widely studied in the case of the multivariate normal distribution. We first extend these tools to the full symmetric class of multivariate elliptical distributions and then to the more flexible families of multivariate skew-elliptical distributions. We study in detail the cases of the multivariate skew-normal and skew-t distributions. We implement our findings to the application of the optimal design of an ozone monitoring station network in Santiago de Chile. © 2012 Board of the Foundation of the Scandinavian Journal of Statistics.
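    For reference, the multivariate normal baseline that this work generalises has well-known closed forms. For a d-dimensional normal vector X with covariance matrix Σ, partitioned into subvectors X_1 and X_2 with covariance blocks Σ_11 and Σ_22 (the partition notation is assumed here):

```latex
H(X) = \frac{1}{2} \ln\!\left[ (2\pi e)^{d} \, \lvert \Sigma \rvert \right],
\qquad
I(X_1; X_2) = \frac{1}{2} \ln \frac{\lvert \Sigma_{11} \rvert \, \lvert \Sigma_{22} \rvert}{\lvert \Sigma \rvert},
\qquad
I(X_1; X_2) = -\frac{1}{2} \ln\!\left( 1 - \rho^{2} \right) \ \text{in the bivariate case with correlation } \rho .
```

    These are the standard normal-theory results only; the elliptical and skew-elliptical expressions derived in the paper are not reproduced here.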

  1. Shannon Entropy and Mutual Information for Multivariate Skew-Elliptical Distributions

    KAUST Repository

    Arellano-Valle, Reinaldo B.; Contreras-Reyes, Javier E.; Genton, Marc G.

    2012-01-01

    The entropy and mutual information index are important concepts developed by Shannon in the context of information theory. They have been widely studied in the case of the multivariate normal distribution. We first extend these tools to the full symmetric class of multivariate elliptical distributions and then to the more flexible families of multivariate skew-elliptical distributions. We study in detail the cases of the multivariate skew-normal and skew-t distributions. We implement our findings to the application of the optimal design of an ozone monitoring station network in Santiago de Chile. © 2012 Board of the Foundation of the Scandinavian Journal of Statistics.

  2. Statistical evaluation of recorded knowledge in nuclear and other instrumental analytical techniques

    International Nuclear Information System (INIS)

    Braun, T.

    1987-01-01

    The main points addressed in this study are the following: Statistical distribution patterns of published literature on instrumental analytical techniques 1981-1984; structure of scientific literature and heuristics for identifying active specialities and emerging hot spot research areas in instrumental analytical techniques; growth and growth rates of the literature in some of the identified hot research areas; quality and quantity in instrumental analytical research output. (orig.)

  3. Simplicial band depth for multivariate functional data

    KAUST Repository

    López-Pintado, Sara

    2014-03-05

    We propose notions of simplicial band depth for multivariate functional data that extend the univariate functional band depth. The proposed simplicial band depths provide simple and natural criteria to measure the centrality of a trajectory within a sample of curves. Based on these depths, a sample of multivariate curves can be ordered from the center outward and order statistics can be defined. Properties of the proposed depths, such as invariance and consistency, can be established. A simulation study shows the robustness of this new definition of depth and the advantages of using a multivariate depth versus the marginal depths for detecting outliers. Real data examples from growth curves and signature data are used to illustrate the performance and usefulness of the proposed depths. © 2014 Springer-Verlag Berlin Heidelberg.
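    As background, the univariate functional band depth that this work extends can be sketched in a few lines: a curve is deep if it lies inside the band spanned by many pairs of sample curves. The code below assumes curves sampled on a common grid and bands formed from pairs of curves; it does not implement the simplicial multivariate depth proposed in the paper, and the function name and data are hypothetical.

```python
from itertools import combinations


def band_depth(curve, sample):
    """Fraction of pairs of sample curves whose band contains `curve` everywhere."""
    pairs = list(combinations(sample, 2))
    inside = 0
    for a, b in pairs:
        # The band of (a, b) at each grid point is [min(a, b), max(a, b)].
        if all(min(u, v) <= c <= max(u, v) for c, u, v in zip(curve, a, b)):
            inside += 1
    return inside / len(pairs)


# Hypothetical sample of four flat curves observed at three grid points.
sample = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
depth_central = band_depth(sample[1], sample)
depth_extreme = band_depth(sample[0], sample)
depth_spike = band_depth([0, 3, 0], sample)
```

    Sorting the curves by decreasing depth gives the centre-outward ordering mentioned in the abstract; a curve with an unusual shape, such as the spike above, receives a low depth even though each of its values lies within the range of the sample.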

  4. Basic elements of computational statistics

    CERN Document Server

    Härdle, Wolfgang Karl; Okhrin, Yarema

    2017-01-01

    This textbook on computational statistics presents tools and concepts of univariate and multivariate statistical data analysis with a strong focus on applications and implementations in the statistical software R. It covers mathematical, statistical as well as programming problems in computational statistics and contains a wide variety of practical examples. In addition to the numerous R sniplets presented in the text, all computer programs (quantlets) and data sets to the book are available on GitHub and referred to in the book. This enables the reader to fully reproduce as well as modify and adjust all examples to their needs. The book is intended for advanced undergraduate and first-year graduate students as well as for data analysts new to the job who would like a tour of the various statistical tools in a data analysis workshop. The experienced reader with a good knowledge of statistics and programming might skip some sections on univariate models and enjoy the various mathematical roots of multivariate ...

  5. Parameter-free extraction of EMCD from an energy-filtered diffraction datacube using multivariate curve resolution

    International Nuclear Information System (INIS)

    Muto, S.; Tatsumi, K.; Rusz, J.

    2013-01-01

    We present a parameter-free method of extraction of the electron magnetic circular dichroism spectra from energy-filtered diffraction patterns measured on a crystalline specimen. The method is based on a multivariate curve resolution technique. The main advantage of the proposed method is that it allows extraction of the magnetic signal regardless of the symmetry and orientation of the crystal, as long as there is a sufficiently strong magnetic component of the signal in the diffraction plane. This method essentially overcomes difficulties in extraction of the EMCD signal caused by complexity of dynamical diffraction effects. - Highlights: ► New method of extraction of EMCD signal using statistical methods (multivariate curve resolution). ► EMCD can be extracted quantitatively regardless of symmetry of crystal or its orientation. ► First principles simulation of EFDIF datacube, including dynamical diffraction effects

  6. A Framework for Diagnosing the Out-of-Control Signals in Multivariate Process Using Optimized Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Tai-fu Li

    2013-01-01

    Multivariate statistical process control is the continuation and development of unitary statistical process control. Most multivariate statistical quality control charts are used (in manufacturing and service industries) to determine whether a process is performing as intended or whether there are some unnatural causes of variation in an overall statistic. Once the control chart detects out-of-control signals, one difficulty encountered with multivariate control charts is the interpretation of an out-of-control signal. That is, we have to determine whether one or more or a combination of variables is responsible for the abnormal signal. A novel approach for diagnosing the out-of-control signals in the multivariate process is described in this paper. The proposed methodology uses optimized support vector machines (support vector machine classification based on a genetic algorithm) to recognize a set of subclasses of multivariate abnormal patterns and to identify the variable(s) responsible for the occurrence of an abnormal pattern. Multiple sets of experiments are used to verify this model. The performance of the proposed approach demonstrates that this model can accurately classify the source(s) of out-of-control signal and even outperforms the conventional multivariate control scheme.

  7. Multivariate calibration analysis of colorimetric mercury sensing using a molecular probe

    International Nuclear Information System (INIS)

    Perez-Hernandez, Javier; Albero, Josep; Correig, Xavier; Llobet, Eduard; Palomares, Emilio

    2009-01-01

    Selectivity is one of the main challenges for sensors, particularly those based on chemical interactions. Multivariate analytical models can determine the concentration of analytes even in the presence of potential interferents. In this work, we have determined the presence of mercury ions in aqueous solutions in the ppm range (0-2 mg L⁻¹) using a ruthenium bis-thiocyanate complex as a chemical probe. Moreover, we have analyzed the mercury-containing solutions in the presence of higher concentrations (19.5 mg L⁻¹) of other potential competitors such as Cd²⁺, Pb²⁺, Cu²⁺ and Zn²⁺ ions. Our experimental model is based on the partial least squares (PLS) method, with other techniques such as a genetic algorithm and statistical feature selection (SFS) used beforehand to refine the analytical data. In summary, we have demonstrated that the root mean square error of prediction can be reduced from 10.22% without pre-treatment to 6.27% with statistical feature selection.

  8. Comparison of adaptive statistical iterative and filtered back projection reconstruction techniques in brain CT

    International Nuclear Information System (INIS)

    Ren, Qingguo; Dewan, Sheilesh Kumar; Li, Ming; Li, Jianying; Mao, Dingbiao; Wang, Zhenglei; Hua, Yanqing

    2012-01-01

    Purpose: To compare image quality and visualization of normal structures and lesions in brain computed tomography (CT) with adaptive statistical iterative reconstruction (ASIR) and filtered back projection (FBP) reconstruction techniques at different X-ray tube current–time products. Materials and methods: In this IRB-approved prospective study, forty patients (nineteen men, twenty-one women; mean age 69.5 ± 11.2 years) underwent brain scans at different tube current–time products (300 and 200 mAs) in 64-section multi-detector CT (GE, Discovery CT750 HD). Images were reconstructed with FBP and four levels of ASIR-FBP blending. Two radiologists (please note that our hospital is renowned for its geriatric medicine department, and these two radiologists are more experienced in chronic cerebral vascular disease than in neoplastic disease, so this research did not contain cerebral tumors but as a discussion) assessed all the reconstructed images for visibility of normal structures, lesion conspicuity, image contrast and diagnostic confidence in a blinded and randomized manner. Volume CT dose index (CTDIvol) and dose-length product (DLP) were recorded. All the data were analyzed using SPSS 13.0 statistical analysis software. Results: There was no statistically significant difference between the image qualities at 200 mAs with the 50% ASIR blending technique and 300 mAs with the FBP technique (p > .05). By contrast, a statistically significant difference (p < .05) was found between the image qualities at 200 mAs with FBP and 300 mAs with FBP. Conclusion: ASIR provided the same image quality and diagnostic ability in brain imaging with greater than 30% dose reduction compared with the FBP reconstruction technique.

  9. Comparison of adaptive statistical iterative and filtered back projection reconstruction techniques in brain CT

    Energy Technology Data Exchange (ETDEWEB)

    Ren, Qingguo, E-mail: renqg83@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Dewan, Sheilesh Kumar, E-mail: sheilesh_d1@hotmail.com [Department of Geriatrics, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Li, Ming, E-mail: minli77@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Li, Jianying, E-mail: Jianying.Li@med.ge.com [CT Imaging Research Center, GE Healthcare China, Beijing (China); Mao, Dingbiao, E-mail: maodingbiao74@163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China); Wang, Zhenglei, E-mail: Williswang_doc@yahoo.com.cn [Department of Radiology, Shanghai Electricity Hospital, Shanghai 200050 (China); Hua, Yanqing, E-mail: cjr.huayanqing@vip.163.com [Department of Radiology, Hua Dong Hospital of Fudan University, Shanghai 200040 (China)

    2012-10-15

    Purpose: To compare image quality and visualization of normal structures and lesions in brain computed tomography (CT) with adaptive statistical iterative reconstruction (ASIR) and filtered back projection (FBP) reconstruction techniques at different X-ray tube current–time products. Materials and methods: In this IRB-approved prospective study, forty patients (nineteen men, twenty-one women; mean age 69.5 ± 11.2 years) underwent brain scans at different tube current–time products (300 and 200 mAs) in 64-section multi-detector CT (GE, Discovery CT750 HD). Images were reconstructed with FBP and four levels of ASIR-FBP blending. Two radiologists (please note that our hospital is renowned for its geriatric medicine department, and these two radiologists are more experienced in chronic cerebral vascular disease than in neoplastic disease, so this research did not contain cerebral tumors but as a discussion) assessed all the reconstructed images for visibility of normal structures, lesion conspicuity, image contrast and diagnostic confidence in a blinded and randomized manner. Volume CT dose index (CTDIvol) and dose-length product (DLP) were recorded. All the data were analyzed using SPSS 13.0 statistical analysis software. Results: There was no statistically significant difference between the image qualities at 200 mAs with the 50% ASIR blending technique and 300 mAs with the FBP technique (p > .05). By contrast, a statistically significant difference (p < .05) was found between the image qualities at 200 mAs with FBP and 300 mAs with FBP. Conclusion: ASIR provided the same image quality and diagnostic ability in brain imaging with greater than 30% dose reduction compared with the FBP reconstruction technique.

  10. Statistical mixture design and multivariate analysis of inkjet printed a-WO3/TiO2/WOX electrochromic films.

    Science.gov (United States)

    Wojcik, Pawel Jerzy; Pereira, Luís; Martins, Rodrigo; Fortunato, Elvira

    2014-01-13

    An efficient mathematical strategy in the field of solution processed electrochromic (EC) films is outlined as a combination of an experimental work, modeling, and information extraction from massive computational data via statistical software. Design of Experiment (DOE) was used for statistical multivariate analysis and prediction of mixtures through a multiple regression model, as well as the optimization of a five-component sol-gel precursor subjected to complex constraints. This approach significantly reduces the number of experiments to be realized, from 162 in the full factorial (L=3) and 72 in the extreme vertices (D=2) approach down to only 30 runs, while still maintaining a high accuracy of the analysis. By carrying out a finite number of experiments, the empirical modeling in this study shows reasonably good prediction ability in terms of the overall EC performance. An optimized ink formulation was employed in a prototype of a passive EC matrix fabricated in order to test and trial this optically active material system together with a solid-state electrolyte for the prospective application in EC displays. Coupling of DOE with chromogenic material formulation shows the potential to maximize the capabilities of these systems and ensures increased productivity in many potential solution-processed electrochemical applications.

  11. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    Science.gov (United States)

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  12. Multivariate statistical analysis of a multi-step industrial processes

    DEFF Research Database (Denmark)

    Reinikainen, S.P.; Høskuldsson, Agnar

    2007-01-01

    Monitoring and quality control of industrial processes often produce information on how the data have been obtained. In batch processes, for instance, the process is carried out in stages; some process or control parameters are set at each stage. However, the obtained data might not be utilized...... efficiently, even if this information may reveal significant knowledge about process dynamics or ongoing phenomena. When studying the process data, it may be important to analyse the data in the light of the physical or time-wise development of each process step. In this paper, a unified approach to analyse...... multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors...

  13. Chemometric and Statistical Analyses of ToF-SIMS Spectra of Increasingly Complex Biological Samples

    Energy Technology Data Exchange (ETDEWEB)

    Berman, E S; Wu, L; Fortson, S L; Nelson, D O; Kulp, K S; Wu, K J

    2007-10-24

    Characterizing and classifying molecular variation within biological samples is critical for determining fundamental mechanisms of biological processes that will lead to new insights including improved disease understanding. Towards these ends, time-of-flight secondary ion mass spectrometry (ToF-SIMS) was used to examine increasingly complex samples of biological relevance, including monosaccharide isomers, pure proteins, complex protein mixtures, and mouse embryo tissues. The complex mass spectral data sets produced were analyzed using five common statistical and chemometric multivariate analysis techniques: principal component analysis (PCA), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), soft independent modeling of class analogy (SIMCA), and decision tree analysis by recursive partitioning. PCA was found to be a valuable first step in multivariate analysis, providing insight both into the relative groupings of samples and into the molecular basis for those groupings. For the monosaccharides, pure proteins and protein mixture samples, all of LDA, PLSDA, and SIMCA were found to produce excellent classification given a sufficient number of compound variables calculated. For the mouse embryo tissues, however, SIMCA did not produce as accurate a classification. The decision tree analysis was found to be the least successful for all the data sets, providing neither as accurate a classification nor chemical insight for any of the tested samples. Based on these results we conclude that as the complexity of the sample increases, so must the sophistication of the multivariate technique used to classify the samples. PCA is a preferred first step for understanding ToF-SIMS data that can be followed by either LDA or PLSDA for effective classification analysis. This study demonstrates the strength of ToF-SIMS combined with multivariate statistical and chemometric techniques to classify increasingly complex biological samples

  14. Multivariate statistical study of heavy metal enrichment in sediments of the Pearl River Estuary

    International Nuclear Information System (INIS)

    Liu, W.X.; Li, X.D.; Shen, Z.G.; Wang, D.C.; Wai, O.W.H.; Li, Y.S.

    2003-01-01

    Multivariate statistical analysis identified the heavy metal accumulation layers of sediment profiles and showed the various sources of metals in the estuary. - The concentrations and chemical partitioning of heavy metals in the sediment cores of the Pearl River Estuary were studied. Based on Pearson correlation coefficients and principal component analysis results, Al was selected as the concentration normalizer for Pb, while Fe was used as the normalizing element for Co, Cu, Ni and Zn. In each profile, sections with metal concentrations exceeding the upper 95% prediction interval of the linear regression model were regarded as metal enrichment layers. The heavy metal accumulation mainly occurred at sites in the western shallow water areas and east channel, which reflected the hydraulic conditions and influence from riparian anthropogenic activities. Heavy metals in the enrichment sections were evaluated by a sequential extraction method for possible chemical forms in sediments. Since the residual, Fe/Mn oxides and organic/sulfide fractions were dominant geochemical phases in the enriched sections, the bioavailability of heavy metals in sediments was generally low. The ²⁰⁶Pb/²⁰⁷Pb ratios in the metal-enriched sediment sections also revealed the influence of anthropogenic sources. The spatial distribution of cumulative heavy metals in the sediments suggested that the Zn and Cu mainly originated from point sources, while the Pb probably came from non-point sources in the estuary
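
    The enrichment criterion described in this abstract (a metal concentration above the upper 95% prediction limit of a regression on the normalizing element) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the baseline values, the function names `fit_line`, `upper_prediction_limit` and `is_enriched`, and the choice to fit the line on a separate set of baseline samples are all assumptions made for the example.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    return {"slope": slope, "intercept": intercept, "s": s,
            "mean_x": mean_x, "sxx": sxx, "n": n}


def upper_prediction_limit(fit, x0, t_crit):
    """Upper bound of the prediction interval for a new observation at x0."""
    spread = math.sqrt(1 + 1 / fit["n"] + (x0 - fit["mean_x"]) ** 2 / fit["sxx"])
    return fit["intercept"] + fit["slope"] * x0 + t_crit * fit["s"] * spread


def is_enriched(fit, x0, y0, t_crit):
    """Flag a section whose metal value exceeds the upper prediction limit."""
    return y0 > upper_prediction_limit(fit, x0, t_crit)


# Hypothetical baseline samples: normalizer (Al) and metal (Pb), arbitrary units.
al = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
pb = [20.5, 24.6, 30.2, 35.1, 39.7, 45.3, 50.1, 54.5]
fit = fit_line(al, pb)

# Two-sided 95 % Student t quantile for 8 - 2 = 6 degrees of freedom.
T_CRIT = 2.447

# Two hypothetical profile sections with the same Al content.
flags = [is_enriched(fit, a, p, T_CRIT) for a, p in [(7.5, 37.0), (7.5, 60.0)]]
print(flags)  # [False, True]
```

    The critical value is passed in explicitly because the Python standard library has no Student t quantile function; with a statistics package it would be computed from the degrees of freedom instead.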

  15. Optimal model-free prediction from multivariate time series

    Science.gov (United States)

    Runge, Jakob; Donner, Reik V.; Kurths, Jürgen

    2015-05-01

    Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced utilizing a causal preselection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used suboptimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation.
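
    Two points from this abstract lend themselves to a short sketch: how quickly the number of candidate predictor subsets grows, and what a basic nearest-neighbor (model-free) prediction looks like. The code below is only an illustration under assumptions; the names `n_nonempty_subsets` and `knn_predict` and the toy data are hypothetical, and the causal preselection step of the paper is not implemented here.

```python
import math


def n_nonempty_subsets(n_predictors):
    """Number of non-empty predictor subsets an exhaustive search must test."""
    return 2 ** n_predictors - 1


def knn_predict(train_x, train_y, query, k=3):
    """Predict the target at `query` as the mean target of its k nearest
    training points (Euclidean distance in predictor space)."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Exhaustive search over 20 candidate predictors versus a preselected set of 4.
print(n_nonempty_subsets(20))  # 1048575
print(n_nonempty_subsets(4))   # 15

# Toy data: two predictors, target equal to the first predictor.
train_x = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
train_y = [0.0, 1.0, 2.0, 10.0]
prediction = knn_predict(train_x, train_y, (1.0, 1.0), k=3)
print(prediction)  # 1.0
```

    Shrinking the candidate set from 20 to 4 predictors reduces the exhaustive search from about a million subsets to 15, which is the kind of reduction that makes a globally optimal search tractable.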

  16. Multivariate Statistical Analysis Software Technologies for Astrophysical Research Involving Large Data Bases

    Science.gov (United States)

    Djorgovski, S. G.

    1994-01-01

    We developed a package to process and analyze the data from the digital version of the Second Palomar Sky Survey. This system, called SKICAT, incorporates the latest in machine learning and expert systems software technology, in order to classify the detected objects objectively and uniformly, and facilitate handling of the enormous data sets from digital sky surveys and other sources. The system provides a powerful, integrated environment for the manipulation and scientific investigation of catalogs from virtually any source. It serves three principal functions: image catalog construction, catalog management, and catalog analysis. Through use of the GID3* Decision Tree artificial induction software, SKICAT automates the process of classifying objects within CCD and digitized plate images. To exploit these catalogs, the system also provides tools to merge them into a large, complex database which may be easily queried and modified when new data or better methods of calibrating or classifying become available. The most innovative feature of SKICAT is the facility it provides to experiment with and apply the latest in machine learning technology to the tasks of catalog construction and analysis. SKICAT provides a unique environment for implementing these tools for any number of future scientific purposes. Initial scientific verification and performance tests have been made using galaxy counts and measurements of galaxy clustering from small subsets of the survey data, and a search for very high redshift quasars. All of the tests were successful and produced new and interesting scientific results. Attachments to this report give detailed accounts of the technical aspects of the SKICAT system, and of some of the scientific results achieved to date. We also developed a user-friendly package for multivariate statistical analysis of small and moderate-size data sets, called STATPROG. The package was tested extensively on a number of real scientific applications and has

  17. Multivariate statistical analysis software technologies for astrophysical research involving large data bases

    Science.gov (United States)

    Djorgovski, S. George

    1994-01-01

    We developed a package to process and analyze the data from the digital version of the Second Palomar Sky Survey. This system, called SKICAT, incorporates the latest in machine learning and expert systems software technology, in order to classify the detected objects objectively and uniformly, and facilitate handling of the enormous data sets from digital sky surveys and other sources. The system provides a powerful, integrated environment for the manipulation and scientific investigation of catalogs from virtually any source. It serves three principal functions: image catalog construction, catalog management, and catalog analysis. Through use of the GID3* Decision Tree artificial induction software, SKICAT automates the process of classifying objects within CCD and digitized plate images. To exploit these catalogs, the system also provides tools to merge them into a large, complete database which may be easily queried and modified when new data or better methods of calibrating or classifying become available. The most innovative feature of SKICAT is the facility it provides to experiment with and apply the latest in machine learning technology to the tasks of catalog construction and analysis. SKICAT provides a unique environment for implementing these tools for any number of future scientific purposes. Initial scientific verification and performance tests have been made using galaxy counts and measurements of galaxy clustering from small subsets of the survey data, and a search for very high redshift quasars. All of the tests were successful, and produced new and interesting scientific results. Attachments to this report give detailed accounts of the technical aspects of the SKICAT system, and of some of the scientific results achieved to date. We also developed a user-friendly package for multivariate statistical analysis of small and moderate-size data sets, called STATPROG. The package was tested extensively on a number of real scientific applications, and has produced real, published results.

  18. SPICE: exploration and analysis of post-cytometric complex multivariate datasets.

    Science.gov (United States)

    Roederer, Mario; Nozzi, Joshua L; Nason, Martha C

    2011-02-01

    Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.

  19. Multivariate Methods Based Soft Measurement for Wine Quality Evaluation

    Directory of Open Access Journals (Sweden)

    Shen Yin

    2014-01-01

    a decision. However, since the physicochemical indexes of wine can to some extent reflect the quality of wine, the multivariate statistical methods based soft measure can help the oenologist in wine evaluation.

  20. Training and evaluation of neural networks for multi-variate time series processing

    DEFF Research Database (Denmark)

    Fog, Torben L.; Larsen, Jan; Hansen, Lars Kai

    1995-01-01

    We study the training and generalization for multi-variate time series processing. It is suggested to use a quasi-maximum likelihood approach rather than the standard sum of squared errors, thus taking dependencies among the errors of the individual time series into account. This may lead...... to improved generalization performance. Further, we extend the optimal brain damage pruning technique to the multi-variate case. A key ingredient is an algebraic expression for the generalization ability of a multi-variate model. The viability of the suggested techniques is successfully demonstrated...

  1. Robust methods for multivariate data analysis

    DEFF Research Database (Denmark)

    Frosch, Stina; Von Frese, J.; Bro, Rasmus

    2005-01-01

    Outliers may hamper proper classical multivariate analysis, and lead to incorrect conclusions. To remedy the problem of outliers, robust methods are developed in statistics and chemometrics. Robust methods reduce or remove the effect of outlying data points and allow the "good" data to primarily...... determine the result. This article reviews the most commonly used robust multivariate regression and exploratory methods that have appeared since 1996 in the field of chemometrics. Special emphasis is put on the robust versions of chemometric standard tools like PCA and PLS and the corresponding robust...
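
    The basic idea behind robust estimation, that a single outlying point should not dominate the result, can be illustrated in one dimension with the median and the median absolute deviation (MAD). This is a deliberately simple sketch with made-up numbers; it does not reproduce the robust PCA or PLS variants reviewed in the article, and the helper name `mad` is an assumption.

```python
import statistics


def mad(values):
    """Median absolute deviation: a robust counterpart of the standard deviation."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


# Hypothetical measurements, and the same set with one gross outlier added.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]
contaminated = clean + [100.0]

classical_mean = statistics.mean(contaminated)   # pulled far away by the outlier
robust_center = statistics.median(contaminated)  # stays close to the bulk of the data
classical_spread = statistics.stdev(contaminated)
robust_spread = mad(contaminated)

print(classical_mean, robust_center)
print(classical_spread, robust_spread)
```

    With the outlier included, the mean moves from 10 to about 25 while the median stays near 10, and the standard deviation inflates while the MAD barely changes.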

  2. Search for top squark Production at the LHC at $\sqrt{\text{s}}=13$ TeV with the ATLAS Detector Using Multivariate Analysis Techniques

    CERN Document Server

    AUTHOR|(CDS)2230945; Köhler, Nicolas Maximilian; Junggeburth, Johannes Josef

    Supersymmetry is a very promising extension of the Standard Model. It predicts new heavy particles, which are currently searched for in the ATLAS experiment at the Large Hadron Collider at a center-of-mass energy of 13 TeV. So far, all searches for supersymmetric particles use a cut-based signal selection. In this thesis, the use of multivariate selection techniques, Boosted Decision Trees and Artificial Neural Networks, is explored for the search for top squarks, the supersymmetric partner of the top quark. The multivariate methods increase the expected lower limit in the mass of top squarks by approximately 90 GeV from currently 990 GeV for small neutralino masses.

  3. Intelligent Prediction of Soccer Technical Skill on Youth Soccer Player's Relative Performance Using Multivariate Analysis and Artificial Neural Network Techniques

    OpenAIRE

    Abdullah, M. R; Maliki, A. B. H. M; Musa, R. M; Kosni, N. A; Juahir, H

    2016-01-01

    This study aims to predict the potential pattern of soccer technical skill in the relative performance of Malaysian youth soccer players using multivariate analysis and artificial neural network techniques. 184 male youth soccer players recruited from a Malaysian soccer academy (average age = 15.2±2.0) underwent physical fitness, anthropometric, maturity, motivation and soccer-related skill level tests. Unsupervised pattern recognition of principal component analysis (PCA) was used to ident...

  4. Analysis of preservative-treated wood by multivariate analysis of laser-induced breakdown spectroscopy spectra

    International Nuclear Information System (INIS)

    Martin, Madhavi Z.; Labbe, Nicole; Rials, Timothy G.; Wullschleger, Stan D.

    2005-01-01

    In this work, multivariate statistical analysis (MVA) techniques are coupled with laser-induced breakdown spectroscopy (LIBS) to identify preservative types (chromated copper arsenate, ammoniacal copper zinc or alkaline copper quat), and to predict elemental content in preservative-treated wood. The elemental composition of the samples was measured with a standard laboratory method of digestion followed by atomic absorption spectroscopy analysis. The elemental composition was then correlated with the LIBS spectra using projection to latent structures (PLS) models. The correlations for the different elements introduced by different treatments were very strong, with the correlation coefficients generally above 0.9. Additionally, principal component analysis (PCA) was used to differentiate the samples treated with different preservative formulations. The research has focused not only on demonstrating the application of LIBS as a tool for use in the forest products industry, but also considered sampling errors, limits of detection, reproducibility, and accuracy of measurements as they relate to multivariate analysis of this complex wood substrate

  5. Analysis of preservative-treated wood by multivariate analysis of laser-induced breakdown spectroscopy spectra

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Madhavi Z. [Environmental Sciences Division Oak Ridge National Laboratory, P.O. Box 2008 MS 6422, Oak Ridge TN 37831-6422 (United States); Labbe, Nicole [Forest Products Center, University of Tennessee, 2506 Jacob Drive, Knoxville, TN 37996-4570 (United States)]. E-mail: nlabbe@utk.edu; Rials, Timothy G. [Forest Products Center, University of Tennessee, 2506 Jacob Drive, Knoxville, TN 37996-4570 (United States); Wullschleger, Stan D. [Environmental Sciences Division Oak Ridge National Laboratory, P.O. Box 2008 MS 6422, Oak Ridge TN 37831-6422 (United States)

    2005-08-31

    In this work, multivariate statistical analysis (MVA) techniques are coupled with laser-induced breakdown spectroscopy (LIBS) to identify preservative types (chromated copper arsenate, ammoniacal copper zinc or alkaline copper quat), and to predict elemental content in preservative-treated wood. The elemental composition of the samples was measured with a standard laboratory method of digestion followed by atomic absorption spectroscopy analysis. The elemental composition was then correlated with the LIBS spectra using projection to latent structures (PLS) models. The correlations for the different elements introduced by different treatments were very strong, with the correlation coefficients generally above 0.9. Additionally, principal component analysis (PCA) was used to differentiate the samples treated with different preservative formulations. The research has focused not only on demonstrating the application of LIBS as a tool for use in the forest products industry, but also considered sampling errors, limits of detection, reproducibility, and accuracy of measurements as they relate to multivariate analysis of this complex wood substrate.

  6. Estimating an Effect Size in One-Way Multivariate Analysis of Variance (MANOVA)

    Science.gov (United States)

    Steyn, H. S., Jr.; Ellis, S. M.

    2009-01-01

    When two or more univariate population means are compared, the proportion of variation in the dependent variable accounted for by population group membership is eta-squared. This effect size can be generalized by using multivariate measures of association, based on the multivariate analysis of variance (MANOVA) statistics, to establish whether…
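
    For the univariate case described here, eta-squared is the between-group sum of squares divided by the total sum of squares. The following sketch computes it for a one-way layout; the function name `eta_squared` and the example data are illustrative assumptions, and the multivariate generalization discussed in the article is not covered.

```python
def eta_squared(groups):
    """Proportion of total variation in the dependent variable that is
    accounted for by group membership (SS_between / SS_total)."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Three hypothetical groups with clearly separated means.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
effect = eta_squared(groups)
print(round(effect, 3))  # 0.9
```

    Here the total sum of squares is 60 and the between-group sum of squares is 54, so 90% of the variation is attributable to group membership.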

  7. The studies of post-medieval glass by multivariate and X-ray fluorescence analysis

    International Nuclear Information System (INIS)

    Kierzek, J.; Kunicki-Goldfinger, J.

    2002-01-01

    Multivariate statistical analysis of the results obtained by energy dispersive X-ray fluorescence analysis has been used in the study of baroque vessel glasses originating from central Europe. X-ray spectrometry can be applied as a completely non-destructive, non-sampling and multi-element method, which makes it very useful in studies of valuable historical artefacts. In recent years, multivariate statistical analysis has developed into an important tool for archaeometric purposes. Cluster, principal component and discriminant analysis were applied to classify the examined objects. The results show that these statistical tools are very useful and complementary in studies of historical objects. (author)

  8. Multivariate Regression of Liver on Intestine of Mice: A ...

    African Journals Online (AJOL)

    Multivariate Regression of Liver on Intestine of Mice: A Chemotherapeutic Evaluation of Plant ... Using an analysis of covariance model, the effects ... The findings revealed, with the aid of likelihood-ratio statistic, a marked improvement in

  9. Analysis of Surface Water Pollution in the Kinta River Using Multivariate Technique

    International Nuclear Information System (INIS)

    Hamza Ahmad Isiyaka; Hafizan Juahir

    2015-01-01

    This study aims to investigate the spatial variation in the characteristics of water quality monitoring sites, identify the most significant parameters and the major possible sources of pollution, and apportion the source categories in the Kinta River. Thirty-one parameters collected from eight monitoring sites over eight years (2006-2013) were used. The eight monitoring stations were spatially grouped into three independent clusters in a dendrogram. Forward stepwise and backward stepwise discriminant analysis (DA) drastically reduced the number of monitored parameters from 31 to eight and nine significant parameters (P<0.05). Principal component analysis (PCA) accounted for more than 76 % of the total variance and attributed the sources of pollution to anthropogenic and natural processes. Source apportionment using combined multiple linear regression and principal component scores indicates that 41 % of the total pollution load comes from rock weathering and untreated waste water, 26 % from waste discharge, 24 % from surface runoff and 7 % from faecal waste. This study proposes reducing the number of monitoring stations and parameters to make monitoring more cost- and time-effective, and shows that multivariate techniques can provide a simple representation of complex and dynamic water quality characteristics. (author)

  10. Multivariate phase type distributions - Applications and parameter estimation

    DEFF Research Database (Denmark)

    Meisch, David

    The best known univariate probability distribution is the normal distribution. It is used throughout the literature in a broad field of applications. In cases where it is not sensible to use the normal distribution alternative distributions are at hand and well understood, many of these belonging...... and statistical inference, is the multivariate normal distribution. Unfortunately only little is known about the general class of multivariate phase type distribution. Considering the results concerning parameter estimation and inference theory of univariate phase type distributions, the class of multivariate...... projects and depend on reliable cost estimates. The Successive Principle is a group analysis method primarily used for analyzing medium to large projects in relation to cost or duration. We believe that the mathematical modeling used in the Successive Principle can be improved. We suggested a novel...

  11. Statistical identification of effective input variables

    International Nuclear Information System (INIS)

    Vaurio, J.K.

    1982-09-01

    A statistical sensitivity analysis procedure has been developed for ranking the input data of large computer codes in order of sensitivity-importance. The method is economical for large codes with many input variables, since it uses a relatively small number of computer runs. No prior judgemental elimination of input variables is needed. The screening method is based on stagewise correlation and extensive regression analysis of output values calculated with selected input value combinations. The regression process deals with multivariate nonlinear functions, and statistical tests are also available for identifying input variables that contribute to threshold effects, i.e., discontinuities in the output variables. A computer code, SCREEN, has been developed for implementing the screening techniques. The efficiency has been demonstrated by several examples and applied to a fast reactor safety analysis code (Venus-II). However, the methods and the coding are general and not limited to such applications.

  12. Real-time monitoring of a coffee roasting process with near infrared spectroscopy using multivariate statistical analysis: A feasibility study.

    Science.gov (United States)

    Catelani, Tiago A; Santos, João Rodrigo; Páscoa, Ricardo N M J; Pezza, Leonardo; Pezza, Helena R; Lopes, João A

    2018-03-01

    This work proposes the use of near infrared (NIR) spectroscopy in diffuse reflectance mode and multivariate statistical process control (MSPC) based on principal component analysis (PCA) for real-time monitoring of the coffee roasting process. The main objective was to develop an MSPC methodology able to detect disturbances to the roasting process early by means of real-time acquisition of NIR spectra. A total of fifteen roasting batches were defined according to an experimental design to develop the MSPC models. This methodology was tested on a set of five batches in which disturbances of different kinds were imposed to simulate real faulty situations. Some of these batches were used to optimize the model, while the remaining ones were used to test the methodology. A modelling strategy based on a sliding time window provided the best results in terms of distinguishing batches with and without disturbances, using typical MSPC charts: Hotelling's T² and squared prediction error statistics. A PCA model encompassing a time window of four minutes with three principal components was able to efficiently detect all disturbances assayed. NIR spectroscopy combined with the MSPC approach proved to be an adequate auxiliary tool for coffee roasters to detect faults in a conventional roasting process in real time. Copyright © 2017 Elsevier B.V. All rights reserved.
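
    As an illustration of the two MSPC statistics named in this abstract, the following is a minimal sketch of Hotelling's T² and the squared prediction error (SPE) computed from a PCA model. It is not the authors' implementation: the function names, the synthetic six-variable data and the injected disturbance are assumptions made only for the example, and no sliding time window or control limits are included.

```python
import numpy as np

def fit_pca_monitor(X_ref, n_components):
    """Fit a PCA model on reference (in-control) data; rows are observations."""
    mean = X_ref.mean(axis=0)
    Xc = X_ref - mean
    # SVD of the centred data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                                   # loadings (variables x components)
    score_var = s[:n_components] ** 2 / (X_ref.shape[0] - 1)  # variance of each score
    return mean, P, score_var

def t2_and_spe(x, mean, P, score_var):
    """Return (Hotelling's T^2, squared prediction error) for one observation."""
    xc = x - mean
    t = P.T @ xc                            # scores in the PCA subspace
    t2 = float(np.sum(t ** 2 / score_var))  # distance inside the model, scaled by score variance
    residual = xc - P @ t                   # part of x the model cannot reconstruct
    spe = float(residual @ residual)
    return t2, spe

# Hypothetical in-control data: 50 observations of 6 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 6))
X_ref = latent @ mixing + 0.01 * rng.normal(size=(50, 6))

mean, P, score_var = fit_pca_monitor(X_ref, n_components=2)

normal_obs = X_ref[0]
faulty_obs = X_ref[0] + np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0])  # assumed disturbance

t2_ok, spe_ok = t2_and_spe(normal_obs, mean, P, score_var)
t2_bad, spe_bad = t2_and_spe(faulty_obs, mean, P, score_var)
print(f"normal: T2={t2_ok:.2f} SPE={spe_ok:.5f}")
print(f"faulty: T2={t2_bad:.2f} SPE={spe_bad:.5f}")
```

    In this sketch a disturbance that breaks the correlation structure of the reference data shows up mainly in the SPE, while a shift along the modelled directions would show up mainly in T².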

  13. Notices about using elementary statistics in psychology

    OpenAIRE

    松田, 文子; 三宅, 幹子; 橋本, 優花里; 山崎, 理央; 森田, 愛子; 小嶋, 佳子

    2003-01-01

    Improper uses of elementary statistics that were often observed in beginners' manuscripts and papers were collected, and better ways were suggested. This paper consists of three parts: descriptive statistics, multivariate analyses, and statistical tests.

  14. Study on sources of colored glaze of Xiyue Temple in Shanxi province by INAA and multivariable statistical analysis

    International Nuclear Information System (INIS)

    Cheng Lin; Feng Songlin

    2005-01-01

    The major, minor and trace elements in the bodies of ancient colored glazes, which came from the site of Xiyue Temple and Lidipo kiln in Shanxi province and were unearthed from the strata of the Song, Yuan, Ming, Early Qing and Late Qing dynasties, were analyzed by instrumental neutron activation analysis (INAA). The results of multivariable statistical analyses show that the chemical compositions of the colored glaze bodies are steady from the Song to the Early Qing dynasty, but distinctly different from those in the Late Qing. Probably, the sources of fired material of ancient colored glaze from the Song to the Early Qing came from the site of Xiyue Temple. The chemical compositions of three pieces of colored glaze from the Ming dynasty and of those from the Late Qing are similar to that of Lidipo kiln. From this, the authors conclude that the materials of the ancient colored glazes of Xiyue Temple in the Late Qing dynasty were fired in Lidipo kiln. (authors)

  15. [Methods of the multivariate statistical analysis of so-called polyetiological diseases using the example of coronary heart disease].

    Science.gov (United States)

    Lifshits, A M

    1979-01-01

    General characteristics of multivariate statistical analysis (MSA) are given. Methodical premises and criteria for the selection of an adequate MSA method applicable to pathoanatomic investigations of the epidemiology of multicausal diseases are presented. The experience of using MSA with computers and standard computing programs in studies of coronary artery atherosclerosis on the materials of 2060 autopsies is described. The combined use of 4 MSA methods (sequential, correlational, regressional, and discriminant) made it possible to quantify the contribution of each of the 8 examined risk factors to the development of atherosclerosis. The most important factors were found to be age, arterial hypertension, and heredity. Occupational hypodynamia and increased fatness were more important in men, whereas diabetes mellitus was more important in women. The registration of this combination of risk factors by MSA methods provides a more reliable prognosis of the likelihood of coronary heart disease with a fatal outcome than prognosis of the degree of coronary atherosclerosis.

  16. Inferring the origin of rare fruit distillates from compositional data using multivariate statistical analyses and the identification of new flavour constituents.

    Science.gov (United States)

    Mihajilov-Krstev, Tatjana M; Denić, Marija S; Zlatković, Bojan K; Stankov-Jovanović, Vesna P; Mitić, Violeta D; Stojanović, Gordana S; Radulović, Niko S

    2015-04-01

    In Serbia, delicatessen fruit alcoholic drinks are produced from autochthonous fruit-bearing species such as cornelian cherry, blackberry, elderberry, wild strawberry, European wild apple, European blueberry and blackthorn fruits. There are no chemical data on many of these and herein we analysed volatile minor constituents of these rare fruit distillates. Our second goal was to determine possible chemical markers of these distillates through a statistical/multivariate treatment of the herein obtained and previously reported data. Detailed chemical analyses revealed a complex volatile profile of all studied fruit distillates with 371 identified compounds. A number of constituents were recognised as marker compounds for a particular distillate. Moreover, 33 of them represent newly detected flavour constituents in alcoholic beverages or, in general, in foodstuffs. With the aid of multivariate analyses, these volatile profiles were successfully exploited to infer the origin of raw materials used in the production of these spirits. It was also shown that all fruit distillates possessed weak antimicrobial properties. It seems that the aroma of these highly esteemed wild-fruit spirits depends on the subtle balance of various minor volatile compounds, whereby some of them are specific to a certain type of fruit distillate and enable their mutual distinction. © 2014 Society of Chemical Industry.

  17. Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies

    Directory of Open Access Journals (Sweden)

    Qiong Yang

    2012-01-01

    Multivariate phenotypes are frequently encountered in genetic association studies. The purpose of analyzing multivariate phenotypes usually includes discovery of novel genetic variants with pleiotropic effects, that is, affecting multiple phenotypes, and the ultimate goal of uncovering the underlying genetic mechanism. In recent years, new methods have been developed and existing statistical methods have been applied to such phenotypes. In this paper, we provide a review of the available methods for analyzing association between a single marker and a multivariate phenotype consisting of the same type of components (e.g., all continuous or all categorical) or different types of components (e.g., some are continuous and others are categorical). We also review causal inference methods designed to test whether the detected association with the multivariate phenotype is truly pleiotropy or whether the genetic marker exerts its effects on some phenotypes through affecting the others.

  18. Multivariate analysis of eigenvalues and eigenvectors in tensor based morphometry

    Science.gov (United States)

    Rajagopalan, Vidya; Schwartzman, Armin; Hua, Xue; Leow, Alex; Thompson, Paul; Lepore, Natasha

    2015-01-01

    We develop a new algorithm to compute voxel-wise shape differences in tensor-based morphometry (TBM). As in standard TBM, we non-linearly register brain T1-weighted MRI data from a patient and control group to a template, and compute the Jacobian of the deformation fields. In standard TBM, the determinants of the Jacobian matrix at each voxel are statistically compared between the two groups. More recently, a multivariate extension of the statistical analysis involving the deformation tensors derived from the Jacobian matrices has been shown to improve statistical detection power. However, multivariate methods comprising large numbers of variables are computationally intensive and may be subject to noise. In addition, the anatomical interpretation of results is sometimes difficult. Here instead, we analyze the eigenvalues and the eigenvectors of the Jacobian matrices. Our method is validated on brain MRI data from Alzheimer's patients and healthy elderly controls from the Alzheimer's Disease Neuro Imaging Database.

  19. A MULTIVARIATE ANALYSIS OF CROATIAN COUNTIES ENTREPRENEURSHIP

    Directory of Open Access Journals (Sweden)

    Elza Jurun

    2012-12-01

    The focus of this paper is a multivariate analysis of entrepreneurship in Croatian counties. The complete database made available by official statistical institutions at the national and regional level is used. Modern econometric methodology, ranging from comparative analysis via multiple regression to multivariate cluster analysis, is applied, together with an analysis of successful or inefficacious entrepreneurship measured by indicators of efficiency, profitability and productivity. The time horizons of the comparative analysis are 2004 and 2010. Accelerators of socio-economic development (number of entrepreneur investors, investment in fixed assets and current assets ratio) are analytically filtered in the multiple regression model from twenty-six independent variables as the variables with the dominant influence on GDP per capita in 2010 as the dependent variable. Results of the multivariate cluster analysis of twenty-one Croatian counties are also interpreted in terms of the three Croatian NUTS 2 regions according to the European nomenclature of regional territorial division of Croatia.

  20. Modelling the Covariance Structure in Marginal Multivariate Count Models

    DEFF Research Database (Denmark)

    Bonat, W. H.; Olivero, J.; Grande-Vega, M.

    2017-01-01

    The main goal of this article is to present a flexible statistical modelling framework to deal with multivariate count data along with longitudinal and repeated measures structures. The covariance structure for each response variable is defined in terms of a covariance link function combined...... be used to indicate whether there was statistical evidence of a decline in blue duikers and other species hunted during the study period. Determining whether observed drops in the number of animals hunted are indeed true is crucial to assess whether species depletion effects are taking place in exploited...... with a matrix linear predictor involving known matrices. In order to specify the joint covariance matrix for the multivariate response vector, the generalized Kronecker product is employed. We take into account the count nature of the data by means of the power dispersion function associated with the Poisson...

  1. Multivariate Methods for Meta-Analysis of Genetic Association Studies.

    Science.gov (United States)

    Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G

    2018-01-01

    Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we briefly present univariate methods for meta-analysis and then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg equilibrium and gene-environment interactions are also presented. All available methods are illustrated with practical applications, and methodologies that could be developed in the future are discussed. Links to all available software implementing multivariate meta-analysis methods are also provided.

  2. A new multivariate zero-adjusted Poisson model with applications to biomedicine.

    Science.gov (United States)

    Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

    2018-05-25

    Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) The multivariate Poisson log-normal model (Aitchison and Ho, ) cannot be used to fit multivariate count data with excess zero-vectors; (ii) The multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and is difficult to apply to high-dimensional cases; (iii) The Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure for random components that are all positive or all negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others are negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Assessment of statistical agreement of three techniques for the study of cut marks: 3D digital microscope, laser scanning confocal microscopy and micro-photogrammetry.

    Science.gov (United States)

    Maté-González, Miguel Ángel; Aramendi, Julia; Yravedra, José; Blasco, Ruth; Rosell, Jordi; González-Aguilera, Diego; Domínguez-Rodrigo, Manuel

    2017-09-01

    In the last few years, the study of cut marks on bone surfaces has become fundamental for the interpretation of prehistoric butchery practices. Due to the difficulties in the correct identification of cut marks, many criteria for their description and classification have been suggested. Different techniques, such as three-dimensional digital microscope (3D DM), laser scanning confocal microscopy (LSCM) and micro-photogrammetry (M-PG) have been recently applied to the study of cut marks. Although the 3D DM and LSCM microscopic techniques are the most commonly used for the 3D identification of cut marks, M-PG has also proved to be very efficient and a low-cost method. M-PG is a noninvasive technique that allows the study of the cortical surface without any previous preparation of the samples, and that generates high-resolution models. Despite the current application of microscopic and micro-photogrammetric techniques to taphonomy, their reliability has never been tested. In this paper, we compare 3D DM, LSCM and M-PG in order to assess their resolution and results. In this study, we analyse 26 experimental cut marks generated with a metal knife. The quantitative and qualitative information registered is analysed by means of standard multivariate statistics and geometric morphometrics to assess the similarities and differences obtained with the different methodologies. © 2017 The Authors Journal of Microscopy © 2017 Royal Microscopical Society.

  4. Statistical Techniques For Real-time Anomaly Detection Using Spark Over Multi-source VMware Performance Data

    Energy Technology Data Exchange (ETDEWEB)

    Solaimani, Mohiuddin [Univ. of Texas-Dallas, Richardson, TX (United States); Iftekhar, Mohammed [Univ. of Texas-Dallas, Richardson, TX (United States); Khan, Latifur [Univ. of Texas-Dallas, Richardson, TX (United States); Thuraisingham, Bhavani [Univ. of Texas-Dallas, Richardson, TX (United States); Ingram, Joey Burton [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    Anomaly detection refers to the identification of an irregular or unusual pattern which deviates from what is standard, normal, or expected. Such deviated patterns typically correspond to samples of interest and are assigned different labels in different domains, such as outliers, anomalies, exceptions, or malware. Detecting anomalies in fast, voluminous streams of data is a formidable challenge. This paper presents a novel, generic, real-time distributed anomaly detection framework for heterogeneous streaming data where anomalies appear as a group. We have developed a distributed statistical approach to build a model and later use it to detect anomalies. As a case study, we investigate group anomaly detection for a VMware-based cloud data center, which maintains a large number of virtual machines (VMs). We have built our framework using Apache Spark to get higher throughput and lower data processing time on streaming data. We have developed a window-based statistical anomaly detection technique to detect anomalies that appear sporadically. We then relaxed this constraint with higher accuracy by implementing a cluster-based technique to detect sporadic and continuous anomalies. We conclude that our cluster-based technique outperforms other statistical techniques with higher accuracy and lower processing time.

  5. Authigenic oxide Neodymium Isotopic composition as a proxy of seawater: applying multivariate statistical analyses.

    Science.gov (United States)

    McKinley, C. C.; Scudder, R.; Thomas, D. J.

    2016-12-01

    The Neodymium Isotopic composition (Nd IC) of oxide coatings has been applied as a tracer of water mass composition and used to address fundamental questions about past ocean conditions. The leached authigenic oxide coating from marine sediment is widely assumed to reflect the dissolved trace metal composition of the bottom water interacting with sediment at the seafloor. However, recent studies have shown that readily reducible sediment components, in addition to trace metal fluxes from the pore water, are incorporated into the bottom water, influencing the trace metal composition of leached oxide coatings. This challenges the prevailing application of the authigenic oxide Nd IC as a proxy of seawater composition. Therefore, it is important to identify the component end-members that create sediments of different lithology and determine if, or how they might contribute to the Nd IC of oxide coatings. To investigate lithologic influence on the results of sequential leaching, we selected two sites with complete bulk sediment statistical characterization. Site U1370 in the South Pacific Gyre, is predominantly composed of Rhyolite ( 60%) and has a distinguishable ( 10%) Fe-Mn Oxyhydroxide component (Dunlea et al., 2015). Site 1149 near the Izu-Bonin-Arc is predominantly composed of dispersed ash ( 20-50%) and eolian dust from Asia ( 50-80%) (Scudder et al., 2014). We perform a two-step leaching procedure: a 14 mL of 0.02 M hydroxylamine hydrochloride (HH) in 20% acetic acid buffered to a pH 4 for one hour, targeting metals bound to Fe- and Mn- oxides fractions, and a second HH leach for 12 hours, designed to remove any remaining oxides from the residual component. We analyze all three resulting fractions for a large suite of major, trace and rare earth elements, a sub-set of the samples are also analyzed for Nd IC. 
We use multivariate statistical analyses of the resulting geochemical data to identify how each component of the sediment partitions across the sequential

  6. Multivariate Welch t-test on distances

    OpenAIRE

    Alekseyenko, Alexander V.

    2016-01-01

    Motivation: Permutational non-Euclidean analysis of variance, PERMANOVA, is routinely used in exploratory analysis of multivariate datasets to draw conclusions about the significance of patterns visualized through dimension reduction. This method recognizes that pairwise distance matrix between observations is sufficient to compute within and between group sums of squares necessary to form the (pseudo) F statistic. Moreover, not only Euclidean, but arbitrary distances can be used. This method...
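
    The statement that a pairwise distance matrix is sufficient to form the pseudo F statistic can be sketched as follows. This is a minimal illustration of the standard PERMANOVA sums of squares, not the Welch-type test proposed in the record; the function name and the toy one-dimensional data are assumptions for the example, and the permutation step that yields a p-value is omitted.

```python
from itertools import combinations

def pseudo_f(dist, labels):
    """PERMANOVA pseudo F computed only from a pairwise distance matrix.

    dist   : square list-of-lists, dist[i][j] = distance between observations i and j
    labels : group label of each observation
    """
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: sum of squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity inside each group, divided by group size.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))

# Toy data (assumed): two groups of three points on a line, Euclidean distance.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(p - q) for q in points] for p in points]

f_stat = pseudo_f(dist, labels)
print(f_stat)
```

    With Euclidean distances on a single variable, as in this toy case, the value coincides with the classical one-way ANOVA F statistic; any other distance can be substituted without changing the function.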

  7. Multivariate-Statistical Assessment of Heavy Metals for Agricultural Soils in Northern China

    OpenAIRE

    Yang, Pingguo; Yang, Miao; Mao, Renzhao; Shao, Hongbo

    2014-01-01

    The study evaluated eight heavy metals content and soil pollution from agricultural soils in northern China. Multivariate and geostatistical analysis approaches were used to determine the anthropogenic and natural contribution of soil heavy metal concentrations. Single pollution index and integrated pollution index could be used to evaluate soil heavy metal risk. The results show that the first factor explains 27.3% of the eight soil heavy metals with strong positive loadings on Cu, Zn, and C...

  8. Study of groundwater arsenic pollution in Lanyang Plain using multivariate statistical analysis

    Science.gov (United States)

    Chan, S.

    2013-12-01

    The study area, Lanyang Plain in eastern Taiwan, has highly developed agriculture and aquaculture, which consume over 70% of the water supplies. Groundwater is frequently considered as an alternative water source. However, the serious arsenic pollution of groundwater in Lanyang Plain should be well studied to ensure the safety of groundwater usage. In this study, 39 groundwater samples were collected. The hydrochemistry results show two major trends in the Piper diagram. The major trend, which includes most of the groundwater samples, has a water type between Ca+Mg-HCO3 and Na+K-HCO3. This can be explained by a cation exchange reaction. The minor trend clearly corresponds to seawater intrusion, with a water type of Na+K-Cl, because the localities of these samples are all in the coastal area. Multivariate statistical analysis of the hydrochemical data was conducted to further explore the mechanism of arsenic contamination. Two major factors can be extracted with factor analysis. The major factor includes Ca, Mg and Sr, while the minor factor includes Na, K and As. This reconfirms that the cation exchange reaction mainly controls the groundwater hydrochemistry in the study area. It is worth noting that arsenic is positively related to Na and K. The result of cluster analysis shows that groundwater samples with high arsenic concentration can be grouped with those with high Na, K and HCO3. This supports the view that cation exchange enhances the release of arsenic, and excludes the effect of seawater intrusion. In other words, the water-rock reaction time is key to obtaining higher arsenic content. In general, the major sources of arsenic in sediments include exchangeable, reducible and oxidizable phases, which are adsorbed ions, Fe-Mn oxides and organic matter/pyrite, respectively. However, the results of factor analysis do not show an apparent correlation between arsenic and Fe/Mn. This may exclude Fe-Mn oxides as a major source of arsenic. The other sources

  9. RELIABILITY OF CERTAIN TESTS FOR EVALUATION OF JUDO TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Slavko Obadov

    2007-05-01

    The sample included 106 judokas. Assessment of the level of mastery of judo techniques was carried out by evaluation of five competent studies. Each subject performed a technique three times and each performance was evaluated by the judges. In order to evaluate the measurement of each technique, Cronbach's coefficient of reliability was calculated. During the procedure the subjects' results were also transformed to factor scores, i.e. the results of each performer on the main component of evaluation in the five studies. These factor scores could be used in the subsequent procedure of multivariate statistical analysis.
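
    For reference, Cronbach's coefficient of reliability mentioned in this abstract can be computed as in the following minimal sketch. The function name and the small rating tables are hypothetical illustrations, not data from the study; each row is one performer and each column one evaluator.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """Cronbach's alpha for a table of ratings.

    ratings: list of rows, one per subject; each row holds one score per rater (item).
    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    """
    k = len(ratings[0])
    item_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: 4 performers rated by 5 evaluators.
consistent = [
    [3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4],
    [2, 2, 2, 2, 2],
    [5, 5, 5, 5, 5],
]
noisy = [
    [3, 4, 2, 3, 5],
    [4, 2, 5, 3, 3],
    [2, 3, 3, 4, 2],
    [5, 3, 4, 2, 4],
]
alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
print(alpha_consistent, alpha_noisy)
```

    When all evaluators agree perfectly the coefficient reaches 1; disagreement between evaluators lowers it.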

  10. Statistical methods to monitor the West Valley off-gas system

    International Nuclear Information System (INIS)

    Eggett, D.L.

    1990-01-01

    This paper reports on the off-gas system for the ceramic melter operated at the West Valley Demonstration Project at West Valley, NY, which is monitored during melter operation. A one-at-a-time method of monitoring the parameters of the off-gas system is not statistically sound. Therefore, multivariate statistical methods appropriate for the monitoring of many correlated parameters will be used. Monitoring a large number of parameters increases the probability of a false out-of-control signal. If the parameters being monitored are statistically independent, the control limits can be easily adjusted to obtain the desired probability of a false out-of-control signal. The principal component (PC) scores have desirable statistical properties when the original variables are distributed as multivariate normals. Two statistics derived from the PC scores and used to form multivariate control charts are outlined and their distributional properties reviewed.
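
    The point about false out-of-control signals for independent parameters can be made concrete with a short calculation. This is a sketch of the usual adjustment for independent charts, assuming each parameter is monitored with the same per-chart false-alarm probability; the function names and the numbers are illustrative and do not come from the paper.

```python
def overall_false_alarm(alpha_each, k):
    """Probability that at least one of k independent charts signals falsely."""
    return 1 - (1 - alpha_each) ** k

def per_parameter_alpha(alpha_overall, k):
    """Per-chart false-alarm probability that gives the desired overall probability."""
    return 1 - (1 - alpha_overall) ** (1 / k)

k = 20                                        # assumed number of monitored parameters
unadjusted = overall_false_alarm(0.05, k)     # each chart run at 5 %
adjusted = per_parameter_alpha(0.05, k)       # per-chart level for 5 % overall
print(f"overall false-alarm probability with 5% per chart: {unadjusted:.3f}")
print(f"per-chart level needed for 5% overall: {adjusted:.5f}")
```

    This adjustment holds only under independence; for correlated parameters, which is the situation the abstract addresses, statistics built on principal component scores are used instead.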

  11. Boosting Higgs pair production in the [Formula: see text] final state with multivariate techniques.

    Science.gov (United States)

    Behr, J Katharina; Bortoletto, Daniela; Frost, James A; Hartland, Nathan P; Issever, Cigdem; Rojo, Juan

    2016-01-01

    The measurement of Higgs pair production will be a cornerstone of the LHC program in the coming years. Double Higgs production provides a crucial window upon the mechanism of electroweak symmetry breaking and has a unique sensitivity to the Higgs trilinear coupling. We study the feasibility of a measurement of Higgs pair production in the [Formula: see text] final state at the LHC. Our analysis is based on a combination of traditional cut-based methods with state-of-the-art multivariate techniques. We account for all relevant backgrounds, including the contributions from light and charm jet mis-identification, which are ultimately comparable in size to the irreducible 4 b QCD background. We demonstrate the robustness of our analysis strategy in a high pileup environment. For an integrated luminosity of [Formula: see text] ab[Formula: see text], a signal significance of [Formula: see text] is obtained, indicating that the [Formula: see text] final state alone could allow for the observation of double Higgs production at the High Luminosity LHC.

  12. A Range-Based Multivariate Model for Exchange Rate Volatility

    OpenAIRE

    Tims, Ben; Mahieu, Ronald

    2003-01-01

    In this paper we present a parsimonious multivariate model for exchange rate volatilities based on logarithmic high-low ranges of daily exchange rates. The multivariate stochastic volatility model divides the log range of each exchange rate into two independent latent factors, which are interpreted as the underlying currency specific components. Due to the normality of logarithmic volatilities the model can be estimated conveniently with standard Kalman filter techniques. Our resu...

  13. Multivariate anomaly detection for Earth observations: a comparison of algorithms and feature extraction techniques

    Directory of Open Access Journals (Sweden)

    M. Flach

    2017-08-01

    Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations like sudden changes in basic characteristics of time series such as the sample mean, the variance, changes in the cycle amplitude, and trends. This artificial experiment is needed as there is no gold standard for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbors mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. Our results therefore provide an effective workflow to
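
    A workflow of the kind described here, a feature extraction step followed by a detection algorithm, can be sketched in a few lines. The example below subtracts the mean seasonal cycle and then scores each time step by its mean distance to its k nearest neighbours. It is a simplified illustration under assumed settings (two synthetic variables, a period of 4, k = 3, one injected anomaly) and not the implementation evaluated in the study.

```python
import math

def subtract_seasonal_cycle(series, period):
    """Subtract the mean value of each phase of the cycle from the series."""
    phase_means = []
    for phase in range(period):
        values = series[phase::period]
        phase_means.append(sum(values) / len(values))
    return [x - phase_means[i % period] for i, x in enumerate(series)]

def knn_mean_distance(points, k):
    """Anomaly score: mean Euclidean distance to the k nearest other points."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Synthetic two-variable stream (assumed): 5 repetitions of a 4-step cycle.
period = 4
var_a = [[1, 2, 3, 2][i % period] for i in range(20)]
var_b = [[0, 5, 0, -5][i % period] for i in range(20)]
var_a[10] += 5  # injected anomaly at time step 10

# Feature extraction: remove the seasonal cycle from each variable.
res_a = subtract_seasonal_cycle(var_a, period)
res_b = subtract_seasonal_cycle(var_b, period)

# Detection: score every time step in the space of the deseasonalised variables.
scores = knn_mean_distance(list(zip(res_a, res_b)), k=3)
most_anomalous = scores.index(max(scores))
print(most_anomalous, max(scores))
```

    Without the seasonal subtraction, the regular swings of the second variable would dominate the distances, which illustrates why the feature extraction step matters.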

  14. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing.

    Science.gov (United States)

    Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel

    2015-01-01

    The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors of hearing loss is still debated. Most studies agree that multivariate analyses have better test performance than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure-tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use logistic regression as the multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, proving the performance of the otoacoustic emissions. Each otoacoustic emission test presented high
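
    The logistic score and the area under the receiver operating characteristic curve mentioned in this abstract can be illustrated with a short sketch. The subjects, the two predictors (age in decades, emission amplitude) and the coefficients below are invented for illustration and are not taken from the study; in practice the coefficients would be estimated by fitting the logistic regression.

```python
import math

def logistic_score(features, coefficients, intercept):
    """Weighted combination of the inputs passed through the logistic function."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))

def roc_auc(scores, labels):
    """Area under the ROC curve: probability that a random positive case
    scores higher than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical subjects: (age in decades, emission amplitude); label 1 = hearing loss.
subjects = [(3.0, 12.0), (4.5, 10.0), (6.0, 2.0), (7.0, 4.0), (7.0, 3.5), (6.5, 1.0)]
labels = [0, 0, 1, 1, 0, 1]
coefficients = (0.5, -0.4)   # assumed values, not fitted to real data
intercept = 0.0

scores = [logistic_score(s, coefficients, intercept) for s in subjects]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

    The pairwise definition of the area under the curve used here is equivalent to integrating the ROC curve and avoids choosing explicit thresholds.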

  15. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    Science.gov (United States)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially with regard to safety. Therefore, installing efficient and safe brakes on modern railway vehicles is essential. Attention is now devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are cast-iron brake shoes, which are still widely used on freight wagons. Therefore, cast-iron brake shoes with lamellar graphite and a high phosphorus content (0.8-1.1%) need special investigation. In order to establish the optimal conditions for the cast-iron brake shoes, we propose a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for

  16. Applied statistics for civil and environmental engineers

    CERN Document Server

    Kottegoda, N T

    2009-01-01

    Civil and environmental engineers need an understanding of mathematical statistics and probability theory to deal with the variability that affects engineers' structures, soil pressures, river flows and the like. Students, too, need to get to grips with these rather difficult concepts. This book, written by engineers for engineers, tackles the subject in a clear, up-to-date manner using a process-orientated approach. It introduces the subjects of mathematical statistics and probability theory, and then addresses model estimation and testing, regression and multivariate methods, analysis of extreme events, simulation techniques, risk and reliability, and economic decision making. 325 examples and case studies from European and American practice are included and each chapter features realistic problems to be solved. For the second edition new sections have been added on Monte Carlo Markov chain modeling with details of practical Gibbs sampling, sensitivity analysis and aleatory and epistemic uncertainties, and co...

  17. Multivariable PID controller design tuning using bat algorithm for activated sludge process

    Science.gov (United States)

    Atikah Nor’Azlan, Nur; Asmiza Selamat, Nur; Mat Yahya, Nafrizuan

    2018-04-01

    This project concerns the design of a multivariable PID (MPID) controller for a multi-input multi-output system by applying four multivariable PID tuning methods: Davison, Penttinen-Koivo, Maciejowski and a proposed combined method. The aim of this study is to investigate the performance of a selected optimization technique, the Bat Algorithm (BA), in tuning the parameters of the MPID controller. All the MPID-BA tuning results are compared and analyzed, and the best MPID-BA is then chosen to determine which technique performs better in terms of transient response.

  18. Multivariate Analysis of Industrial Scale Fermentation Data

    DEFF Research Database (Denmark)

    Mears, Lisa; Nørregård, Rasmus; Stocks, Stuart M.

    2015-01-01

    Multivariate analysis allows process understanding to be gained from the vast and complex datasets recorded from fermentation processes; however, the application of such techniques to this field can be limited by the data pre-processing requirements and data handling. In this work many iterations...

  19. Multivariate statistical process control (MSPC) using Raman spectroscopy for in-line culture cell monitoring considering time-varying batches synchronized with correlation optimized warping (COW).

    Science.gov (United States)

    Liu, Ya-Juan; André, Silvère; Saint Cristau, Lydia; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Devos, Olivier; Duponchel, Ludovic

    2017-02-01

    Multivariate statistical process control (MSPC) is increasingly popular as a response to the challenge posed by large multivariate datasets from analytical instruments such as Raman spectroscopy for the monitoring of complex cell cultures in the biopharmaceutical industry. However, Raman spectroscopy for in-line monitoring often produces unsynchronized data sets, resulting in time-varying batches. Moreover, unsynchronized data sets are common for cell culture monitoring because spectroscopic measurements are generally recorded in an alternating way, with more than one optical probe connected in parallel to the same spectrometer. Synchronized batches are a prerequisite for the application of multivariate analysis such as multi-way principal component analysis (MPCA) for MSPC monitoring. Correlation optimized warping (COW) is a popular method for data alignment with satisfactory performance; however, it has never before been applied to synchronize the acquisition time of spectroscopic datasets in an MSPC application. In this paper we propose, for the first time, to use the method of COW to synchronize batches with varying durations analyzed with Raman spectroscopy. In a second step, we developed MPCA models at different time intervals based on the normal operation condition (NOC) batches synchronized by COW. New batches are finally projected considering the corresponding MPCA model. We monitored the evolution of the batches using two multivariate control charts based on Hotelling's T² and Q. As illustrated by the results, the MSPC model was able to identify abnormal operation conditions, including contaminated batches, which is of prime importance in cell culture monitoring. We proved that Raman-based MSPC monitoring can be used to diagnose batches deviating from the normal condition, with higher efficacy than traditional diagnosis, which would save time and money in the biopharmaceutical industry. Copyright © 2016 Elsevier B.V. All rights reserved.
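
    The two control-chart statistics named in this abstract can be illustrated with a simplified sketch. It assumes a single principal component found by power iteration on a small invented data set, whereas the paper uses multi-way PCA on synchronized Raman spectra; all data values and function names are hypothetical.

```python
def mean_center(rows):
    """Return column means and the mean-centered rows."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return means, [[r[j] - means[j] for j in range(m)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, m = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(m)]
            for a in range(m)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov via power iteration."""
    m = len(cov)
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(m))
                     for a in range(m))
    return v, eigenvalue

def t2_and_q(x, means, loading, eigenvalue):
    """Hotelling's T2 (variation inside the model) and Q (squared residual)."""
    centered = [xi - mi for xi, mi in zip(x, means)]
    score = sum(c * p for c, p in zip(centered, loading))
    residual = [c - score * p for c, p in zip(centered, loading)]
    return score * score / eigenvalue, sum(r * r for r in residual)

# Hypothetical normal-operation observations with three correlated variables.
noc = [[1.0, 2.1, 2.9], [2.0, 3.9, 6.1], [3.0, 6.0, 9.0],
       [4.0, 8.1, 11.9], [5.0, 9.9, 15.1]]
means, centered = mean_center(noc)
loading, eigenvalue = first_component(covariance(centered))

# One new observation follows the usual correlation, the other breaks it.
t2_follow, q_follow = t2_and_q([6.0, 12.0, 18.0], means, loading, eigenvalue)
t2_break, q_break = t2_and_q([3.0, 9.0, 6.0], means, loading, eigenvalue)
print(t2_follow, q_follow, t2_break, q_break)
```

    The observation that keeps the usual correlation but lies far from the mean raises T², while the one that breaks the correlation structure raises Q; this is why the two charts are monitored together. Control limits are omitted here.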

  20. Multivariate tensor-based brain anatomical surface morphometry via holomorphic one-forms.

    Science.gov (United States)

    Wang, Yalin; Chan, Tony F; Toga, Arthur W; Thompson, Paul M

    2009-01-01

    Here we introduce multivariate tensor-based surface morphometry using holomorphic one-forms to study brain anatomy. We computed new statistics from the Riemannian metric tensors that retain the full information in the deformation tensor fields. We introduce two different holomorphic one-forms that induce different surface conformal parameterizations. We applied this framework to 3D MRI data to analyze hippocampal surface morphometry in Alzheimer's Disease (AD; 26 subjects), lateral ventricular surface morphometry in HIV/AIDS (19 subjects) and cortical surface morphometry in Williams Syndrome (WS; 80 subjects). Experimental results demonstrated that our method powerfully detected brain surface abnormalities. Multivariate statistics on the local tensors outperformed other TBM methods including analysis of the Jacobian determinant, the largest eigenvalue, or the pair of eigenvalues, of the surface Jacobian matrix.

  1. Multivariate pattern classification reveals autonomic and experiential representations of discrete emotions.

    Science.gov (United States)

    Kragel, Philip A; Labar, Kevin S

    2013-08-01

    Defining the structural organization of emotions is a central unresolved question in affective science. In particular, the extent to which autonomic nervous system activity signifies distinct affective states remains controversial. Most prior research on this topic has used univariate statistical approaches in attempts to classify emotions from psychophysiological data. In the present study, electrodermal, cardiac, respiratory, and gastric activity, as well as self-report measures were taken from healthy subjects during the experience of fear, anger, sadness, surprise, contentment, and amusement in response to film and music clips. Information pertaining to affective states present in these response patterns was analyzed using multivariate pattern classification techniques. Overall accuracy for classifying distinct affective states was 58.0% for autonomic measures and 88.2% for self-report measures, both of which were significantly above chance. Further, examining the error distribution of classifiers revealed that the dimensions of valence and arousal selectively contributed to decoding emotional states from self-report, whereas a categorical configuration of affective space was evident in both self-report and autonomic measures. Taken together, these findings extend recent multivariate approaches to study emotion and indicate that pattern classification tools may improve upon univariate approaches to reveal the underlying structure of emotional experience and physiological expression. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  2. Predictive analysis of beer quality by correlating sensory evaluation with higher alcohol and ester production using multivariate statistics methods.

    Science.gov (United States)

    Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru

    2014-10-15

    Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohol and ester. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) had a better performance of prediction accuracy for both calibration set (94.3%) and validation set (96.2%) than other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role of model training when the prediction accuracy of SVM with polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Distinct multivariate brain morphological patterns and their added predictive value with cognitive and polygenic risk scores in mental disorders

    Directory of Open Access Journals (Sweden)

    Nhat Trung Doan

    2017-01-01

    The brain underpinnings of schizophrenia and bipolar disorders are multidimensional, reflecting complex pathological processes and causal pathways, requiring multivariate techniques to disentangle. Furthermore, little is known about the complementary clinical value of brain structural phenotypes when combined with data on cognitive performance and genetic risk. Using data-driven fusion of cortical thickness, surface area, and gray matter density (GMD) maps, we found six biologically meaningful patterns showing strong group effects, including four statistically independent multimodal patterns reflecting co-occurring alterations in thickness and GMD in patients, over and above two other independent patterns of widespread thickness and area reduction. Case-control classification using cognitive scores alone revealed high accuracy, and adding imaging features or polygenic risk scores increased performance, suggesting their complementary predictive value, with cognitive scores being the most sensitive features. Multivariate pattern analyses reveal distinct patterns of brain morphology in mental disorders, provide insights on the relative importance of brain structure, cognitive and polygenic risk scores in classification of patients, and demonstrate the importance of multivariate approaches in studying the pathophysiological substrate of these complex disorders.

  4. Topics in theoretical and applied statistics

    CERN Document Server

    Giommi, Andrea

    2016-01-01

    This book highlights the latest research findings from the 46th International Meeting of the Italian Statistical Society (SIS) in Rome, during which both methodological and applied statistical research was discussed. This selection of fully peer-reviewed papers, originally presented at the meeting, addresses a broad range of topics, including the theory of statistical inference; data mining and multivariate statistical analysis; survey methodologies; analysis of social, demographic and health data; and economic statistics and econometrics.

  5. Multivariate Analysis of Schools and Educational Policy.

    Science.gov (United States)

    Kiesling, Herbert J.

    This report describes a multivariate analysis technique that approaches the problems of educational production function analysis by (1) using comparable measures of output across large experiments, (2) accounting systematically for differences in socioeconomic background, and (3) treating the school as a complete system in which different…

  6. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
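
    As a rough illustration of the technique this abstract describes, the sketch below fits a multiple linear regression by solving the normal equations. The data are invented and noise-free (generated from y = 1 + 2·x1 − 3·x2), and the helper names are assumptions for the example; statistical software would normally use a numerically more stable decomposition and also report confidence intervals.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_multiple_regression(predictors, outcome):
    """Least-squares coefficients [intercept, b1, b2, ...] from the
    normal equations (X^T X) b = X^T y."""
    design = [[1.0] + list(row) for row in predictors]
    p = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(design, outcome)) for a in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 without noise.
predictors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
              (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
outcome = [1.0, 3.0, -2.0, 0.0, 2.0, -3.0]
coefficients = fit_multiple_regression(predictors, outcome)
print([round(c, 6) for c in coefficients])
```

    If two predictors were nearly collinear, the matrix XᵀX would be close to singular and the fitted coefficients would become unstable, which is the multicollinearity issue the abstract mentions.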

  7. Reagent-free bacterial identification using multivariate analysis of transmission spectra

    Science.gov (United States)

    Smith, Jennifer M.; Huffman, Debra E.; Acosta, Dayanis; Serebrennikova, Yulia; García-Rubio, Luis; Leparc, German F.

    2012-10-01

    The identification of bacterial pathogens from culture is critical to the proper administration of antibiotics and patient treatment. Many of the tests currently used in the clinical microbiology laboratory for bacterial identification today can be highly sensitive and specific; however, they have the additional burdens of complexity, cost, and the need for specialized reagents. We present an innovative, reagent-free method for the identification of pathogens from culture. A clinical study has been initiated to evaluate the sensitivity and specificity of this approach. Multiwavelength transmission spectra were generated from a set of clinical isolates including Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, and Staphylococcus aureus. Spectra of an initial training set of these target organisms were used to create identification models representing the spectral variability of each species using multivariate statistical techniques. Next, the spectra of the blinded isolates of targeted species were identified using the model achieving >94% sensitivity and >98% specificity, with 100% accuracy for P. aeruginosa and S. aureus. The results from this on-going clinical study indicate this approach is a powerful and exciting technique for identification of pathogens. The menu of models is being expanded to include other bacterial genera and species of clinical significance.

  8. Prospective surveillance of multivariate spatial disease data

    Science.gov (United States)

    Corberán-Vallet, A

    2012-01-01

    Surveillance systems are often focused on more than one disease within a predefined area. On those occasions when outbreaks of disease are likely to be correlated, the use of multivariate surveillance techniques integrating information from multiple diseases allows us to improve the sensitivity and timeliness of outbreak detection. In this article, we present an extension of the surveillance conditional predictive ordinate to monitor multivariate spatial disease data. The proposed surveillance technique, which is defined for each small area and time period as the conditional predictive distribution of those counts of disease higher than expected given the data observed up to the previous time period, alerts us to both small areas of increased disease incidence and the diseases causing the alarm within each area. We investigate its performance within the framework of Bayesian hierarchical Poisson models using a simulation study. An application to diseases of the respiratory system in South Carolina is finally presented. PMID:22534429

  9. Applications of modern statistical methods to analysis of data in physical science

    Science.gov (United States)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance

  10. Quantitative profiling of polar metabolites in herbal medicine injections for multivariate statistical evaluation based on independence principal component analysis.

    Directory of Open Access Journals (Sweden)

    Miaomiao Jiang

    Botanical primary metabolites exist extensively in herbal medicine injections (HMIs), but are often ignored in quality control. With the limitation of bias towards hydrophilic substances, the primary metabolites with strong polarity, such as saccharides, amino acids and organic acids, are usually difficult to detect by the routinely applied reversed-phase chromatographic fingerprint technology. In this study, a proton nuclear magnetic resonance (1H NMR) profiling method was developed for efficient identification and quantification of small polar molecules, mostly primary metabolites, in HMIs. A commonly used medicine, Danhong injection (DHI), was employed as a model. With the developed method, 23 primary metabolites together with 7 polyphenolic acids were simultaneously identified, of which 13 metabolites with fully separated proton signals were quantified and employed for further multivariate quality control assay. The quantitative 1H NMR method was validated with good linearity, precision, repeatability, stability and accuracy. Based on independence principal component analysis (IPCA), the contents of 13 metabolites were characterized and dimensionally reduced into the first two independence principal components (IPCs). IPC1 and IPC2 were then used to calculate the upper control limits (with 99% confidence ellipsoids) of χ² and Hotelling T² control charts. Through the constructed upper control limits, the proposed method was successfully applied to 36 batches of DHI to identify the out-of-control samples with perturbed levels of succinate, malonate, glucose, fructose, salvianic acid and protocatechuic aldehyde. The integrated strategy has provided a reliable approach to identify and quantify multiple polar metabolites of DHI in one fingerprinting spectrum, and it has also assisted in the establishment of IPCA models for the multivariate statistical evaluation of HMIs.

  11. Avoid Filling Swiss Cheese with Whipped Cream; Imputation Techniques and Evaluation Procedures for Cross-Country Time Series

    OpenAIRE

    Michael Weber; Michaela Denk

    2011-01-01

    International organizations collect data from national authorities to create multivariate cross-sectional time series for their analyses. As data from countries with not yet well-established statistical systems may be incomplete, the bridging of data gaps is a crucial challenge. This paper investigates data structures and missing data patterns in the cross-sectional time series framework, reviews missing value imputation techniques used for micro data in official statistics, and discusses the...

  12. Real/binary co-operative and co-evolving swarms based multivariable PID controller design of ball mill pulverizing system

    International Nuclear Information System (INIS)

    Menhas, Muhammad Ilyas; Fei Minrui; Wang Ling; Qian Lin

    2012-01-01

    Highlights: ► We extend the concept of co-operation and co-evolution in some PSO variants. ► We use developed co-operative PSOs in multivariable PID controller design/tuning. ► We find that co-operative PSOs converge faster and give high quality solutions. ► Dividing the search space among swarms improves search efficiency. ► The proposed methods allow the practitioner for heterogeneous problem formulation. - Abstract: In this paper, multivariable PID controller design based on cooperative and coevolving multiple swarms is demonstrated. A simplified multi-variable MIMO process model of a ball mill pulverizing system with steady state decoupler is considered. In order to formulate computational models of cooperative and coevolving multiple swarms three different algorithms like real coded PSO, discrete binary PSO (DBPSO) and probability based discrete binary PSO (PBPSO) are employed. Simulations are carried out on three composite functions simultaneously considering multiple objectives. The cooperative and coevolving multiple swarms based results are compared with the results obtained through single swarm based methods like real coded particle swarm optimization (PSO), discrete binary PSO (DBPSO), and probability based discrete binary PSO (PBPSO) algorithms. The cooperative and coevolving swarms based techniques outperform the real coded PSO, PBPSO, and the standard discrete binary PSO (DBPSO) algorithm in escaping from local optima. Furthermore, statistical analysis of the simulation results is performed to calculate the comparative reliability of various techniques. All of the techniques employed are suitable for controller tuning, however, the multiple cooperative and coevolving swarms based results are considerably better in terms of mean fitness, variance of fitness, and success rate in finding a feasible solution in comparison to those obtained using single swarm based methods.
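
    For readers unfamiliar with the single-swarm baseline that this abstract compares against, a real-coded particle swarm optimizer can be sketched as below. This is not the cooperative, co-evolving scheme proposed in the paper. The quadratic `toy_cost` is a hypothetical stand-in for a closed-loop performance index, and the inertia and acceleration constants are common textbook choices rather than values from the study.

```python
import random

def toy_cost(gains):
    """Hypothetical stand-in for a controller cost; minimum at (2.0, 0.5)."""
    kp, ki = gains
    return (kp - 2.0) ** 2 + (ki - 0.5) ** 2

def particle_swarm(cost, bounds, particles=20, iterations=200, seed=0):
    """Minimal real-coded PSO with a global-best topology."""
    rng = random.Random(seed)
    inertia, c_personal, c_social = 0.729, 1.494, 1.494  # assumed constants
    dims = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(particles)]
    velocities = [[0.0] * dims for _ in range(particles)]
    personal_best = [p[:] for p in positions]
    personal_cost = [cost(p) for p in positions]
    best = min(range(particles), key=personal_cost.__getitem__)
    global_best, global_cost = personal_best[best][:], personal_cost[best]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            value = cost(positions[i])
            if value < personal_cost[i]:
                personal_best[i], personal_cost[i] = positions[i][:], value
                if value < global_cost:
                    global_best, global_cost = positions[i][:], value
    return global_best, global_cost

best_gains, best_cost = particle_swarm(toy_cost, bounds=[(0.0, 5.0), (0.0, 5.0)])
print(best_gains, best_cost)
```

    In an actual tuning problem the cost function would simulate the closed-loop response for each candidate set of PID gains, and the cooperative variants in the paper would split the search space among several such swarms.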

  13. The association of 83 plasma proteins with CHD mortality, BMI, HDL-, and total-cholesterol in men: applying multivariate statistics to identify proteins with prognostic value and biological relevance.

    Science.gov (United States)

    Heidema, A Geert; Thissen, Uwe; Boer, Jolanda M A; Bouwman, Freek G; Feskens, Edith J M; Mariman, Edwin C M

    2009-06-01

    In this study, we applied the multivariate statistical tool Partial Least Squares (PLS) to analyze the relative importance of 83 plasma proteins in relation to coronary heart disease (CHD) mortality and the intermediate end points body mass index, HDL-cholesterol and total cholesterol. From a Dutch monitoring project for cardiovascular disease risk factors, men who died of CHD between initial participation (1987-1991) and end of follow-up (January 1, 2000) (N = 44) and matched controls (N = 44) were selected. Baseline plasma concentrations of proteins were measured by a multiplex immunoassay. With the use of PLS, we identified 15 proteins with prognostic value for CHD mortality and sets of proteins associated with the intermediate end points. Subsequently, sets of proteins and intermediate end points were analyzed together by Principal Components Analysis, indicating that proteins involved in inflammation explained most of the variance, followed by proteins involved in metabolism and proteins associated with total-C. This study is one of the first in which the association of a large number of plasma proteins with CHD mortality and intermediate end points is investigated by applying multivariate statistics, providing insight in the relationships among proteins, intermediate end points and CHD mortality, and a set of proteins with prognostic value.

  14. A statistical view of uncertainty in expert systems

    International Nuclear Information System (INIS)

    Spiegelhalter, D.J.

    1986-01-01

    The constructors of expert systems interpret "uncertainty" in a wide sense and have suggested a variety of qualitative and quantitative techniques for handling the concept, such as the theory of "endorsements," fuzzy reasoning, and belief functions. After a brief selective review of procedures that do not adhere to the laws of probability, it is argued that a subjectivist Bayesian view of uncertainty, if flexibly applied, can provide many of the features demanded by expert systems. This claim is illustrated with a number of examples of probabilistic reasoning, and a connection drawn with statistical work on the graphical representation of multivariate distributions. Possible areas of future research are outlined.

  15. Current breathomics-a review on data pre-processing techniques and machine learning in metabolomics breath analysis

    DEFF Research Database (Denmark)

    Smolinska, A.; Hauschild, A. C.; Fijten, R. R. R.

    2014-01-01

    been extensively developed. Yet, the application of machine learning methods for fingerprinting VOC profiles in the breathomics is still in its infancy. Therefore, in this paper, we describe the current state of the art in data pre-processing and multivariate analysis of breathomics data. We start...... different conditions (e.g. disease stage, treatment). Independently of the utilized analytical method, the most important question, 'which VOCs are discriminatory?', remains the same. Answers can be given by several modern machine learning techniques (multivariate statistics) and, therefore, are the focus...

  16. Decomposing biodiversity data using the Latent Dirichlet Allocation model, a probabilistic multivariate statistical method

    Science.gov (United States)

    Denis Valle; Benjamin Baiser; Christopher W. Woodall; Robin Chazdon; Jerome. Chave

    2014-01-01

    We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates...

  17. Modeling the geochemical distribution of rare earth elements (REEs) using multivariate statistics in the eastern part of Marvast placer, the Yazd province

    Directory of Open Access Journals (Sweden)

    Amin Hossein Morshedy

    2017-07-01

    Introduction: Nowadays, exploration of rare earth element (REE) resources is considered one of the strategic priorities, with a special position in the advanced and intelligent industries (Castor and Hedrick, 2006). Significant resources of REEs are found in a wide range of geological settings, including primary deposits associated with igneous and hydrothermal processes (e.g. carbonatite, (per)alkaline igneous rocks, iron-oxide breccia complexes, skarns, fluorapatite veins and pegmatites), and secondary deposits concentrated by sedimentary processes and weathering (e.g. heavy-mineral sand deposits, fluviatile sandstones, unconformity-related uranium deposits, and lignites) (Jaireth et al., 2014). Recent studies on various parts of Iran have identified promising potential for these elements, including Central Iran, alkaline rocks in the Eslami Peninsula, iron and apatite in the Hormuz Island, the Kahnouj titanium deposit, granitoid bodies in Yazd, Azerbaijan, and Mashhad and associated dikes, and finally placers related to the Shemshak formation in Marvast, Kharanagh, and Ardekan; these studies indicate high concentrations of REE in magmatogenic iron–apatite deposits in Central Iran and placers in the Marvast area in Yazd (Ghorbani, 2013). Materials and methods: In the present study, the geochemical behavior of rare earth elements is modeled using multivariate statistical methods in the eastern part of the Marvast placer. Marvast is located 185 km south of the city of Yazd in central Iran between Yazd and Mehriz. This area lies within the southeastern part of the Sanandaj-Sirjan Zone (Alipour-Asll et al., 2012). The samples of 53 wells were analyzed for whole-rock trace-element concentrations (including REE) by inductively coupled plasma-mass spectrometry (ICP-MS) (GSI, 2004). Clustering techniques, a class of multivariate statistical analysis, can be employed to find appropriate groups in data sets. One of the main objectives of data clustering

  18. Machine learning and statistical techniques : an application to the prediction of insolvency in Spanish non-life insurance companies

    OpenAIRE

    Díaz, Zuleyka; Segovia, María Jesús; Fernández, José

    2005-01-01

    Prediction of insurance companies' insolvency has arisen as an important problem in the field of financial research. Most methods applied in the past to tackle this issue are traditional statistical techniques which use financial ratios as explanatory variables. However, these variables often do not satisfy statistical assumptions, which complicates the application of the mentioned methods. In this paper, a comparative study of the performance of two non-parametric machine learning techniques ...

  19. Statistical uncertainty of extreme wind storms over Europe derived from a probabilistic clustering technique

    Science.gov (United States)

    Walz, Michael; Leckebusch, Gregor C.

    2016-04-01

    Extratropical wind storms pose one of the most dangerous and loss-intensive natural hazards for Europe. However, due to only 50 years of high-quality observational data, it is difficult to assess the statistical uncertainty of these sparse events just based on observations. Over the last decade seasonal ensemble forecasts have become indispensable in quantifying the uncertainty of weather prediction on seasonal timescales. In this study seasonal forecasts are used in a climatological context: by making use of the up to 51 ensemble members, a broad and physically consistent statistical base can be created. This base can then be used to assess the statistical uncertainty of extreme wind storm occurrence more accurately. In order to determine the statistical uncertainty of storms with different paths of progression, a probabilistic clustering approach using regression mixture models is used to objectively assign storm tracks (either based on core pressure or on extreme wind speeds) to different clusters. The advantage of this technique is that the entire lifetime of a storm is considered for the clustering algorithm. Quadratic curves are found to describe the storm tracks most accurately. Three main clusters (diagonal, horizontal or vertical progression of the storm track) can be identified, each of which has its own particular features. Basic storm features like average velocity and duration are calculated and compared for each cluster. The main benefit of this clustering technique, however, is to evaluate if the clusters show different degrees of uncertainty, e.g. more (less) spread for tracks approaching Europe horizontally (diagonally). This statistical uncertainty is compared for different seasonal forecast products.

  20. Understanding gendered aspects of migration aspiration and motives of university students by multivariate statistical methods

    Directory of Open Access Journals (Sweden)

    Đula Borozan

    2014-03-01

    The paper deals with the application of multivariate analysis of variance and logistic regression in measuring, explaining and evaluating (i) gender differences in expressing migration aspirations, and (ii) a gender effect on migration motivation of university students in Croatia. The results supported the thesis that migration is a complex gendering process that assumes subjective assessment of the whole set of interrelated motives. According to logistic regression, gender is a significant predictor of migration aspirations among the selected demographic and socio-economic variables. A multivariate analysis of variance showed that gender and migration aspirations in interaction matter when it comes to migration motives, particularly related to the perceived importance of social networks. Females, and especially those who aspire to migrate, assessed these motives as more important than males.

  1. A survey of statistical downscaling techniques

    Energy Technology Data Exchange (ETDEWEB)

    Zorita, E.; Storch, H. von [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Hydrophysik

    1997-12-31

    The derivation of regional information from integrations of coarse-resolution General Circulation Models (GCM) is generally referred to as downscaling. The most relevant statistical downscaling techniques are described here and some particular examples are worked out in detail. They are classified into three main groups: linear methods, classification methods and deterministic non-linear methods. Their performance in a particular example, winter rainfall in the Iberian peninsula, is compared to a simple downscaling analog method. It is found that the analog method performs as well as the more complicated methods. Downscaling analysis can also be used as a tool to validate regional performance of global climate models by analyzing the covariability of the simulated large-scale climate and the regional climates. (orig.)
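
    As a rough illustration of the analog idea mentioned in this abstract, the sketch below picks the historical large-scale state closest to a new state and returns the local value that was observed alongside it. The function name `analog_downscale`, the Euclidean distance, and the toy data are assumptions made for illustration only; the abstract does not specify how the analog is chosen.

```python
import math

def analog_downscale(history_large, history_local, new_large):
    """Return the local value paired with the closest historical large-scale state.

    Closeness is measured with Euclidean distance, which is an assumption here.
    """
    best_index = min(
        range(len(history_large)),
        key=lambda i: math.dist(history_large[i], new_large),
    )
    return history_local[best_index]

# Hypothetical toy data: 2-component large-scale predictors and local rainfall (mm).
history_large = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5)]
history_local = [12.0, 3.5, 0.4]

estimate = analog_downscale(history_large, history_local, (0.9, 0.1))
```

    The new state (0.9, 0.1) lies nearest the first historical state, so the sketch returns that state's local value.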

  2. A survey of statistical downscaling techniques

    Energy Technology Data Exchange (ETDEWEB)

    Zorita, E; Storch, H von [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Hydrophysik

    1998-12-31

    The derivation of regional information from integrations of coarse-resolution General Circulation Models (GCM) is generally referred to as downscaling. The most relevant statistical downscaling techniques are described here and some particular examples are worked out in detail. They are classified into three main groups: linear methods, classification methods and deterministic non-linear methods. Their performance in a particular example, winter rainfall in the Iberian peninsula, is compared to a simple downscaling analog method. It is found that the analog method performs as well as the more complicated methods. Downscaling analysis can also be used as a tool to validate regional performance of global climate models by analyzing the covariability of the simulated large-scale climate and the regional climates. (orig.)

  3. Nitrate source identification in groundwater of multiple land-use areas by combining isotopes and multivariate statistical analysis: A case study of Asopos basin (Central Greece).

    Science.gov (United States)

    Matiatos, Ioannis

    2016-01-15

    Nitrate (NO3) is one of the most common contaminants in aquatic environments and groundwater. Nitrate concentrations and environmental isotope data (δ(15)N-NO3 and δ(18)O-NO3) from groundwater of Asopos basin, which has different land-use types, i.e., a large number of industries (e.g., textile, metal processing, food, fertilizers, paint), urban and agricultural areas and livestock breeding facilities, were analyzed to identify the nitrate sources of water contamination and N-biogeochemical transformations. A Bayesian isotope mixing model (SIAR) and multivariate statistical analysis of hydrochemical data were used to estimate the proportional contribution of different NO3 sources and to identify the dominant factors controlling the nitrate content of the groundwater in the region. The comparison of SIAR and Principal Component Analysis showed that wastes originating from urban and industrial zones of the basin are mainly responsible for nitrate contamination of groundwater in these areas. Agricultural fertilizers and manure likely contribute to groundwater contamination away from urban fabric and industrial land-use areas. Soil contribution to nitrate contamination due to organic matter is higher in the south-western part of the area, far from the industries and the urban settlements. The present study aims to highlight the use of environmental isotopes combined with multivariate statistical analysis in locating sources of nitrate contamination in groundwater, leading to more effective planning of environmental measures and remediation strategies in river basins and water bodies as defined by the European Water Framework Directive (Directive 2000/60/EC).

  4. Identifying sources of soil inorganic pollutants on a regional scale using a multivariate statistical approach: Role of pollutant migration and soil physicochemical properties

    International Nuclear Information System (INIS)

    Zhang Changbo; Wu Longhua; Luo Yongming; Zhang Haibo; Christie, Peter

    2008-01-01

    Principal components analysis (PCA) and correlation analysis were used to estimate the contribution of four components related to pollutant sources on the total variation in concentrations of Cu, Zn, Pb, Cd, As, Se, Hg, Fe and Mn in surface soil samples from a valley in east China with numerous copper and zinc smelters. Results indicate that when carrying out source identification of inorganic pollutants their tendency to migrate in soils may result in differences between the pollutant composition of the source and the receptor soil, potentially leading to errors in the characterization of pollutants using multivariate statistics. The stability and potential migration or movement of pollutants in soils must therefore be taken into account. Soil physicochemical properties may offer additional useful information. Two different mechanisms have been hypothesized for correlations between soil heavy metal concentrations and soil organic matter content and these may be helpful in interpreting the statistical analysis. - Principal components analysis with Varimax rotation can help identify sources of soil inorganic pollutants but pollutant migration and soil properties can exert important effects

  5. Practical statistics a handbook for business projects

    CERN Document Server

    Buglear, John

    2013-01-01

    Practical Statistics is a hands-on guide to statistics, progressing by complexity of data (univariate, bivariate, multivariate) and analysis (portray, summarise, generalise) in order to give the reader a solid understanding of the fundamentals and how to apply them.

  6. Sexual selection on multivariate phenotypes in Anastrepha Fraterculus (Diptera: Tephritidae) from Argentina

    International Nuclear Information System (INIS)

    Sciurano, R.; Rodriguero, M.; Gomez Cendra, P.; Vilardi, J.; Segura, D.; Cladera, J.L.; Allinghi, Armando

    2007-01-01

    Despite the interest in applying environmentally friendly control methods such as sterile insect technique (SIT) against Anastrepha fraterculus (Wiedemann) (Diptera: Tephritidae), information about its biology, taxonomy, and behavior is still insufficient. To increase this information, the present study aims to evaluate the performance of wild flies under field cage conditions through the study of sexual competitiveness among males (sexual selection). A wild population from Horco Molle, Tucuman, Argentina was sampled. Mature virgin males and females were released into outdoor field cages to compete for mating. Morphometric analyses were applied to determine the relationship between the multivariate phenotype and copulatory success. Successful and unsuccessful males were measured for 8 traits: head width (HW), face width (FW), eye length (EL), thorax length (THL), wing length (WL), wing width (WW), femur length (FL), and tibia length (TIL). Combinations of different multivariate statistical methods and graphical analyses were used to evaluate sexual selection on male phenotype. The results indicated that wing width and thorax length would be the most probable targets of sexual selection. They describe a non-linear association between expected fitness and each of these 2 traits. This non-linear relation suggests that observed selection could maintain the diversity related to body size. (author) [es

  7. Total coliforms, arsenic and cadmium exposure through drinking water in the Western Region of Ghana: application of multivariate statistical technique to groundwater quality.

    Science.gov (United States)

    Affum, Andrews Obeng; Osae, Shiloh Dede; Nyarko, Benjamin Jabez Botwe; Afful, Samuel; Fianko, Joseph Richmond; Akiti, Tetteh Thomas; Adomako, Dickson; Acquaah, Samuel Osafo; Dorleku, Micheal; Antoh, Emmanuel; Barnes, Felix; Affum, Enoch Acheampong

    2015-02-01

    In recent times, surface water resources in the Western Region of Ghana have been found to be inadequate in supply and polluted by various anthropogenic activities. As a result of these problems, the demand for groundwater by the human populations in the peri-urban communities for domestic, municipal and irrigation purposes has increased without prior knowledge of its water quality. Water samples were collected from 14 public hand-dug wells during the rainy season in 2013 and investigated for total coliforms, Escherichia coli, mercury (Hg), arsenic (As), cadmium (Cd) and physicochemical parameters. Multivariate statistical analysis of the dataset and a linear stoichiometric plot of major ions were applied to group the water samples and to identify the main factors and sources of contamination. Hierarchical cluster analysis revealed four clusters from the hydrochemical variables (R-mode) and three clusters in the case of water samples (Q-mode) after z-score standardization. Principal component analysis after a varimax rotation of the dataset indicated that the four factors extracted explained 93.3 % of the total variance, which highlighted salinity, toxic elements and hardness pollution as the dominant factors affecting groundwater quality. Cation exchange, mineral dissolution and silicate weathering influenced groundwater quality. The ranking order of major ions was Na(+) > Ca(2+) > K(+) > Mg(2+) and Cl(-) > SO4 (2-) > HCO3 (-). Based on the Piper plot and the hydrogeology of the study area, sodium chloride (86 %), sodium hydrogen carbonate and sodium carbonate (14 %) water types were identified. Although E. coli were absent in the water samples, 36 % of the wells contained total coliforms (Enterobacter species) which exceeded the WHO guidelines limit of zero colony-forming unit (CFU)/100 mL of drinking water. With the exception of Hg, the concentration of As and Cd in 79 and 43 % of the water samples exceeded the WHO guideline limits of 10 and 3

  8. Preliminary Multi-Variable Parametric Cost Model for Space Telescopes

    Science.gov (United States)

    Stahl, H. Philip; Hendrichs, Todd

    2010-01-01

    This slide presentation reviews creating a preliminary multi-variable cost model for the contract costs of making a space telescope. There is discussion of the methodology for collecting the data, definition of the statistical analysis methodology, single variable model results, testing of historical models and an introduction of the multi variable models.

  9. A Comparison of Selected Statistical Techniques to Model Soil Cation Exchange Capacity

    Science.gov (United States)

    Khaledian, Yones; Brevik, Eric C.; Pereira, Paulo; Cerdà, Artemi; Fattah, Mohammed A.; Tazikeh, Hossein

    2017-04-01

    Cation exchange capacity (CEC) measures the soil's ability to hold positively charged ions and is an important indicator of soil quality (Khaledian et al., 2016). However, other soil properties are more commonly determined and reported, such as texture, pH, organic matter and biology. We attempted to predict CEC using different advanced statistical methods including monotone analysis of variance (MONANOVA), artificial neural networks (ANNs), principal components regressions (PCR), and particle swarm optimization (PSO) in order to compare the utility of these approaches and identify the best predictor. We analyzed 170 soil samples from four different nations (USA, Spain, Iran and Iraq) under three land uses (agriculture, pasture, and forest). Seventy percent of the samples (120 samples) were selected as the calibration set and the remaining 50 samples (30%) were used as the prediction set. The results indicated that MONANOVA (R2 = 0.82 and root mean squared error (RMSE) = 6.32) and ANNs (R2 = 0.82 and RMSE = 5.53) were the best models to estimate CEC; PSO (R2 = 0.80 and RMSE = 5.54) and PCR (R2 = 0.70 and RMSE = 6.48) also worked well, and the overall results were very similar to each other. Clay (positively correlated) and sand (negatively correlated) were the most influential variables for predicting CEC for the entire data set, while the most influential variables differed among countries and land uses, so CEC was affected by different variables in different situations. Although MONANOVA and ANNs provided good predictions of the entire dataset, PSO gives a formula to estimate soil CEC using commonly tested soil properties. Therefore, PSO shows promise as a technique to estimate soil CEC. Establishing effective pedotransfer functions to predict CEC would be productive where there are limitations of time and money, and other commonly analyzed soil properties are available. References Khaledian, Y., Kiani, F., Ebrahimi, S., Brevik, E.C., Aitkenhead
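
    For readers unfamiliar with the two scores quoted in this abstract, the following minimal sketch shows how RMSE and R2 are commonly computed on a prediction set. It assumes R2 means the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the observed and predicted values are invented for illustration.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical CEC values (not from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

    RMSE carries the units of the predicted quantity, whereas R2 is dimensionless, which is why the abstract reports the two together.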

  10. Solution identification and quantitative analysis of fiber-capacitive drop analyzer based on multivariate statistical methods

    Science.gov (United States)

    Chen, Zhe; Qiu, Zurong; Huo, Xinming; Fan, Yuming; Li, Xinghua

    2017-03-01

    A fiber-capacitive drop analyzer is an instrument which monitors a growing droplet to produce a capacitive opto-tensiotrace (COT). Each COT is an integration of fiber light intensity signals and capacitance signals and can reflect the unique physicochemical property of a liquid. In this study, we propose a method for solution identification and quantitative concentration analysis based on multivariate statistical methods. Eight characteristic values are extracted from each COT. A series of COT characteristic values of training solutions at different concentrations compose a data library of this kind of solution. A two-stage linear discriminant analysis is applied to analyze different solution libraries and establish discriminant functions. Test solutions can be discriminated by these functions. After determining the variety of test solutions, the Spearman correlation test and principal components analysis are used to filter and reduce the dimensions of the eight characteristic values, producing a new representative parameter. A cubic spline interpolation function is built between the parameters and concentrations, based on which we can calculate the concentration of the test solution. Methanol, ethanol, n-propanol, and saline solutions are taken as experimental subjects in this paper. For each solution, nine or ten different concentrations are chosen to be the standard library, and the other two concentrations compose the test group. By using the methods mentioned above, all eight test solutions are correctly identified and the average relative error of quantitative analysis is 1.11%. The proposed method is feasible; it enlarges the applicable scope of recognizing liquids based on the COT and improves the precision of concentration quantification.
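
    The average relative error quoted in this abstract can be read as the mean of |estimate - true| / |true| over the test solutions, expressed in percent. The sketch below shows only that final scoring step under this assumed definition; the discriminant analysis and spline steps are not reproduced, and the concentrations are made up.

```python
def average_relative_error(true_values, estimates):
    """Mean of |estimate - true| / |true| over all pairs, in percent."""
    errors = [abs(e - t) / abs(t) for t, e in zip(true_values, estimates)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical true and estimated concentrations (arbitrary units).
true_conc = [10.0, 20.0, 40.0, 50.0]
estimated = [10.1, 19.8, 40.4, 49.5]

error_percent = average_relative_error(true_conc, estimated)
```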

  11. Leak detection and localization in a pipeline system by application of statistical analysis techniques

    International Nuclear Information System (INIS)

    Fukuda, Toshio; Mitsuoka, Toyokazu.

    1985-01-01

    Leak detection in piping systems is an important diagnostic technique for preventing accidents and planning maintenance, since leaks lower productivity and cause environmental damage. As a first step, a leak must be detected without delay; as a second step, if the location of the leak in the piping system can be estimated, accident countermeasures become easier. Pressure-based leak detection is usually used for large leaks. Because the pressure-based method is simple and advantageous, this study examined extending the pressure gradient method to smaller leaks, using statistical analysis techniques, for a pipeline in steady operation. Since the flow in a pipe varies irregularly during pumping, statistical means are required to detect small leaks by pressure. The leak detection index proposed in this paper is the difference between the pressure gradients at the two ends of a pipeline. Experimental results for water and air in nylon tubes are reported. (Kako, I.)
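
    One way to picture the index described in this abstract is sketched below: the pressure gradient is estimated from two taps at each end of the pipeline, the two gradients are subtracted, and the result is compared with the scatter seen in leak-free operation. The two-tap layout, the tap spacing, the function names, and the three-sigma rule are all assumptions for illustration; the abstract does not state which statistical test the authors used.

```python
import statistics

def gradient_difference(p_in_a, p_in_b, p_out_a, p_out_b, spacing):
    """Inlet-end pressure gradient minus outlet-end pressure gradient.

    Each gradient is estimated from two taps separated by `spacing`.
    """
    inlet_gradient = (p_in_b - p_in_a) / spacing
    outlet_gradient = (p_out_b - p_out_a) / spacing
    return inlet_gradient - outlet_gradient

def leak_suspected(baseline_index, current_index, n_sigma=3.0):
    """Flag a leak when the index leaves the baseline mean by more than n_sigma."""
    mean = statistics.mean(baseline_index)
    std = statistics.stdev(baseline_index)
    return abs(current_index - mean) > n_sigma * std

# Hypothetical leak-free index values, fluctuating around zero.
baseline = [0.01, -0.02, 0.00, 0.015, -0.005]

index_now = gradient_difference(100.0, 99.0, 90.0, 89.5, spacing=1.0)
alarm = leak_suspected(baseline, index_now)
```

    In a leak-free pipe in steady operation the two gradients should roughly agree, so the index fluctuates around zero; a leak between the two ends changes the flow downstream of it and pushes the index away from zero.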

  12. Multivariate Receptor Models for Spatially Correlated Multipollutant Data

    KAUST Repository

    Jun, Mikyoung

    2013-08-01

    The goal of multivariate receptor modeling is to estimate the profiles of major pollution sources and quantify their impacts based on ambient measurements of pollutants. Traditionally, multivariate receptor modeling has been applied to multiple air pollutant data measured at a single monitoring site or measurements of a single pollutant collected at multiple monitoring sites. Despite the growing availability of multipollutant data collected from multiple monitoring sites, there has not yet been any attempt to incorporate spatial dependence that may exist in such data into multivariate receptor modeling. We propose a spatial statistics extension of multivariate receptor models that enables us to incorporate spatial dependence into estimation of source composition profiles and contributions given the prespecified number of sources and the model identification conditions. The proposed method yields more precise estimates of source profiles by accounting for spatial dependence in the estimation. More importantly, it enables predictions of source contributions at unmonitored sites as well as when there are missing values at monitoring sites. The method is illustrated with simulated data and real multipollutant data collected from eight monitoring sites in Harris County, Texas. Supplementary materials for this article, including data and R code for implementing the methods, are available online on the journal web site. © 2013 Copyright Taylor and Francis Group, LLC.

  13. An Outlyingness Matrix for Multivariate Functional Data Classification

    KAUST Repository

    Dai, Wenlin

    2017-08-25

    The classification of multivariate functional data is an important task in scientific research. Unlike point-wise data, functional data are usually classified by their shapes rather than by their scales. We define an outlyingness matrix by extending directional outlyingness, an effective measure of the shape variation of curves that combines the direction of outlyingness with conventional statistical depth. We propose two classifiers based on directional outlyingness and the outlyingness matrix, respectively. Our classifiers provide better performance compared with existing depth-based classifiers when applied on both univariate and multivariate functional data from simulation studies. We also test our methods on two data problems: speech recognition and gesture classification, and obtain results that are consistent with the findings from the simulated data.

  14. Multivariate statistical study with a factor analysis of foraminiferal fauna from the Chilka Lake, India

    Digital Repository Service at National Institute of Oceanography (India)

    Jayalakshmy, K.V.; Rao, K.K.

    Harbour, England: a reappraisal using multivariate techniques. J. Paleontol., 43 (3) : 660-675. Imbrie, J. and F.B. Phleger. 1963. Analisis por vectores de los foraminiferos bentonicos del area de San Diego, California. Soc. Geol. Mex., Bol., 26...

  15. Multivariate Approaches to Classification in Extragalactic Astronomy

    Directory of Open Access Journals (Sweden)

    Didier Fraix-Burnet

    2015-08-01

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

  16. TRAN-STAT: statistics for environmental studies, Number 22. Comparison of soil-sampling techniques for plutonium at Rocky Flats

    International Nuclear Information System (INIS)

    Gilbert, R.O.; Bernhardt, D.E.; Hahn, P.B.

    1983-01-01

    A summary of a field soil sampling study conducted around the Rocky Flats Colorado plant in May 1977 is presented. Several different soil sampling techniques that had been used in the area were applied at four different sites. One objective was to compare the average 239-240 Pu concentration values obtained by the various soil sampling techniques used. There was also interest in determining whether there are differences in the reproducibility of the various techniques and how the techniques compared with the proposed EPA technique of sampling to 1 cm depth. Statistically significant differences in average concentrations between the techniques were found. The differences could be largely related to the differences in sampling depth, the primary physical variable between the techniques. The reproducibility of the techniques was evaluated by comparing coefficients of variation. Differences between coefficients of variation were not statistically significant. Average (median) coefficients ranged from 21 to 42 percent for the five sampling techniques. A laboratory study indicated that various sample treatment and particle sizing techniques could increase the concentration of plutonium in the less than 10 micrometer size fraction by up to a factor of about 4 compared to the 2 mm size fraction.
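
    The coefficient of variation used in this abstract to compare reproducibility is the standard deviation divided by the mean, usually quoted in percent. A minimal sketch follows, assuming the sample (n - 1) standard deviation and using invented replicate concentrations.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical replicate concentrations from one sampling technique.
replicates = [8.0, 10.0, 12.0]

cv_percent = coefficient_of_variation(replicates)
```

    Because it is scaled by the mean, the coefficient of variation lets techniques with different average concentrations be compared on the same footing.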

  17. Integrated environmental monitoring and multivariate data analysis-A case study.

    Science.gov (United States)

    Eide, Ingvar; Westad, Frank; Nilssen, Ingunn; de Freitas, Felipe Sales; Dos Santos, Natalia Gomes; Dos Santos, Francisco; Cabral, Marcelo Montenegro; Bicego, Marcia Caruso; Figueira, Rubens; Johnsen, Ståle

    2017-03-01

    The present article describes integration of environmental monitoring and discharge data and interpretation using multivariate statistics, principal component analysis (PCA), and partial least squares (PLS) regression. The monitoring was carried out at the Peregrino oil field off the coast of Brazil. One sensor platform and 3 sediment traps were placed on the seabed. The sensors measured current speed and direction, turbidity, temperature, and conductivity. The sediment trap samples were used to determine suspended particulate matter that was characterized with respect to a number of chemical parameters (26 alkanes, 16 PAHs, N, C, calcium carbonate, and Ba). Data on discharges of drill cuttings and water-based drilling fluid were provided on a daily basis. The monitoring was carried out during 7 campaigns from June 2010 to October 2012, each lasting 2 to 3 months due to the capacity of the sediment traps. The data from the campaigns were preprocessed, combined, and interpreted using multivariate statistics. No systematic difference could be observed between campaigns or traps despite the fact that the first campaign was carried out before drilling, and 1 of 3 sediment traps was located in an area not expected to be influenced by the discharges. There was a strong covariation between suspended particulate matter and total N and organic C suggesting that the majority of the sediment samples had a natural and biogenic origin. Furthermore, the multivariate regression showed no correlation between discharges of drill cuttings and sediment trap or turbidity data taking current speed and direction into consideration. Because of this lack of correlation with discharges from the drilling location, a more detailed evaluation of chemical indicators providing information about origin was carried out in addition to numerical modeling of dispersion and deposition. 
The chemical indicators and the modeling of dispersion and deposition support the conclusions from the multivariate

  18. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    Science.gov (United States)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used the multivariate adaptive regression splines (MARS) model and the semi-parametric splines technique for predicting stock price in this study. The MARS model is an adaptive nonparametric regression method that is suited to high-dimensional problems with several variables. The semi-parametric splines technique was also used in this study. Smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as the variables influencing stock price prediction with the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast and P/E ratio) were selected as effective in forecasting stock prices.
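
    As a rough illustration of the model class named in this abstract, a MARS fit is a weighted sum of hinge functions of the form max(0, x - t) and max(0, t - x). The sketch below evaluates such a sum for a single predictor. The function names, the knot and the coefficients are hypothetical and are not taken from the study.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function: max(0, x - knot) for direction=+1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a one-predictor MARS-style model.

    terms is a list of (coefficient, knot, direction) tuples;
    all values used below are made up for illustration.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model with a single knot at x = 10 and a mirrored hinge pair.
example_terms = [(2.0, 10.0, +1), (-0.5, 10.0, -1)]
print(mars_predict(12.0, 1.0, example_terms))  # right of the knot
print(mars_predict(6.0, 1.0, example_terms))   # left of the knot
```

    The piecewise-linear shape comes entirely from the hinge pair: only one of the two terms is non-zero on either side of the knot.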

  19. Estimation of Seismic Wavelets Based on the Multivariate Scale Mixture of Gaussians Model

    Directory of Open Access Journals (Sweden)

    Jing-Huai Gao

    2009-12-01

    This paper proposes a new method for estimating seismic wavelets. Suppose a seismic wavelet can be modeled by a formula with three free parameters (scale, frequency and phase). We can then transform the estimation of the wavelet into determining these three parameters. The phase of the wavelet is estimated by constant-phase rotation of the seismic signal, while the other two parameters are obtained by the higher-order statistics (HOS) (fourth-order cumulant) matching method. In order to derive the estimator of the HOS, the multivariate scale mixture of Gaussians (MSMG) model is applied to formulate the multivariate joint probability density function (PDF) of the seismic signal. In this way, we can represent HOS as a polynomial function of second-order statistics to improve the anti-noise performance and accuracy. In addition, the proposed method can work well for short time series.
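
    For readers unfamiliar with the fourth-order cumulant mentioned above, the following minimal sketch computes the plain sample estimate at zero lag, c4 = m4 - 3 * m2^2, where m2 and m4 are the second and fourth central moments. This is the textbook moment-based estimator, not the MSMG-based estimator proposed in the paper, and the function name is an assumption made for illustration.

```python
def fourth_order_cumulant(x):
    """Zero-lag fourth-order cumulant of a real-valued sequence.

    Uses c4 = m4 - 3 * m2**2 with central sample moments; the value is
    close to zero for Gaussian data and non-zero otherwise.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(v ** 2 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m4 - 3.0 * m2 ** 2


# A two-valued (strongly non-Gaussian) toy sequence.
print(fourth_order_cumulant([-1.0, 1.0, -1.0, 1.0]))
```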

  20. Assessing the hydrogeochemical processes affecting groundwater pollution in arid areas using an integration of geochemical equilibrium and multivariate statistical techniques

    International Nuclear Information System (INIS)

    El Alfy, Mohamed; Lashin, Aref; Abdalla, Fathy; Al-Bassam, Abdulaziz

    2017-01-01

    Rapid economic expansion poses serious problems for groundwater resources in arid areas, which typically have high rates of groundwater depletion. In this study, an integration of hydrochemical investigations involving chemical and statistical analyses is conducted to assess the factors controlling hydrochemistry and potential pollution in an arid region. Fifty-four groundwater samples were collected from the Dhurma aquifer in Saudi Arabia, and twenty-one physicochemical variables were examined for each sample. Spatial patterns of salinity and nitrate were mapped using fitted variograms. The nitrate spatial distribution shows that nitrate pollution is a persistent problem affecting a wide area of the aquifer. The hydrochemical investigations and cluster analysis reveal four significant clusters of groundwater zones. Five main factors were extracted, which explain >77% of the total data variance. These factors indicated that the chemical characteristics of the groundwater were influenced by rock–water interactions and anthropogenic factors. The identified clusters and factors were validated with hydrochemical investigations. The geogenic factors include the dissolution of various minerals (calcite, aragonite, gypsum, anhydrite, halite and fluorite) and ion exchange processes. The anthropogenic factors include the impact of irrigation return flows and the application of potassium, nitrate, and phosphate fertilizers. Over time, these anthropogenic factors will most likely contribute to further declines in groundwater quality. - Highlights: • Hydrochemical investigations were carried out in Dhurma aquifer in Saudi Arabia. • The factors controlling potential groundwater pollution in an arid region were studied. • Chemical and statistical analyses are integrated to assess these factors. • Five main factors were extracted, which explain >77% of the total data variance. • The chemical characteristics of the groundwater were influenced by rock–water interactions

  1. Using Apparent Density of Paper from Hardwood Kraft Pulps to Predict Sheet Properties, based on Unsupervised Classification and Multivariable Regression Techniques

    Directory of Open Access Journals (Sweden)

    Ofélia Anjos

    2015-07-01

    Paper properties determine the product application potential and depend on the raw material, pulping conditions, and pulp refining. The aim of this study was to construct mathematical models that predict quantitative relations between the paper density and various mechanical and optical properties of the paper. A dataset of properties of paper handsheets produced with pulps of Acacia dealbata, Acacia melanoxylon, and Eucalyptus globulus beaten at 500, 2500, and 4500 revolutions was used. Unsupervised classification techniques were combined to assess the need to build separate prediction models for each species, and multivariable regression techniques were used to establish such prediction models. It was possible to develop models with a high goodness of fit using paper density as the independent variable (or predictor) for all variables except tear index and zero-span tensile strength, both dry and wet.

  2. Nonparametric indices of dependence between components for inhomogeneous multivariate random measures and marked sets

    OpenAIRE

    van Lieshout, Maria Nicolette Margaretha

    2018-01-01

    We propose new summary statistics to quantify the association between the components in coverage-reweighted moment stationary multivariate random sets and measures. They are defined in terms of the coverage-reweighted cumulant densities and extend classic functional statistics for stationary random closed sets. We study the relations between these statistics and evaluate them explicitly for a range of models. Unbiased estimators are given for all statistics and applied to simulated examples a...

  3. Mathematical and Statistical Techniques for Systems Medicine: The Wnt Signaling Pathway as a Case Study

    KAUST Repository

    MacLean, Adam L.; Harrington, Heather A.; Stumpf, Michael P. H.; Byrne, Helen M.

    2015-01-01

    mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since

  4. Models and Inference for Multivariate Spatial Extremes

    KAUST Repository

    Vettori, Sabrina

    2017-12-07

    The development of flexible and interpretable statistical methods is necessary in order to provide appropriate risk assessment measures for extreme events and natural disasters. In this thesis, we address this challenge by contributing to the developing research field of Extreme-Value Theory. We initially study the performance of existing parametric and non-parametric estimators of extremal dependence for multivariate maxima. As the dimensionality increases, non-parametric estimators are more flexible than parametric methods but present some loss in efficiency that we quantify under various scenarios. We introduce a statistical tool which imposes the required shape constraints on non-parametric estimators in high dimensions, significantly improving their performance. Furthermore, by embedding the tree-based max-stable nested logistic distribution in the Bayesian framework, we develop a statistical algorithm that identifies the most likely tree structures representing the data's extremal dependence using the reversible jump Monte Carlo Markov Chain method. A mixture of these trees is then used for uncertainty assessment in prediction through Bayesian model averaging. The computational complexity of full likelihood inference is significantly decreased by deriving a recursive formula for the nested logistic model likelihood. The algorithm performance is verified through simulation experiments which also compare different likelihood procedures. Finally, we extend the nested logistic representation to the spatial framework in order to jointly model multivariate variables collected across a spatial region. This situation emerges often in environmental applications but is not often considered in the current literature.
Simulation experiments show that the new class of multivariate max-stable processes is able to detect both the cross and inner spatial dependence of a number of extreme variables at a relatively low computational cost, thanks to its Bayesian hierarchical

  5. Hydrogeochemical characterization of groundwater of peninsular Indian region using multivariate statistical techniques

    Science.gov (United States)

    Jacintha, T. German Amali; Rawat, Kishan Singh; Mishra, Anoop; Singh, Sudhir Kumar

    2017-10-01

    Groundwater quality of Chennai, Tamil Nadu (India) has been assessed during different seasons of year 2012. Three physical (pH, EC, and TDS) and four chemical parameters (Ca²⁺, Cl⁻, TH, Mg²⁺ and SO₄²⁻) from 18 bore wells were assessed. The results showed that the pH of the majority of groundwater samples indicates a slightly basic condition (7.99 post-monsoon and 8.35 pre-monsoon). TH was slightly hard [322.11 mg/l pre-monsoon, 299.37 mg/l post-monsoon, but within the World Health Organization (WHO) upper limit]. EC, TDS, Ca²⁺ and Mg²⁺ concentrations were under the WHO permissible limit during post-monsoon (1503.42 μS/cm, 1009.37, 66.58 and 32.42 mg/l, respectively) and pre-monsoon (1371.58 μS/cm, 946.84, 71.79 and 34.79 mg/l, respectively). EC shows a good correlation with SO₄²⁻ (R² = 0.59 pre-monsoon, 0.77 post-monsoon), which indicates that SO₄²⁻ plays a major role in the EC of groundwater of bore wells. SO₄²⁻ also showed positive correlations with TDS (R² = 0.84 pre-monsoon, 0.95 post-monsoon) and TH (R² = 0.70 pre-monsoon, 0.75 post-monsoon). The principal component analysis (PCA)/factor analysis (FA) was carried out; Factor1 explains 59.154 and 69.278 % of the total variance during pre- and post-monsoon, respectively, with a strong positive loading on Ca²⁺, Mg²⁺, SO₄²⁻, TDS and a negative loading on pH. Factor2 accounts for 13.94 and 14.22 % of the total variance during pre- and post-monsoon, respectively, and was characterized by strong positive loading of only pH and poor/negative loading of EC, Ca²⁺, Mg²⁺, SO₄²⁻, TDS and TH during pre- and post-monsoon. We recommend routine monitoring and thorough treatment before consumption. Further, this study has demonstrated the effectiveness of PCA/FA to assess the hydrogeochemical processes governing the groundwater chemistry in the area.
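
    The R² values quoted in this abstract are pairwise coefficients of determination. A minimal sketch of that calculation, assuming it is the squared Pearson correlation between two measured variables, is given below. The input numbers are synthetic placeholders, not measurements from the study, and the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Synthetic placeholder values standing in for, e.g., sulfate and EC readings.
sulfate = [1.0, 2.0, 3.0, 4.0, 5.0]
conductivity = [2.0, 4.0, 5.0, 4.0, 5.0]
print(r_squared(sulfate, conductivity))
```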

  6. Data on electrical energy conservation using high efficiency motors for the confidence bounds using statistical techniques.

    Science.gov (United States)

    Shaikh, Muhammad Mujtaba; Memon, Abdul Jabbar; Hussain, Manzoor

    2016-09-01

    In this article, we describe details of the data used in the research paper "Confidence bounds for energy conservation in electric motors: An economical solution using statistical techniques" [1]. The data presented in this paper is intended to show benefits of high efficiency electric motors over the standard efficiency motors of similar rating in the industrial sector of Pakistan. We explain how the data was collected and then processed by means of formulas to show cost effectiveness of energy efficient motors in terms of three important parameters: annual energy saving, cost saving and payback periods. This data can be further used to construct confidence bounds for the parameters using statistical techniques as described in [1].
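
    The three parameters named above can be related by a standard textbook calculation, sketched below. It assumes that both motors deliver the same shaft power, so the saving comes only from the efficiency difference. The formulas, function names and input values are illustrative assumptions and are not taken from the cited paper [1].

```python
def annual_energy_saving_kwh(rated_kw, load_factor, hours_per_year,
                             eff_standard, eff_high):
    """Energy saved per year when a standard motor is replaced by a
    high-efficiency motor delivering the same shaft power."""
    shaft_kw = rated_kw * load_factor
    return shaft_kw * hours_per_year * (1.0 / eff_standard - 1.0 / eff_high)


def payback_years(price_premium, energy_saving_kwh, tariff_per_kwh):
    """Simple payback period: extra purchase cost / annual cost saving."""
    annual_cost_saving = energy_saving_kwh * tariff_per_kwh
    return price_premium / annual_cost_saving


# Illustrative inputs only: 10 kW motor, full load, 4000 h/year.
saving = annual_energy_saving_kwh(10.0, 1.0, 4000.0, 0.80, 0.90)
print(saving)                              # kWh per year
print(payback_years(1000.0, saving, 0.10))  # years
```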

  7. Discrimination of source reactor type by multivariate statistical analysis of uranium and plutonium isotopic concentrations in unknown irradiated nuclear fuel material.

    Science.gov (United States)

    Robel, Martin; Kristo, Michael J

    2008-11-01

    The problem of identifying the provenance of unknown nuclear material in the environment by multivariate statistical analysis of its uranium and/or plutonium isotopic composition is considered. Such material can be introduced into the environment as a result of nuclear accidents, inadvertent processing losses, illegal dumping of waste, or deliberate trafficking in nuclear materials. Various combinations of reactor type and fuel composition were analyzed using Principal Components Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLSDA) of the concentrations of nine U and Pu isotopes in fuel as a function of burnup. Real-world variation in the concentrations of ²³⁴U and ²³⁶U in the fresh (unirradiated) fuel was incorporated. The U and Pu were also analyzed separately, with results that suggest that, even after reprocessing or environmental fractionation, Pu isotopes can be used to determine both the source reactor type and the initial fuel composition with good discrimination.

  8. Discrimination between glycosylation patterns of therapeutic antibodies using a microfluidic platform, MALDI-MS and multivariate statistics.

    Science.gov (United States)

    Thuy, Tran Thi; Tengstrand, Erik; Aberg, Magnus; Thorsén, Gunnar

    2012-11-01

    Optimal glycosylation with respect to the efficacy, serum half-life time, and immunogenic properties is essential in the generation of therapeutic antibodies. The glycosylation pattern can be affected by several different parameters during the manufacture of antibodies and may change significantly over cultivation time. Fast and robust methods for determination of the glycosylation patterns of therapeutic antibodies are therefore needed. We have recently presented an efficient method for the determination of glycans on therapeutic antibodies using a microfluidic CD platform for sample preparation prior to matrix-assisted laser-desorption mass spectrometry analysis. In the present work, this method is applied to analyse the glycosylation patterns of three commercially available therapeutic antibodies and one intended for therapeutic use. Two of the antibodies produced in mouse myeloma cell line (SP2/0) and one produced in Chinese hamster ovary (CHO) cells exhibited similar glycosylation patterns but could still be readily differentiated from each other using multivariate statistical methods. The two antibodies with most similar glycosylation patterns were also studied in an assessment of the method's applicability for quality control of therapeutic antibodies. The method presented in this paper is highly automated and rapid. It can therefore efficiently generate data that helps to keep a production process within the desired design space or assess that an identical product is being produced after changes to the process. Copyright © 2012 Elsevier B.V. All rights reserved.

  9. Statistical analysis of Thematic Mapper Simulator data for the geobotanical discrimination of rock types in southwest Oregon

    Science.gov (United States)

    Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.

    1984-01-01

    An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.

  10. Multivariate return periods of sea storms for coastal erosion risk assessment

    Directory of Open Access Journals (Sweden)

    S. Corbella

    2012-08-01

    The erosion of a beach depends on various storm characteristics. Ideally, the risk associated with a storm would be described by a single multivariate return period that is also representative of the erosion risk, i.e. a 100 yr multivariate storm return period would cause a 100 yr erosion return period. Unfortunately, a specific probability level may be associated with numerous combinations of storm characteristics. These combinations, despite having the same multivariate probability, may cause very different erosion outcomes. This paper explores this ambiguity problem in the context of copula-based multivariate return periods, using a case study at Durban on the east coast of South Africa. Simulations were used to correlate multivariate return periods of historical events to return periods of estimated storm-induced erosion volumes. In addition, the relationship of the most-likely design event (Salvadori et al., 2011) to coastal erosion was investigated. It was found that the multivariate return periods for wave height and duration had the highest correlation to erosion return periods. The most-likely design event was found to be an inadequate design method in its current form. We explore the inclusion of conditions based on the physical realizability of wave events and the use of multivariate linear regression to relate storm parameters to erosion computed from a process-based model. Establishing a link between storm statistics and erosion consequences can resolve the ambiguity between multivariate storm return periods and associated erosion return periods.
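
    To make the notion of a copula-based bivariate return period concrete, the sketch below computes the usual "OR" and "AND" return periods, T_or = mu / (1 - C(u, v)) and T_and = mu / (1 - u - v + C(u, v)), where u and v are the marginal non-exceedance probabilities and mu is the mean interarrival time of storms. The Gumbel copula and all numeric values are assumptions chosen for illustration; the abstract does not state which copula family the authors used.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 gives independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def return_periods(u, v, theta, mu=1.0):
    """Return (T_or, T_and) in the same time unit as mu.

    T_or:  either variable exceeds its threshold.
    T_and: both variables exceed their thresholds.
    """
    c = gumbel_copula(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and


# Same marginal levels, with and without dependence (illustrative values).
print(return_periods(0.9, 0.9, theta=1.0))
print(return_periods(0.9, 0.9, theta=2.0))
```

    The two calls share the same marginal probabilities but give different joint return periods, which is one face of the ambiguity the abstract describes.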

  11. Multivariable control in nuclear power stations - survey of design methods

    International Nuclear Information System (INIS)

    Mcmorran, P.D.

    1979-12-01

    The development of larger nuclear generating stations increases the importance of dynamic interaction between controllers, because each control action may affect several plant outputs. Multivariable control provides the techniques to design controllers which perform well under these conditions. This report is a foundation for further work on the application of multivariable control in AECL. It covers the requirements of control and the fundamental mathematics used, then reviews the most important linear methods, based on both state-space and frequency-response concepts. State-space methods are derived from analysis of the system differential equations, while frequency-response methods use the input-output transfer function. State-space methods covered include linear-quadratic optimal control, pole shifting, and the theory of state observers and estimators. Frequency-response methods include the inverse Nyquist array method, and classical non-interactive techniques. Transfer-function methods are particularly emphasized since they can incorporate ill-defined design criteria. The underlying concepts, and the application strengths and weaknesses of each design method are presented. A review of significant applications is also given. It is concluded that the inverse Nyquist array method, a frequency-response technique based on inverse transfer-function matrices, is preferred for the design of multivariable controllers for nuclear power plants. This method may be supplemented by information obtained from a modal analysis of the plant model. (auth)

  12. Multivariate methods for analysis of environmental reference materials using laser-induced breakdown spectroscopy

    Directory of Open Access Journals (Sweden)

    Shikha Awasthi

    2017-06-01

    Analysis of emission from laser-induced plasma has a unique capability for quantifying the major and minor elements present in any type of sample under optimal analysis conditions. Chemometric techniques are very effective and reliable tools for quantification of multiple components in complex matrices. The feasibility of laser-induced breakdown spectroscopy (LIBS) in combination with multivariate analysis was investigated for the analysis of environmental reference materials (RMs). In the present work, different (Certified/Standard) Reference Materials of soil and plant origin were analyzed using LIBS, and the presence of Al, Ca, Mg, Fe, K, Mn and Si was identified in the LIBS spectra of these materials. Multivariate statistical methods (Partial Least Square Regression and Partial Least Square Discriminant Analysis) were employed for quantitative analysis of the constituent elements using the LIBS spectral data. Calibration models were used to predict the concentrations of the different elements in test samples, and the predicted concentrations were subsequently compared with certified concentrations to check the authenticity of the models. The non-destructive analytical method Instrumental Neutron Activation Analysis (INAA), using high-flux reactor neutrons and high-resolution gamma-ray spectrometry, was also used for intercomparison of the LIBS results for two RMs.

  13. Determination of dominant biogeochemical processes in a contaminated aquifer-wetland system using multivariate statistical analysis

    Science.gov (United States)

    Baez-Cazull, S. E.; McGuire, J.T.; Cozzarelli, I.M.; Voytek, M.A.

    2008-01-01

    Determining the processes governing aqueous biogeochemistry in a wetland hydrologically linked to an underlying contaminated aquifer is challenging due to the complex exchange between the systems and their distinct responses to changes in precipitation, recharge, and biological activities. To evaluate temporal and spatial processes in the wetland-aquifer system, water samples were collected using cm-scale multichambered passive diffusion samplers (peepers) to span the wetland-aquifer interface over a period of 3 yr. Samples were analyzed for major cations and anions, methane, and a suite of organic acids resulting in a large dataset of over 8000 points, which was evaluated using multivariate statistics. Principal component analysis (PCA) was chosen with the purpose of exploring the sources of variation in the dataset to expose related variables and provide insight into the biogeochemical processes that control the water chemistry of the system. Factor scores computed from PCA were mapped by date and depth. Patterns observed suggest that (i) fermentation is the process controlling the greatest variability in the dataset and it peaks in May; (ii) iron and sulfate reduction were the dominant terminal electron-accepting processes in the system and were associated with fermentation but had more complex seasonal variability than fermentation; (iii) methanogenesis was also important and associated with bacterial utilization of minerals as a source of electron acceptors (e.g., barite BaSO4); and (iv) seasonal hydrological patterns (wet and dry periods) control the availability of electron acceptors through the reoxidation of reduced iron-sulfur species enhancing iron and sulfate reduction. Copyright © 2008 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved.

  14. An Outlyingness Matrix for Multivariate Functional Data Classification

    KAUST Repository

    Dai, Wenlin; Genton, Marc G.

    2017-01-01

    outlyingness with conventional statistical depth. We propose two classifiers based on directional outlyingness and the outlyingness matrix, respectively. Our classifiers provide better performance compared with existing depth-based classifiers when applied on both univariate and multivariate functional data from simulation studies. We also test our methods on two data problems: speech recognition and gesture classification, and obtain results that are consistent with the findings from the simulated data.

  15. Lasso and probabilistic inequalities for multivariate point processes

    OpenAIRE

    Hansen, Niels Richard; Reynaud-Bouret, Patricia; Rivoirard, Vincent

    2012-01-01

    Due to its low computational cost, Lasso is an attractive regularization method for high-dimensional statistical settings. In this paper, we consider multivariate counting processes depending on an unknown function parameter to be estimated by linear combinations of a fixed dictionary. To select coefficients, we propose an adaptive $\ell_{1}$-penalization methodology, where data-driven weights of the penalty are derived from new Bernstein type inequalities for martingales. Oracle inequalities...

  16. Input saturation in nonlinear multivariable processes resolved by nonlinear decoupling

    Directory of Open Access Journals (Sweden)

    Jens G. Balchen

    1995-04-01

    A new method is presented for the resolution of the problem of input saturation in nonlinear multivariable process control by means of elementary nonlinear decoupling (END). Input saturation can have serious consequences particularly in multivariable control because it may lead to very undesirable system behaviour and quite often system instability. Many authors have searched for systematic techniques for designing multivariable control systems in which saturation may occur in any of the control variables (inputs, manipulated variables). No generally accepted method seems to have been presented so far which gives a solution in closed form. The method of elementary nonlinear decoupling (END) can be applied directly to the case of saturation control variables by deriving as many control strategies as there are combinations of saturating control variables. The method is demonstrated by the multivariable control of a simulated Fluidized Catalytic Cracker (FCC) with very convincing results.

  17. Modern nonparametric, robust and multivariate methods: festschrift in honour of Hannu Oja

    CERN Document Server

    Taskinen, Sara

    2015-01-01

    Written by leading experts in the field, this edited volume brings together the latest findings in the area of nonparametric, robust and multivariate statistical methods. The individual contributions cover a wide variety of topics ranging from univariate nonparametric methods to robust methods for complex data structures. Some examples from statistical signal processing are also given. The volume is dedicated to Hannu Oja on the occasion of his 65th birthday and is intended for researchers as well as PhD students with a good knowledge of statistics.

  18. Studies on coal flotation in flotation column using statistical technique

    Energy Technology Data Exchange (ETDEWEB)

    M.S. Jena; S.K. Biswal; K.K. Rao; P.S.R. Reddy [Institute of Minerals & Materials Technology (IMMT), Orissa (India)]

    2009-07-01

    Flotation of Indian high ash coking coal fines to obtain clean coal has been reported earlier by many authors. Here an attempt has been made to systematically analyse factors influencing the flotation process using statistical design of experiments technique. Studies carried out in a 100 mm diameter column using factorial design to establish weightage of factors such as feed rate, air rate and collector dosage indicated that all three parameters have equal influence on the flotation process. Subsequently RSM-CCD design was used to obtain best result and it is observed that 94% combustibles can be recovered with 82.5% weight recovery at 21.4% ash from a feed containing 31.3% ash content.
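
    As a hedged illustration of how a two-level factorial design ranks factors, the sketch below builds a full 2^3 design in coded units (-1, +1) for three factors and computes each main effect as the difference between the mean response at the high and low levels. The response values are synthetic, generated from a made-up linear function in which all three factors contribute equally. They are not data from the study, and the function name is hypothetical.

```python
from itertools import product


def main_effects(design, responses):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    n_factors = len(design[0])
    effects = []
    for j in range(n_factors):
        high = [y for run, y in zip(design, responses) if run[j] == +1]
        low = [y for run, y in zip(design, responses) if run[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Full 2^3 design in coded units: (feed rate, air rate, collector dosage).
design = list(product([-1, +1], repeat=3))

# Synthetic responses from a made-up model with equal factor weights.
responses = [10 + 2 * a + 2 * b + 2 * c for a, b, c in design]

effects = main_effects(design, responses)
print(effects)
```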

  19. The Dirichlet-Multinomial model for multivariate randomized response data and small samples

    NARCIS (Netherlands)

    Avetisyan, Marianna; Fox, Gerardus J.A.

    2012-01-01

    In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The

  20. Clinical Decision Support: Statistical Hopes and Challenges

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Zvárová, Jana

    2016-01-01

    Vol. 4, No. 1 (2016), pp. 30-34. ISSN 1805-8698. Grant - others: Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985807. Keywords: decision support * data mining * multivariate statistics * psychiatry * information based medicine. Subject RIV: BB - Applied Statistics, Operational Research

  1. Ultimate compression after impact load prediction in graphite/epoxy coupons using neural network and multivariate statistical analyses

    Science.gov (United States)

    Gregoire, Alexandre David

    2011-07-01

    The goal of this research was to accurately predict the ultimate compressive load of impact damaged graphite/epoxy coupons using a Kohonen self-organizing map (SOM) neural network and multivariate statistical regression analysis (MSRA). An optimized use of these data treatment tools allowed the generation of a simple, physically understandable equation that predicts the ultimate failure load of an impacted damaged coupon based uniquely on the acoustic emissions it emits at low proof loads. Acoustic emission (AE) data were collected using two 150 kHz resonant transducers which detected and recorded the AE activity given off during compression to failure of thirty-four impacted 24-ply bidirectional woven cloth laminate graphite/epoxy coupons. The AE quantification parameters duration, energy and amplitude for each AE hit were input to the Kohonen self-organizing map (SOM) neural network to accurately classify the material failure mechanisms present in the low proof load data. The number of failure mechanisms from the first 30% of the loading for twenty-four coupons were used to generate a linear prediction equation which yielded a worst case ultimate load prediction error of 16.17%, just outside of the +/-15% B-basis allowables, which was the goal for this research. Particular emphasis was placed upon the noise removal process which was largely responsible for the accuracy of the results.

  2. Fragments analysis of Marajoara pubic covers using a portable system of X-ray fluorescence and multivariate statistics

    International Nuclear Information System (INIS)

    Freitas, Renato; Rabello, Angela; Lima, Tania

    2011-01-01

    In this work, the elemental composition of 102 fragments of Marajoara pubic covers belonging to the National Museum collection was characterized using EDXRF and multivariate statistical analysis. The objective was to identify possible groups of samples that presented similar characteristics. This information will be useful in the development of a systematic classification of these artifacts. Provenance studies of ancient ceramics are based on the assumption that pottery produced from a specific clay will present a similar chemical composition, which will distinguish it from pottery produced from a different clay. In this way, the pottery is assigned to particular production groups, which are then correlated with their respective origins. EDXRF measurements were carried out with a portable system, developed in the Nuclear Instrumentation Laboratory, consisting of an Oxford TF3005 X-ray tube with a tungsten (W) anode, operating at 25 kV and 100 μA, and a Si-PIN XR-100CR detector from Amptek. In each of the 102 fragments, six points were analyzed (three on the front and three on the reverse) with an acquisition time of 600 s and a beam collimation of 2 mm. The spectra were processed and analyzed using the QXAS-AXIL software from the IAEA. PCA was applied to the XRF results, revealing a clear cluster separation of the samples. (author)

  3. Bootstrap-based confidence estimation in PCA and multivariate statistical process control

    DEFF Research Database (Denmark)

    Babamoradi, Hamid

    be used to detect outliers in the data since the outliers can distort the bootstrap estimates. Bootstrap-based confidence limits were suggested as alternative to the asymptotic limits for control charts and contribution plots in MSPC (Paper II). The results showed that in case of the Q-statistic......Traditional/Asymptotic confidence estimation has limited applicability since it needs statistical theories to estimate the confidences, which are not available for all indicators/parameters. Furthermore, in case the theories are available for a specific indicator/parameter, the theories are based....... The goal was to improve process monitoring by improving the quality of MSPC charts and contribution plots. Bootstrapping algorithm to build confidence limits was illustrated in a case study format (Paper I). The main steps in the algorithm were discussed where a set of sensible choices (plus...

  4. An Analysis of Research Methods and Statistical Techniques Used by Doctoral Dissertation at the Education Sciences in Turkey

    Science.gov (United States)

    Karadag, Engin

    2010-01-01

    To assess research methods and analysis of statistical techniques employed by educational researchers, this study surveyed unpublished doctoral dissertations from 2003 to 2007. Frequently used research methods consisted of experimental research; a survey; a correlational study; and a case study. Descriptive statistics, t-test, ANOVA, factor…

  5. Spatio-temporal patterns and source apportionment of pollution in Qiantang River (China) using neural-based modeling and multivariate statistical techniques

    Science.gov (United States)

    Su, Shiliang; Zhi, Junjun; Lou, Liping; Huang, Fang; Chen, Xia; Wu, Jiaping

    Characterizing the spatio-temporal patterns and apportioning the pollution sources of water bodies are important for the management and protection of water resources. The main objective of this study is to describe the dynamics of water quality and provide references for improving river pollution control practices. Comprehensive application of neural-based modeling and different multivariate methods was used to evaluate the spatio-temporal patterns and source apportionment of pollution in Qiantang River, China. Measurement data were obtained and pretreated for 13 variables from 41 monitoring sites for the period of 2001-2004. A self-organizing map classified the 41 monitoring sites into three groups (Group A, B and C), representing different pollution characteristics. Four significant parameters (dissolved oxygen, biochemical oxygen demand, total phosphorus and total lead) were identified by discriminant analysis for distinguishing variations of different years, with about 80% correct assignment for temporal variation. Rotated principal component analysis (PCA) identified four potential pollution sources for Group A (domestic sewage and agricultural pollution, industrial wastewater pollution, mineral weathering, vehicle exhaust and sand mining), five for Group B (heavy metal pollution, agricultural runoff, vehicle exhaust and sand mining, mineral weathering, chemical plants discharge) and another five for Group C (vehicle exhaust and sand mining, chemical plants discharge, soil weathering, biochemical pollution, mineral weathering). The identified potential pollution sources explained 75.6% of the total variances for Group A, 75.0% for Group B and 80.0% for Group C, respectively. Receptor-based source apportionment was applied to further estimate source contributions for each pollution variable in the three groups, which facilitated and supported the PCA results. These results could assist managers to develop optimal strategies and determine priorities for river

  6. Statistical techniques applied to aerial radiometric surveys (STAARS): series introduction and the principal-components-analysis method

    International Nuclear Information System (INIS)

    Pirkle, F.L.

    1981-04-01

    STAARS is a new series which is being published to disseminate information concerning statistical procedures for interpreting aerial radiometric data. The application of a particular data interpretation technique to geologic understanding for delineating regions favorable to uranium deposition is the primary concern of STAARS. Statements concerning the utility of a technique on aerial reconnaissance data as well as detailed aerial survey data will be included

  7. Integrated Application of Multivariate Statistical Methods to Source Apportionment of Watercourses in the Liao River Basin, Northeast China

    Directory of Open Access Journals (Sweden)

    Jiabo Chen

    2016-10-01

    Source apportionment of river water pollution is critical in water resource management and aquatic conservation. Comprehensive application of various GIS-based multivariate statistical methods was performed to analyze datasets (2009–2011) on water quality in the Liao River system (China). Cluster analysis (CA) classified the 12 months of the year into three groups (May–October, February–April and November–January) and the 66 sampling sites into three groups (groups A, B and C) based on similarities in water quality characteristics. Discriminant analysis (DA) determined that temperature, dissolved oxygen (DO), pH, chemical oxygen demand (CODMn), 5-day biochemical oxygen demand (BOD5), NH4+–N, total phosphorus (TP) and volatile phenols were significant variables affecting temporal variations, with 81.2% correct assignments. Principal component analysis (PCA) and positive matrix factorization (PMF) identified eight potential pollution factors for each part of the data structure, explaining more than 61% of the total variance. Oxygen-consuming organics from cropland and woodland runoff were the main latent pollution factor for group A. For group B, the main pollutants were oxygen-consuming organics, oil, nutrients and fecal matter. For group C, the evaluated pollutants primarily included oxygen-consuming organics, oil and toxic organics.

  8. Hydrochemical Characteristics and Multivariate Statistical Analysis of Natural Water System: A Case Study in Kangding County, Southwestern China

    Directory of Open Access Journals (Sweden)

    Yunhui Zhang

    2018-01-01

    The utilization of water resources has been of great concern to human life. To assess the natural water system in Kangding County, the integrated methods of hydrochemical analysis, multivariate statistics and geochemical modelling were applied to surface water, groundwater, and thermal water samples. Surface water and groundwater were dominated by the Ca-HCO3 type, while thermal water belonged to the Ca-HCO3 and Na-Cl-SO4 types. The analysis identified the driving factors that affect hydrochemical components. According to the combined assessments, the hydrochemical process was controlled by the dissolution of carbonate and silicate minerals, with slight influence from anthropogenic activity. The mixing model of groundwater and thermal water was calculated using the silica-enthalpy method, yielding a cold-water fraction of 0.56–0.79 and an estimated reservoir temperature of 130–199 °C. δD and δ18O isotopes suggested that surface water, groundwater and thermal springs were of meteoric origin. Thermal water should have deep circulation through the Xianshuihe fault zone, while groundwater flows through secondary fractures where it recharges with thermal water. These analytical results were used to construct a hydrological conceptual model, providing a better understanding of the natural water system in Kangding County.

  9. Music Genre Classification using the multivariate AR feature integration model

    DEFF Research Database (Denmark)

    Ahrendt, Peter; Meng, Anders

    2005-01-01

    informative decisions about musical genre. For the MIREX music genre contest several authors derive long time features based either on statistical moments and/or temporal structure in the short time features. In our contribution we model a segment (1.2 s) of short time features (texture) using a multivariate...... autoregressive model. Other authors have applied simpler statistical models such as the mean-variance model, which has also been included in several of this year's MIREX submissions, see e.g. Tzanetakis (2005); Burred (2005); Bergstra et al. (2005); Lidy and Rauber (2005)....

  10. Handbook of univariate and multivariate data analysis with IBM SPSS

    CERN Document Server

    Ho, Robert

    2013-01-01

    Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows. New to the Second Edition: Three new chapters on multiple discriminant analysis, logistic regression, and canonical correlation. New section on how to deal with missing data. Coverage of te

  11. Using multivariate analyses and GIS to identify pollutants and their spatial patterns in urban soils in Galway, Ireland

    International Nuclear Information System (INIS)

    Zhang Chaosheng

    2006-01-01

    Galway is a small but rapidly growing tourism city in western Ireland. To evaluate its environmental quality, a total of 166 surface soil samples (0-10 cm depth) were collected from parks and grasslands at a density of 1 sample per 0.25 km² at the end of 2004. All samples were analysed using ICP-AES for the near-total concentrations of 26 chemical elements. Multivariate statistics and GIS techniques were applied to classify the elements and to identify elements influenced by human activities. Cluster analysis (CA) and principal component analysis (PCA) classified the elements into two groups: the first group predominantly derived from natural sources, the second being influenced by human activities. GIS mapping is a powerful tool in identifying the possible sources of pollutants. Relatively high concentrations of Cu, Pb and Zn were found in the city centre, old residential areas, and along major traffic routes, showing significant effects of traffic pollution. The element As is enriched in soils of the old built-up areas, which can be attributed to coal and peat combustion for home heating. Such significant spatial patterns of pollutants displayed by urban soils may imply a potential health threat to residents of the contaminated areas of the city. - Multivariate statistics and GIS are useful tools to identify pollutants in urban soils

  12. Multivariate statistical monitoring as applied to clean-in-place (CIP) and steam-in-place (SIP) operations in biopharmaceutical manufacturing.

    Science.gov (United States)

    Roy, Kevin; Undey, Cenk; Mistretta, Thomas; Naugle, Gregory; Sodhi, Manbir

    2014-01-01

    Multivariate statistical process monitoring (MSPM) is becoming increasingly utilized to further enhance process monitoring in the biopharmaceutical industry. MSPM can play a critical role when there are many measurements and these measurements are highly correlated, as is typical for many biopharmaceutical operations. Specifically, for processes such as cleaning-in-place (CIP) and steaming-in-place (SIP, also known as sterilization-in-place), control systems typically oversee the execution of the cycles, and verification of the outcome is based on offline assays. These offline assays add to delays and corrective actions may require additional setup times. Moreover, this conventional approach does not take interactive effects of process variables into account and cycle optimization opportunities as well as salient trends in the process may be missed. Therefore, more proactive and holistic online continued verification approaches are desirable. This article demonstrates the application of real-time MSPM to processes such as CIP and SIP with industrial examples. The proposed approach has significant potential for facilitating enhanced continuous verification, improved process understanding, abnormal situation detection, and predictive monitoring, as applied to CIP and SIP operations. © 2014 American Institute of Chemical Engineers.

  13. Growth curve models and statistical diagnostics

    CERN Document Server

    Pan, Jian-Xin

    2002-01-01

    Growth-curve models are generalized multivariate analysis-of-variance models. These models are especially useful for investigating growth problems on short times in economics, biology, medical research, and epidemiology. This book systematically introduces the theory of the GCM with particular emphasis on their multivariate statistical diagnostics, which are based mainly on recent developments made by the authors and their collaborators. The authors provide complete proofs of theorems as well as practical data sets and MATLAB code.

  14. Effectiveness of Multivariate Time Series Classification Using Shapelets

    Directory of Open Access Journals (Sweden)

    A. P. Karpenko

    2015-01-01

    Typically, time series classifiers require signal pre-processing (filtering signals from noise, artifact removal, etc.), enhancement of signal features (amplitude, frequency, spectrum, etc.), and classification of signal features in space using the classical techniques and classification algorithms of multivariate data. We consider a method of classifying time series which does not require enhancement of the signal features. The method uses the shapelets of a time series (time series shapelets), i.e. small fragments of the series that best reflect the properties of one of its classes. Despite the significant number of publications on the theory and applications of shapelets for classification of time series, the task of evaluating the effectiveness of this technique remains relevant. An objective of this publication is to study the effectiveness of a number of modifications of the original shapelet method as applied to multivariate series classification, which is a little-studied problem. The paper presents the problem statement of multivariate time series classification using shapelets and describes the basic shapelet-based method of binary classification, as well as various generalizations and a proposed modification of the method. It also offers software that implements the modified method and results of computational experiments confirming the effectiveness of the algorithmic and software solutions. The paper shows that the modified method and the software implementing it allow a classification accuracy of about 85% to be reached, at best. The shapelet search time increases in proportion to the input data dimension.

  15. Displaying an Outlier in Multivariate Data | Gordor | Journal of ...

    African Journals Online (AJOL)

    ... a multivariate data set is proposed. The technique involves the projection of the multidimensional data onto a single dimension called the outlier displaying component. When the observations are plotted on this component the outlier is appreciably revealed. Journal of Applied Science and Technology (JAST), Vol. 4, Nos.

  16. Multi-Scale Pixel-Based Image Fusion Using Multivariate Empirical Mode Decomposition

    Directory of Open Access Journals (Sweden)

    Naveed ur Rehman

    2015-05-01

    A novel scheme to perform the fusion of multiple images using the multivariate empirical mode decomposition (MEMD) algorithm is proposed. Standard multi-scale fusion techniques make a priori assumptions regarding input data, whereas standard univariate empirical mode decomposition (EMD)-based fusion techniques suffer from inherent mode mixing and mode misalignment issues, characterized respectively by either a single intrinsic mode function (IMF) containing multiple scales or the same indexed IMFs corresponding to multiple input images carrying different frequency information. We show that MEMD overcomes these problems by being fully data adaptive and by aligning common frequency scales from multiple channels, thus enabling their comparison at a pixel level and subsequent fusion at multiple data scales. We then demonstrate the potential of the proposed scheme on a large dataset of real-world multi-exposure and multi-focus images and compare the results against those obtained from standard fusion algorithms, including the principal component analysis (PCA), discrete wavelet transform (DWT) and non-subsampled contourlet transform (NCT). A variety of image fusion quality measures are employed for the objective evaluation of the proposed method. We also report the results of a hypothesis testing approach on our large image dataset to identify statistically-significant performance differences.

  17. A Review of Statistical Techniques for 2x2 and RxC Categorical Data Tables In SPSS

    Directory of Open Access Journals (Sweden)

    Cengiz BAL

    2009-11-01

    In this study, statistical techniques for RxC categorical data tables are reviewed in detail. Emphasis is given to matching techniques with their corresponding data considerations. Some suggestions on how to handle specific categorical data tables in SPSS, and common mistakes in the interpretation of the SPSS outputs, are shown.

  18. Principal Feature Analysis: A Multivariate Feature Selection Method for fMRI Data

    Directory of Open Access Journals (Sweden)

    Lijun Wang

    2013-01-01

    Brain decoding with functional magnetic resonance imaging (fMRI) requires analysis of complex, multivariate data. Multivoxel pattern analysis (MVPA) has been widely used in recent years. MVPA treats the activation of multiple voxels from fMRI data as a pattern and decodes brain states using pattern classification methods. Feature selection is a critical procedure of MVPA because it decides which features will be included in the classification analysis of fMRI data, thereby improving the performance of the classifier. Features can be selected by limiting the analysis to specific anatomical regions or by computing univariate (voxel-wise) or multivariate statistics. However, these methods either discard some informative features or select features with redundant information. This paper introduces the principal feature analysis as a novel multivariate feature selection method for fMRI data processing. This multivariate approach aims to remove features with redundant information, thereby selecting fewer features, while retaining the most information.

  19. Finer discrimination of brain activation with local multivariate distance

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    The organization of human brain function is diverse on different spatial scales. Various cognitive states are always represented as distinct activity patterns across specific brain regions on fine scales. Conventional univariate analysis of functional MRI data seeks to determine how a particular cognitive state is encoded in brain activity by analyzing each voxel separately, without considering the fine-scale pattern information contained in local brain regions. In this paper, a local multivariate distance mapping (LMDM) technique is proposed to detect brain activation and to map fine-scale brain activity patterns. LMDM directly represents local brain activity with the patterns across multiple voxels rather than individual voxels, and it employs the multivariate distance between different patterns to discriminate the brain state on fine scales. Experiments with simulated and real fMRI data demonstrate that the LMDM technique can dramatically increase the sensitivity of detection of fine-scale brain activity patterns, which contain subtle information about the experimental conditions.

  20. Micro-Raman Imaging for Biology with Multivariate Spectral Analysis

    KAUST Repository

    Malvaso, Federica

    2015-05-05

    Raman spectroscopy is a noninvasive technique that can provide complex information on the vibrational state of molecules. It defines a unique fingerprint that allows the identification of the various chemical components within a given sample. The aim of this thesis work is to analyze Raman maps of three pairs of different cells, highlighting differences and similarities through multivariate algorithms. The first pair of analyzed cells are human embryonic stem cells (hESCs), while the other two pairs are induced pluripotent stem cells (iPSCs) derived from T lymphocytes and keratinocytes, respectively. Although two different multivariate techniques were employed, i.e. Principal Component Analysis and Cluster Analysis, the same results were achieved: the iPSCs derived from T lymphocytes show a higher content of genetic material compared with both the iPSCs derived from keratinocytes and the hESCs. It was equally evident that iPSCs derived from keratinocytes assume a molecular distribution very similar to that of hESCs.

  1. Hydrochemical evaluation of groundwater in the Blue Nile Basin, eastern Sudan, using conventional and multivariate techniques

    Science.gov (United States)

    Hussein, Mohammed Tahir

    Hydrochemical evaluation of groundwater systems can be carried out using conventional and multivariate techniques, namely cluster and factor analyses, and others such as correspondence analysis. The main objective of this study is to investigate the groundwater quality in the Blue Nile basin of eastern Sudan, and to work out a hydrochemical evaluation for the aquifer system. Conventional methods and multivariate techniques were applied to achieve these goals. Two water-bearing layers exist in the study area: the Nubian Sandstone Formation and the Al-Atshan Formation. The Nubian aquifer is recharged mainly from the Blue Nile and Dinder Rivers through lateral subsurface flow and through direct rainfall in outcrop areas. The Al-Atshan aquifer receives water through underground flow from River Rahad and from rainfall infiltration. The prevailing hydrochemical processes are simple dissolution, mixing, partial ion exchange and ion exchange. Limited reverse ion exchange has been witnessed in the Nubian aquifer. Three factors control the overall mineralization and water quality of the Blue Nile Basin. The first factor includes high values of total dissolved solids, electrical conductivity, sodium, potassium, chloride, bicarbonate, sulphate and magnesium. The second factor includes calcium and pH. The third factor is due to fluoride concentration in the groundwater. The study highlights the descriptive capabilities of conventional and multivariate techniques as effective tools in groundwater quality evaluation.

  2. Hydrochemical analysis of groundwater using multivariate statistical methods - The Volta region, Ghana

    Science.gov (United States)

    Banoeng-Yakubo, B.; Yidana, S.M.; Nti, E.

    2009-01-01

    Q- and R-mode multivariate statistical analyses were applied to groundwater chemical data from boreholes and wells in the northern section of the Volta region, Ghana. The objective was to determine the processes that affect the hydrochemistry and the variation of these processes in space among the three main geological terrains: the Buem formation, Voltaian System and the Togo series that underlie the area. The analyses revealed three zones in the groundwater flow system: recharge, intermediate and discharge regions. All three zones are clearly different with respect to all the major chemical parameters, with concentrations increasing from the perceived recharge areas through the intermediate regions to the discharge areas. R-mode HCA and factor analysis (using varimax rotation and the Kaiser criterion) were then applied to determine the significant sources of variation in the hydrochemistry. This study finds that groundwater hydrochemistry in the area is controlled by the weathering of silicate and carbonate minerals, as well as the chemistry of infiltrating precipitation. This study finds that the δD and δ18O data from the area fall along the Global Meteoric Water Line (GMWL). A regression equation derived for the relationship between δD and δ18O closely resembles the equation that describes the GMWL. On the basis of this, groundwater in the study area is probably meteoric and fresh. The apparently low salinities and sodicities of the groundwater seem to support this interpretation. The suitability of groundwater for domestic and irrigation purposes is related to its source, which determines its constitution. A plot of the sodium adsorption ratio (SAR) and salinity (EC) data on a semilog axis suggests that groundwater in the area is of good irrigation quality.
Sixty percent (60%), 20% and 20% of the 67 data points used in this study fall within the medium salinity - low sodicity (C2-S1), low salinity -low sodicity (C1-S1) and high salinity - low
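
    The sodium adsorption ratio mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definition SAR = Na / sqrt((Ca + Mg) / 2), with all concentrations in meq/L; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def sodium_adsorption_ratio(na_meq, ca_meq, mg_meq):
    """Return SAR for ion concentrations given in meq/L.

    Assumes the common definition SAR = Na / sqrt((Ca + Mg) / 2);
    the abstract does not state which variant the authors used.
    """
    divalent = ca_meq + mg_meq
    if divalent <= 0:
        raise ValueError("Ca + Mg must be positive")
    return na_meq / math.sqrt(divalent / 2.0)


# Illustrative values only (not data from the study).
sar = sodium_adsorption_ratio(na_meq=4.0, ca_meq=3.0, mg_meq=5.0)
print(f"SAR = {sar:.2f}")
```

    Each sample's SAR would then be plotted against its electrical conductivity to assign a combined salinity and sodicity class such as C2-S1.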

  3. Statistical Pattern Recognition

    CERN Document Server

    Webb, Andrew R

    2011-01-01

    Statistical pattern recognition relates to the use of statistical techniques for analysing data measurements in order to extract information and make justified decisions.  It is a very active area of study and research, which has seen many advances in recent years. Applications such as data mining, web searching, multimedia data retrieval, face recognition, and cursive handwriting recognition, all require robust and efficient pattern recognition techniques. This third edition provides an introduction to statistical pattern theory and techniques, with material drawn from a wide range of fields,

  4. Bayesian Inference of a Multivariate Regression Model

    Directory of Open Access Journals (Sweden)

    Marick S. Sinay

    2014-01-01

    We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.

  5. TMVA - Toolkit for Multivariate Data Analysis with ROOT Users guide

    CERN Document Server

    Höcker, A; Tegenfeldt, F; Voss, H; Voss, K; Christov, A; Henrot-Versillé, S; Jachowski, M; Krasznahorkay, A; Mahalalel, Y; Prudent, X; Speckmayer, P

    2007-01-01

    Multivariate machine learning techniques for the classification of data from high-energy physics (HEP) experiments have become standard tools in most HEP analyses. The multivariate classifiers themselves have significantly evolved in recent years, also driven by developments in other areas inside and outside science. TMVA is a toolkit integrated in ROOT which hosts a large variety of multivariate classification algorithms. They range from rectangular cut optimisation (using a genetic algorithm) and likelihood estimators, over linear and non-linear discriminants (neural networks), to sophisticated recent developments like boosted decision trees and rule ensemble fitting. TMVA organises the simultaneous training, testing, and performance evaluation of all these classifiers with a user-friendly interface, and expedites the application of the trained classifiers to the analysis of data sets with unknown sample composition.

  6. Dissolution comparisons using a Multivariate Statistical Distance (MSD) test and a comparison of various approaches for calculating the measurements of dissolution profile comparison.

    Science.gov (United States)

    Cardot, J-M; Roudier, B; Schütz, H

    2017-07-01

    The f2 test is generally used for comparing dissolution profiles. In cases of high variability, the f2 test is not applicable, and the Multivariate Statistical Distance (MSD) test is frequently proposed as an alternative by the FDA and EMA. The guidelines provide only general recommendations. MSD tests can be performed either on raw data, with or without time as a variable, or on parameters of models. In addition, data can be limited (as in the case of the f2 test) to dissolutions of up to 85%, or all available data can be used. In the context of the present paper, the recommended calculation included all raw dissolution data up to the first point greater than 85% as a variable, without the various times as parameters. The proposed MSD overcomes several drawbacks found in other methods.
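
    For orientation, the f2 similarity factor referred to above can be sketched as follows. This assumes the usual regulatory definition, f2 = 50 · log10(100 / sqrt(1 + mean squared difference)), applied to mean percent-dissolved values at matching time points; the function name and the profiles are hypothetical, and the sketch does not implement the MSD test proposed in the paper.

```python
import math


def f2_similarity(reference, test):
    """Similarity factor f2 for two mean dissolution profiles.

    Both inputs are sequences of mean % dissolved at the same time points.
    Assumes the standard definition; selecting which time points to include
    (e.g. the 85% cut-off) is left to the caller.
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(reference)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Illustrative profiles only (not data from the paper).
ref = [20.0, 40.0, 60.0, 80.0]
shifted = [30.0, 50.0, 70.0, 90.0]  # uniform 10-point difference
print(round(f2_similarity(ref, ref), 2))
print(round(f2_similarity(ref, shifted), 2))
```

    Identical profiles give the maximum value of 100, while a uniform 10-point difference lands slightly below 50, the value conventionally taken as the lower bound for similarity.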

  7. Statistical Techniques Used in Three Applied Linguistics Journals: "Language Learning," "Applied Linguistics" and "TESOL Quarterly," 1980-1986: Implications for Readers and Researchers.

    Science.gov (United States)

    Teleni, Vicki; Baldauf, Richard B., Jr.

    A study investigated the statistical techniques used by applied linguists and reported in three journals, "Language Learning," "Applied Linguistics," and "TESOL Quarterly," between 1980 and 1986. It was found that 47% of the published articles used statistical procedures. In these articles, 63% of the techniques used could be called basic, 28%…

  8. Measures of dependence for multivariate Lévy distributions

    Science.gov (United States)

    Boland, J.; Hurd, T. R.; Pivato, M.; Seco, L.

    2001-02-01

    Recent statistical analysis of a number of financial databases is summarized. Increasing agreement is found that logarithmic equity returns show a certain type of asymptotic behavior of the largest events, namely that the probability density functions have power law tails with an exponent α≈3.0. This behavior does not vary much over different stock exchanges or over time, despite large variations in trading environments. The present paper proposes a class of multivariate distributions which generalizes the observed qualities of univariate time series. A new consequence of the proposed class is the "spectral measure" which completely characterizes the multivariate dependences of the extreme tails of the distribution. This measure on the unit sphere in M-dimensions, in principle completely general, can be determined empirically by looking at extreme events. If it can be observed and determined, it will prove to be of importance for scenario generation in portfolio risk management.

  9. Ischemic risk stratification by means of multivariate analysis of the heart rate variability

    International Nuclear Information System (INIS)

    Valencia, José F; Vallverdú, Montserrat; Caminal, Pere; Porta, Alberto; Voss, Andreas; Schroeder, Rico; Vázquez, Rafael; Bayés de Luna, Antonio

    2013-01-01

    In this work, a univariate and multivariate statistical analysis of indexes derived from heart rate variability (HRV) was conducted to stratify patients with ischemic dilated cardiomyopathy (IDC) into cardiac risk groups. Indexes based on conditional entropy, refined multiscale entropy (RMSE), detrended fluctuation analysis, and time and frequency analysis were applied to the RR interval series (beat-to-beat series) for single and multiscale complexity analysis of the HRV in IDC patients. Clinical parameters were also considered. Two different end-points after a follow-up of three years were considered: (i) analysis A, with 151 survivor patients as a low risk group and 13 patients that suffered sudden cardiac death as a high risk group; (ii) analysis B, with 192 survivor patients as a low risk group and 30 patients that suffered cardiac mortality as a high risk group. Univariate and multivariate linear discriminant analysis was used as the statistical technique for classifying patients into risk groups. Sensitivity (Sen) and specificity (Spe) were calculated as diagnostic criteria in order to evaluate the performance of the indexes and their linear combinations. Sen and Spe values of 80.0% and 72.9%, respectively, were obtained during daytime by combining one clinical parameter and one index from RMSE, and during nighttime Sen = 80% and Spe = 73.4% were attained by combining one clinical factor and two indexes from RMSE. In particular, relatively long time scales were more relevant for classifying patients into risk groups during nighttime, while during daytime shorter scales performed better. The results suggest that the left atrial size indexed to body surface, together with the RMSE indexes, allows enhanced classification of ischemic patients into their respective risk groups, confirming that a single measurement is not enough to fully characterize ischemic risk patients, as well as the clinical relevance of HRV complexity measures. (paper)
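
    The evaluation step described here, scoring patients with a linear combination of indexes and then computing sensitivity and specificity, can be sketched as below. The weights, bias, and patient values are invented for illustration only and are not taken from the study; a real linear discriminant analysis would fit the weights from data.

```python
def linear_score(features, weights, bias=0.0):
    """Linear combination of indexes; a positive score means 'high risk'."""
    return bias + sum(w * x for w, x in zip(weights, features))

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = high risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical patients: (clinical parameter, one entropy index), label.
patients = [
    ((2.0, 0.5), 1), ((1.8, 0.6), 1), ((1.0, 0.9), 1),
    ((1.1, 0.4), 0), ((0.9, 0.8), 0), ((0.8, 0.7), 0), ((1.6, 0.9), 0),
]
weights, bias = (1.0, -1.0), -0.5  # assumed values, not fitted

y_true = [label for _, label in patients]
y_pred = [1 if linear_score(f, weights, bias) > 0 else 0 for f, _ in patients]
sen, spe = sensitivity_specificity(y_true, y_pred)
print(f"Sen = {sen:.1%}, Spe = {spe:.1%}")
```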

  10. Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

    Science.gov (United States)

    McCray, Wilmon Wil L., Jr.

    The research was prompted by a need to conduct a study that assesses process improvement, quality management and analytical techniques taught to students in U.S. colleges and universities undergraduate and graduate systems engineering and the computing science discipline (e.g., software engineering, computer science, and information technology) degree programs during their academic training that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle processes needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI®) Models as process improvement frameworks to improve business process performance. High maturity process areas in the CMMI model imply the use of analytical, statistical, quantitative management techniques, and process performance modeling to identify and eliminate sources of variation, continually improve process performance, reduce cost and predict future outcomes. The research study identifies and provides a detailed discussion of the gap analysis findings of process improvement and quantitative analysis techniques taught in U.S. universities systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis which identifies the gaps that exist between the SEI's "healthy ingredients" of a process performance model and courses taught in U.S. universities degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization

  11. Basic principles of Hasse diagram technique in chemistry.

    Science.gov (United States)

    Brüggemann, Rainer; Voigt, Kristina

    2008-11-01

    Principles of partial order applied to ranking are explained. The Hasse diagram technique (HDT) is the application of partial order theory based on a data matrix. In this paper, HDT is introduced in a stepwise procedure, and some elementary theorems are exemplified. The focus is to show how the multivariate character of a data matrix is realized by HDT and in which cases one should apply other mathematical or statistical methods. Many simple examples illustrate the basic theoretical ideas. Finally, it is shown that HDT is a useful alternative for the evaluation of antifouling agents, which was originally performed by amoeba diagrams.
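
    The core idea, deriving a partial order from a data matrix and drawing only its cover relations, can be sketched as follows. This is a minimal illustration under the usual product-order assumption (object a lies below object b when a is less than or equal to b in every attribute); the function names and the four-object data matrix are hypothetical.

```python
def leq(a, b):
    """True if a <= b in every attribute (product order)."""
    return all(x <= y for x, y in zip(a, b))

def strict_order(data):
    """All pairs (a, b) with a strictly below b; identical rows are skipped."""
    return {
        (a, b)
        for a in data for b in data
        if a != b and data[a] != data[b] and leq(data[a], data[b])
    }

def hasse_edges(data):
    """Cover relations: a < b with no object c such that a < c < b."""
    less = strict_order(data)
    return sorted(
        (a, b) for (a, b) in less
        if not any((a, c) in less and (c, b) in less for c in data)
    )

def incomparable_pairs(data):
    """Pairs of distinct rows that the partial order cannot rank."""
    less = strict_order(data)
    names = sorted(data)
    return [
        (a, b) for i, a in enumerate(names) for b in names[i + 1:]
        if data[a] != data[b] and (a, b) not in less and (b, a) not in less
    ]

# Hypothetical data matrix: objects described by two attributes.
data = {"A": (1, 1), "B": (2, 1), "C": (1, 2), "D": (3, 3)}
print(hasse_edges(data))         # lines drawn in the Hasse diagram
print(incomparable_pairs(data))  # objects with conflicting attributes
```

    Incomparable pairs are what distinguish this approach from ranking by a single aggregated index: they mark objects whose attributes point in opposite directions.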

  12. Multivariate Statistical Process Optimization in the Industrial Production of Enzymes

    DEFF Research Database (Denmark)

    Klimkiewicz, Anna

    of product yield. The potential of NIR technology to monitor the activity of the enzyme has been the subject of a feasibility study presented in PAPER I. It included (a) evaluation on which of the two real-time NIR flow cell configurations is the preferred arrangement for monitoring of the retentate stream downstream...... strategies for the organization of these datasets, with varying number of timestamps, into data structures fit for latent variable (LV) modeling, have been compared. The ultimate aim of the data mining steps is the construction of statistical ‘soft models’ which capture the principle or latent behavior...

  13. Identification of Civil Engineering Structures using Multivariate ARMAV and RARMAV Models

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Andersen, P.; Brincker, Rune

    This paper presents how to make system identification of civil engineering structures using multivariate auto-regressive moving-average vector (ARMAV) models. Further, the ARMAV technique is extended to a recursive technique (RARMAV). The ARMAV model is used to identify measured stationary data....... The results show the usefulness of the approaches for identification of civil engineering structures excited by natural excitation...

  14. Search for the Higgs Boson in the $ZH\to\mu^+\mu^- b\bar{b}$ Channel at CDF Using Novel Multivariate Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Pilot, Justin R. [Ohio State U.

    2011-01-01

    We present a search for the Standard Model Higgs Boson using the process $ZH\to\mu^+\mu^- b\bar{b}$. We use a dataset corresponding to 9.2 fb$^{-1}$ of integrated luminosity from proton-antiproton collisions with center-of-mass energy 1.96 TeV at the Fermilab Tevatron, collected with the CDF II detector. This analysis benefits from several new multivariate techniques that have not been used in previous analyses at CDF. We use a multivariate function to select muon candidates, increasing signal acceptance while simultaneously keeping fake rates small. We employ an inclusive trigger selection to further increase acceptance. To enhance signal discrimination, we utilize a multi-layer approach consisting of expert discriminants. This multi-layer discriminant method helps isolate the two main classes of background events, $t\bar{t}$ and $Z$+jets production. It also includes a flavor separator, to distinguish light flavor jets from jets consistent with the decay of a $B$-hadron. With this novel multi-layer approach, we proceed to set limits on the $ZH$ production cross section times branching ratio. For a Higgs boson with mass 115 GeV/$c^2$, we observe (expect) a limit of 8.0 (4.9) times the Standard Model prediction.

  15. The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples

    Science.gov (United States)

    Avetisyan, Marianna; Fox, Jean-Paul

    2012-01-01

    In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…

  16. GIS-based bivariate statistical techniques for groundwater potential analysis (an example of Iran)

    Science.gov (United States)

    Haghizadeh, Ali; Moghaddam, Davoud Davoudi; Pourghasemi, Hamid Reza

    2017-12-01

    Groundwater potential analysis provides a better understanding of the hydrological settings of different regions. This study shows the capability of two GIS-based, data-driven bivariate techniques, namely statistical index (SI) and Dempster-Shafer theory (DST), to analyze groundwater potential in the Broujerd region of Iran. The research was done using 11 groundwater conditioning factors and 496 spring positions. Based on the groundwater potential maps (GPMs) of the SI and DST methods, 24.22% and 23.74% of the study area is covered by the poor zone of groundwater potential, and 43.93% and 36.3% of the Broujerd region is covered by good and very good potential zones, respectively. The validation of outcomes showed that the area under the curve (AUC) of the SI and DST techniques is 81.23% and 79.41%, respectively, which shows that the SI method performs slightly better than the DST technique. Therefore, SI and DST methods are advantageous for analyzing groundwater capacity and scrutinizing the complicated relation between groundwater occurrence and groundwater conditioning factors, which permits investigation of both systemic and stochastic uncertainty. Finally, it can be concluded that these techniques are very beneficial for groundwater potential analysis and can be practical for water-resource management experts.
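
    An AUC of the kind used for validation here can be computed directly from model scores as the probability that a randomly chosen positive location outranks a randomly chosen negative one. The sketch below is a generic rank-based calculation, not the study's GIS workflow; the function name and the potential-index values for spring and non-spring cells are hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical groundwater potential index at spring and non-spring cells.
spring_cells = [0.9, 0.8, 0.4]
non_spring_cells = [0.7, 0.3, 0.2]
auc = auc_from_scores(spring_cells, non_spring_cells)
print(f"AUC = {auc:.2%}")
```

    The pairwise loop is quadratic in the number of cells, which is fine for a few hundred springs; a sort-based rank sum would be preferable for large rasters.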

  17. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    Science.gov (United States)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP

  18. Connection between perturbation theory, projection-operator techniques, and statistical linearization for nonlinear systems

    International Nuclear Information System (INIS)

    Budgor, A.B.; West, B.J.

    1978-01-01

    We employ the equivalence between Zwanzig's projection-operator formalism and perturbation theory to demonstrate that the approximate-solution technique of statistical linearization for nonlinear stochastic differential equations corresponds to the lowest-order β truncation in both the consolidated perturbation expansions and in the "mass operator" of a renormalized Green's function equation. Other consolidated equations can be obtained by selectively modifying this mass operator. We particularize the results of this paper to the Duffing anharmonic oscillator equation.

  19. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit; Genton, Marc G.

    2017-01-01

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.

  20. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit

    2017-04-05

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.