Institute of Scientific and Technical Information of China (English)
Xiu-rui GENG; Lu-yan JI; Kang SUN
2016-01-01
Non-negative matrix factorization (NMF) has been widely used in mixture analysis for hyperspectral remote sensing. When used for spectral unmixing analysis, however, it has two main shortcomings: (1) since the dimensionality of hyperspectral data is usually very large, NMF tends to suffer from large computational complexity for the popular multiplicative iteration rule; (2) NMF is sensitive to noise (outliers), and thus the corrupted data will make the results of NMF meaningless. Although principal component analysis (PCA) can be used to mitigate these two problems, the transformed data will contain negative numbers, hindering the direct use of the multiplicative iteration rule of NMF. In this paper, we analyze the impact of PCA on NMF, and find that multiplicative NMF can also be applicable to data after principal component transformation. Based on this conclusion, we present a method to perform NMF in the principal component space, named ‘principal component NMF’ (PCNMF). Experimental results show that PCNMF is both accurate and time-saving.
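For readers unfamiliar with the multiplicative iteration rule mentioned in this abstract, the following is a minimal sketch of the standard Lee-Seung updates for a Frobenius-norm objective on non-negative data. It is not the PCNMF method itself; the function name `nmf_multiplicative`, its parameters, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V by W @ H using the standard
    multiplicative updates for the Frobenius-norm objective.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps every update well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled element-wise, so non-negativity is preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because every update is an element-wise rescaling by a ratio of non-negative quantities, the rule relies on non-negative input, which is why negative values produced by a PCA transform are the obstacle the abstract describes.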
Bro, R.; Smilde, A.K.
2014-01-01
Principal component analysis is one of the most important and powerful methods in chemometrics as well as in a wealth of other areas. This paper provides a description of how to understand, use, and interpret principal component analysis. The paper focuses on the use of principal component analysis
Robust Principal Component Analysis?
Candes, Emmanuel J; Ma, Yi; Wright, John
2009-01-01
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for th...
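The convex program described in this abstract can be written compactly as follows. The symbols are assumptions chosen for illustration: $M$ is the observed data matrix, $L$ the low-rank component, $S$ the sparse component, $\|\cdot\|_*$ the nuclear norm (sum of singular values), $\|\cdot\|_1$ the entry-wise L1 norm, and $\lambda > 0$ the weight balancing the two terms.

```latex
\begin{aligned}
\min_{L,\,S} \quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to} \quad & L + S = M
\end{aligned}
```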
Stable Principal Component Pursuit
Zhou, Zihan; Wright, John; Candes, Emmanuel; Ma, Yi
2010-01-01
In this paper, we study the problem of recovering a low-rank matrix (the principal components) from a high-dimensional data matrix despite both small entry-wise noise and gross sparse errors. Recently, it has been shown that a convex program, named Principal Component Pursuit (PCP), can recover the low-rank matrix when the data matrix is corrupted by gross sparse errors. We further prove that the solution to a related convex program (a relaxed PCP) gives an estimate of the low-rank matrix that is simultaneously stable to small entrywise noise and robust to gross sparse errors. More precisely, our result shows that the proposed convex program recovers the low-rank matrix even though a positive fraction of its entries are arbitrarily corrupted, with an error bound proportional to the noise level. We present simulation results to support our result and demonstrate that the new convex program accurately recovers the principal components (the low-rank matrix) under quite broad conditions. To our knowledge, this is...
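A relaxed program of the kind described in this abstract replaces the exact equality constraint of Principal Component Pursuit with a bound on the residual. In the sketch below the symbols are assumptions for illustration: $M$ is the observed matrix, $L$ the low-rank estimate, $S$ the sparse estimate, $\lambda > 0$ a weight, and $\delta \ge 0$ a bound on the Frobenius norm of the dense entry-wise noise.

```latex
\begin{aligned}
\min_{L,\,S} \quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to} \quad & \|M - L - S\|_{F} \le \delta
\end{aligned}
```

Setting $\delta = 0$ recovers the equality-constrained program, which matches the abstract's statement that the error bound scales with the noise level.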
Recursive principal components analysis.
Voegtlin, Thomas
2005-10-01
A recurrent linear network can be trained with Oja's constrained Hebbian learning rule. As a result, the network learns to represent the temporal context associated with its input sequence. The operation performed by the network is a generalization of Principal Components Analysis (PCA) to time-series, called Recursive PCA. The representations learned by the network are adapted to the temporal statistics of the input. Moreover, sequences stored in the network may be retrieved explicitly, in the reverse order of presentation, thus providing a straightforward neural implementation of a logical stack.
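As background for the learning rule named in this abstract, here is a minimal sketch of the classical single-unit Oja rule applied to static, centred data. It is the plain feed-forward version, not the recurrent network of the paper; the function name `oja_first_component`, the learning rate, and the epoch count are illustrative assumptions.

```python
import numpy as np

def oja_first_component(X, lr=0.005, n_epochs=20, seed=0):
    """Estimate the first principal direction of centred data X
    (rows are samples) with Oja's Hebbian rule.

    Hypothetical helper for illustration; not the recurrent model.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_epochs):
        for x in X:
            y = w @ x                   # output of the linear unit
            w += lr * y * (x - y * w)   # Hebbian term with Oja's decay term
    return w / np.linalg.norm(w)
```

The decay term `- y * y * w` keeps the weight norm close to one without an explicit normalisation step inside the loop, which is what makes the rule "constrained".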
Compressive Principal Component Pursuit
Wright, John; Min, Kerui; Ma, Yi
2012-01-01
We consider the problem of recovering a target matrix that is a superposition of low-rank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured high-dimensional signals such as videos and hyperspectral images, as well as in the analysis of transformation invariant low-rank recovery. We analyze the performance of the natural convex heuristic for solving this problem, under the assumption that measurements are chosen uniformly at random. We prove that this heuristic exactly recovers low-rank and sparse terms, provided the number of observations exceeds the number of intrinsic degrees of freedom of the component signals by a polylogarithmic factor. Our analysis introduces several ideas that may be of independent interest for the more general problem of compressed sensing and decomposing superpositions of multiple structured signals.
Fast Steerable Principal Component Analysis
Zhao, Zhizhen; Shkolnisky, Yoel; Singer, Amit
2016-01-01
Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2D images as large as a few hundred pixels in each direction. Here we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of two-dimensional images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of $n$ images of size $L \times L$ pixels, the computational complexity of our a...
Parametric functional principal component analysis.
Sang, Peijun; Wang, Liangliang; Cao, Jiguo
2017-03-10
Functional principal component analysis (FPCA) is a popular approach in functional data analysis to explore major sources of variation in a sample of random curves. These major sources of variation are represented by functional principal components (FPCs). Most existing FPCA approaches use a set of flexible basis functions such as B-spline basis to represent the FPCs, and control the smoothness of the FPCs by adding roughness penalties. However, the flexible representations pose difficulties for users to understand and interpret the FPCs. In this article, we consider a variety of applications of FPCA and find that, in many situations, the shapes of top FPCs are simple enough to be approximated using simple parametric functions. We propose a parametric approach to estimate the top FPCs to enhance their interpretability for users. Our parametric approach can also circumvent the smoothing parameter selecting process in conventional nonparametric FPCA methods. In addition, our simulation study shows that the proposed parametric FPCA is more robust when outlier curves exist. The parametric FPCA method is demonstrated by analyzing several datasets from a variety of applications. © 2017, The International Biometric Society.
Interpretable functional principal component analysis.
Lin, Zhenhua; Wang, Liangliang; Cao, Jiguo
2016-09-01
Functional principal component analysis (FPCA) is a popular approach to explore major sources of variation in a sample of random curves. These major sources of variation are represented by functional principal components (FPCs). The intervals where the values of FPCs are significant are interpreted as where sample curves have major variations. However, these intervals are often hard for naïve users to identify, because of the vague definition of "significant values". In this article, we develop a novel penalty-based method to derive FPCs that are only nonzero precisely in the intervals where the values of FPCs are significant, whence the derived FPCs possess better interpretability than the FPCs derived from existing methods. To compute the proposed FPCs, we devise an efficient algorithm based on projection deflation techniques. We show that the proposed interpretable FPCs are strongly consistent and asymptotically normal under mild conditions. Simulation studies confirm that with a competitive performance in explaining variations of sample curves, the proposed FPCs are more interpretable than the traditional counterparts. This advantage is demonstrated by analyzing two real datasets, namely, electroencephalography data and Canadian weather data.
Fast Steerable Principal Component Analysis.
Zhao, Zhizhen; Shkolnisky, Yoel; Singer, Amit
2016-03-01
Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2-D images as large as a few hundred pixels in each direction. Here, we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of 2-D images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of n images of size L × L pixels, the computational complexity of our algorithm is O(nL^3 + L^4), while existing algorithms take O(nL^4). The new algorithm computes the expansion coefficients of the images in a Fourier-Bessel basis efficiently using the nonuniform fast Fourier transform. We compare the accuracy and efficiency of the new algorithm with traditional PCA and existing algorithms for steerable PCA.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and the method for determining the 'best' equation. It uses an example to show how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression can be used to overcome the disturbance caused by multicollinearity. Performing the analysis with SPSS makes it simpler, faster, and accurate.
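The basic principle of principal component regression can be sketched outside SPSS as well. The following NumPy example is an assumption-laden illustration rather than the paper's procedure: it centres the predictors, keeps the leading principal directions, regresses the response on the resulting scores, and maps the coefficients back to the original variables. The function name `pcr_fit` and its parameters are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression via SVD.

    Returns coefficients for the original predictors and an intercept.
    Hypothetical helper for illustration.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centred predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T
    scores = Xc @ V_k                       # uncorrelated component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V_k @ gamma                      # back to the original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

Dropping the trailing components (choosing `n_components` smaller than the number of predictors) discards the low-variance directions that make ordinary least squares unstable under multicollinearity; with all components retained and full-rank predictors, the fit coincides with ordinary least squares.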
Sparse Principal Component Analysis with missing observations
Lounici, Karim
2012-01-01
In this paper, we study the problem of sparse Principal Component Analysis (PCA) in the high-dimensional setting with missing observations. Our goal is to estimate the first principal component when we only have access to partial observations. Existing estimation techniques are usually derived for fully observed data sets and require prior knowledge of the sparsity of the first principal component in order to achieve good statistical guarantees. Our contribution is threefold. First, we establish the first information-theoretic lower bound for the sparse PCA problem with missing observations. Second, we propose a simple procedure that does not require any prior knowledge on the sparsity of the unknown first principal component or any imputation of the missing observations, adapts to the unknown sparsity of the first principal component and achieves the optimal rate of estimation up to a logarithmic factor. Third, if the covariance matrix of interest admits a sparse first principal component and is in additi...
Ma, Yehao; Li, Xian; Huang, Pingjie; Hou, Dibo; Wang, Qiang; Zhang, Guangxin
2017-04-01
In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification.
Permutation Tests in Principal Component Analysis.
Pohlmann, John T.; Perkins, Kyle; Brutten, Shelia
Structural changes in an English as a Second Language (ESL) 30-item reading comprehension test were examined through principal components analysis on a small sample (n=31) of students. Tests were administered on three occasions during intensive ESL training. Principal components analysis of the items was performed for each test occasion.…
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
COPD phenotype description using principal components analysis
DEFF Research Database (Denmark)
Roy, Kay; Smith, Jacky; Kolsum, Umme
2009-01-01
BACKGROUND: Airway inflammation in COPD can be measured using biomarkers such as induced sputum and Fe(NO). This study set out to explore the heterogeneity of COPD using biomarkers of airway and systemic inflammation and pulmonary function by principal components analysis (PCA). SUBJECTS...... AND METHODS: In 127 COPD patients (mean FEV1 61%), pulmonary function, Fe(NO), plasma CRP and TNF-alpha, sputum differential cell counts and sputum IL8 (pg/ml) were measured. Principal components analysis as well as multivariate analysis was performed. RESULTS: PCA identified four main components (% variance...... associations between the variables within components 1 and 2. CONCLUSION: COPD is a multidimensional disease. Unrelated components of disease were identified, including neutrophilic airway inflammation which was associated with systemic inflammation, and sputum eosinophils which were related to increased Fe...
Outlier Mining Based on Principal Component Estimation
Institute of Scientific and Technical Information of China (English)
Hu Yang; Ting Yang
2005-01-01
Outlier mining is an important aspect of data mining, and outlier mining based on the Cook distance is the most commonly used approach. However, when the data exhibit multicollinearity, the traditional Cook method is no longer effective. Given the advantages of principal component estimation, we use it in place of least squares estimation and derive a Cook distance measure based on principal component estimation, which can be used in outlier mining. We also study related theory and application problems.
Principal Component Analysis in ECG Signal Processing
Directory of Open Access Journals (Sweden)
Andreas Bollmann
2007-01-01
This paper reviews the current status of principal component analysis in the area of ECG signal processing. The fundamentals of PCA are briefly described and the relationship between PCA and the Karhunen-Loève transform is explained. Aspects of PCA related to data with temporal and spatial correlations are considered, as is adaptive estimation of principal components. Several ECG applications are reviewed where PCA techniques have been successfully employed, including data compression, ST-T segment analysis for the detection of myocardial ischemia and abnormalities in ventricular repolarization, extraction of atrial fibrillatory waves for detailed characterization of atrial fibrillation, and analysis of body surface potential maps.
EEG source imaging with spatio-temporal tomographic nonnegative independent component analysis.
Valdés-Sosa, Pedro A; Vega-Hernández, Mayrim; Sánchez-Bornot, José Miguel; Martínez-Montes, Eduardo; Bobes, María Antonieta
2009-06-01
This article describes a spatio-temporal EEG/MEG source imaging (ESI) method that extracts a parsimonious set of "atoms" or components, each the outer product of a spatial and a temporal signature. The estimated sources are localized as smooth, minimally overlapping patches of cortical activation that are obtained by constraining spatial signatures to be nonnegative (NN), orthogonal, sparse, and smooth, in effect integrating ESI with NN-ICA. This constitutes a generalization of work by this group on the use of multiple penalties for ESI. A multiplicative update algorithm is derived that is stable and fast, converging within seconds to near the optimal solution. This procedure, spatio-temporal tomographic NN ICA (STTONNICA), is equally able to recover superficial or deep sources without additional weighting constraints, as tested with simulations. STTONNICA analysis of ERPs to familiar and unfamiliar faces yields an occipital-fusiform atom activated by all faces and a more frontal atom that is active only with familiar faces. The temporal signatures are at present unconstrained but can be required to be smooth, complex, or to follow a multivariate autoregressive model.
Stochastic convex sparse principal component analysis.
Baytas, Inci M; Lin, Kaixiang; Wang, Fei; Jain, Anil K; Zhou, Jiayu
2016-12-01
Principal component analysis (PCA) is a dimensionality reduction and data analysis tool commonly used in many areas. The main idea of PCA is to represent high-dimensional data with a few representative components that capture most of the variance present in the data. However, there is an obvious disadvantage of traditional PCA when it is applied to analyze data where interpretability is important. In applications, where the features have some physical meanings, we lose the ability to interpret the principal components extracted by conventional PCA because each principal component is a linear combination of all the original features. For this reason, sparse PCA has been proposed to improve the interpretability of traditional PCA by introducing sparsity to the loading vectors of principal components. The sparse PCA can be formulated as an ℓ1 regularized optimization problem, which can be solved by proximal gradient methods. However, these methods do not scale well because computation of the exact gradient is generally required at each iteration. Stochastic gradient framework addresses this challenge by computing an expected gradient at each iteration. Nevertheless, stochastic approaches typically have low convergence rates due to the high variance. In this paper, we propose a convex sparse principal component analysis (Cvx-SPCA), which leverages a proximal variance reduced stochastic scheme to achieve a geometric convergence rate. We further show that the convergence analysis can be significantly simplified by using a weak condition which allows a broader class of objectives to be applied. The efficiency and effectiveness of the proposed method are demonstrated on a large-scale electronic medical record cohort.
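The proximal gradient methods mentioned in this abstract handle the ℓ1 penalty through its proximal operator, which is element-wise soft-thresholding. The sketch below shows that operator and one generic proximal gradient step. It is not the Cvx-SPCA algorithm; the function names and the parameters `step` and `lam` are illustrative assumptions, and the gradient of the smooth part of the objective is supplied by the caller.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1: shrink each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient_step(w, grad, step, lam):
    """One proximal gradient update for smooth_loss(w) + lam * ||w||_1.

    `grad` is the gradient of the smooth part at w (hypothetical input).
    """
    return soft_threshold(w - step * grad, step * lam)
```

Entries whose magnitude falls below the threshold are set exactly to zero, which is how the ℓ1 penalty produces sparse loading vectors. A stochastic variant would replace `grad` with an estimate computed from a subset of the data.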
Principal component analysis implementation in Java
Wójtowicz, Sebastian; Belka, Radosław; Sławiński, Tomasz; Parian, Mahnaz
2015-09-01
In this paper we show how the PCA (Principal Component Analysis) method can be implemented using the Java programming language. We consider using the PCA algorithm especially for analysing data obtained from Raman spectroscopy measurements, but other applications of the developed software should also be possible. Our goal is to create a general-purpose PCA application, ready to run on every platform supported by Java.
Principal component analysis of phenolic acid spectra
Phenolic acids are common plant metabolites that exhibit bioactive properties and have applications in functional food and animal feed formulations. The ultraviolet (UV) and infrared (IR) spectra of four closely related phenolic acid structures were evaluated by principal component analysis (PCA) to...
Principal component analysis of psoriasis lesions images
DEFF Research Database (Denmark)
Maletti, Gabriela Mariel; Ersbøll, Bjarne Kjær
2003-01-01
A set of RGB images of psoriasis lesions is used. By visual examination of these images, there seems to be no common pattern that could be used to find and align the lesions within and between sessions. It is expected that the principal components of the original images could be useful during future...
Principal component analysis of symmetric fuzzy data
Giordani, Paolo; Kiers, Henk A.L.
2004-01-01
Principal Component Analysis (PCA) is a well-known tool often used for the exploratory analysis of a numerical data set. Here an extension of classical PCA is proposed, which deals with fuzzy data (in short PCAF), where the elementary datum cannot be recognized exactly by a specific number but by a
Probabilistic Principal Component Analysis for Metabolomic Data.
LENUS (Irish Health Repository)
Nyamundanda, Gift
2010-11-23
Background: Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model. Results: Here, probabilistic principal component analysis (PPCA), which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data. Conclusions: The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight into the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
Real-Time Principal-Component Analysis
Duong, Vu; Duong, Tuan
2005-01-01
A recently written computer program implements dominant-element-based gradient descent and dynamic initial learning rate (DOGEDYN), which was described in Method of Real-Time Principal-Component Analysis (NPO-40034) NASA Tech Briefs, Vol. 29, No. 1 (January 2005), page 59. To recapitulate: DOGEDYN is a method of sequential principal-component analysis (PCA) suitable for such applications as data compression and extraction of features from sets of data. In DOGEDYN, input data are represented as a sequence of vectors acquired at sampling times. The learning algorithm in DOGEDYN involves sequential extraction of principal vectors by means of a gradient descent in which only the dominant element is used at each iteration. Each iteration includes updating of elements of a weight matrix by amounts proportional to a dynamic initial learning rate chosen to increase the rate of convergence by compensating for the energy lost through the previous extraction of principal components. In comparison with a prior method of gradient-descent-based sequential PCA, DOGEDYN involves less computation and offers a greater rate of learning convergence. The sequential DOGEDYN computations require less memory than would parallel computations for the same purpose. The DOGEDYN software can be executed on a personal computer.
Principal component analysis for authorship attribution
Directory of Open Access Journals (Sweden)
Amir Jamak
2012-01-01
Background: To recognize the authors of texts using statistical tools, one first needs to decide which features to use as author characteristics and then extract these features from the texts. The features extracted from texts are mostly the counts of so-called function words. Objectives: The extracted data are processed further and compressed into data with fewer features, in such a way that the compressed data still retain effective discriminating power. In this case the feature space has lower dimensionality than the text itself. Methods/Approach: In this paper, the data collected by counting words and characters in around a thousand paragraphs of each sample book underwent a principal component analysis performed using neural networks. Once the analysis was complete, the first principal component was used to distinguish the books written by a given author. Results: The achieved results show that every author leaves a unique signature in written text that can be discovered by analyzing counts of short words per paragraph. Conclusions: In this article we have demonstrated that authorship can be traced using principal component analysis of counts of short words per paragraph. The methodology could be used for other purposes, such as fraud detection in auditing.
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Boosting Principal Component Analysis by Genetic Algorithm
Directory of Open Access Journals (Sweden)
Divya Somvanshi
2010-07-01
This paper presents a new method of feature extraction by combining principal component analysis and a genetic algorithm. Use of multiple pre-processors in combination with principal component analysis generates alternate feature spaces for data representation. The present method works out the fusion of these multiple spaces to create higher dimensionality feature vectors. The fused feature vectors are given a chromosome representation by taking feature components to be genes. These feature vectors are then allowed to undergo genetic evolution individually. For the genetic algorithm, the initial population is created by calculating a probability distance matrix and applying a probability distance metric such that all the genes which lie farther than a defined threshold are tripped to zero. The genetic evolution of the fused feature vector brings out the most significant feature components (genes) as survivors. A measure of significance is adapted on the basis of the frequency of occurrence of the surviving genes in the current population. Finally, the feature vector is obtained by weighting the original feature components in proportion to their significance. The present algorithm is validated in combination with a neural network classifier based on the error backpropagation algorithm, and by analysing a number of benchmark datasets available in open sources. Defence Science Journal, 2010, 60(4), pp. 392-398, DOI: http://dx.doi.org/10.14429/dsj.60.495
Radar fall detection using principal component analysis
Jokanovic, Branka; Amin, Moeness; Ahmad, Fauzia; Boashash, Boualem
2016-05-01
Falls are a major cause of fatal and nonfatal injuries in people aged 65 years and older. Radar has the potential to become one of the leading technologies for fall detection, thereby enabling the elderly to live independently. Existing techniques for fall detection using radar are based on manual feature extraction and require significant parameter tuning in order to provide successful detections. In this paper, we employ principal component analysis for fall detection, wherein eigen images of observed motions are employed for classification. Using real data, we demonstrate that the PCA based technique provides performance improvement over the conventional feature extraction methods.
Principal components analysis of Jupiter VIMS spectra
Bellucci, G.; Formisano, V.; D'Aversa, E.; Brown, R.H.; Baines, K.H.; Bibring, J.-P.; Buratti, B.J.; Capaccioni, F.; Cerroni, P.; Clark, R.N.; Coradini, A.; Cruikshank, D.P.; Drossart, P.; Jaumann, R.; Langevin, Y.; Matson, D.L.; McCord, T.B.; Mennella, V.; Nelson, R.M.; Nicholson, P.D.; Sicardy, B.; Sotin, Christophe; Chamberlain, M.C.; Hansen, G.; Hibbits, K.; Showalter, M.; Filacchione, G.
2004-01-01
During the Cassini Jupiter flyby in December 2000, the Visual-Infrared Mapping Spectrometer (VIMS) instrument took several image cubes of Jupiter at different phase angles and distances. We have analysed the spectral images acquired by the VIMS visual channel by means of a principal component analysis (PCA) technique. The original data set consists of 96 spectral images in the 0.35-1.05 μm wavelength range. The products of the analysis are new PC bands, which contain all the spectral variance of the original data. These new components have been used to produce a map of Jupiter made of seven coherent spectral classes. The map confirms previously published work done on the Great Red Spot by using NIMS data. Some other new findings, presently under investigation, are presented. © 2004 Published by Elsevier Ltd on behalf of COSPAR.
Principal Component Pursuit with Reduced Linear Measurements
Ganesh, Arvind; Wright, John; Ma, Yi
2012-01-01
In this paper, we study the problem of decomposing a superposition of a low-rank matrix and a sparse matrix when relatively few linear measurements are available. This problem arises in many data processing tasks such as aligning multiple images or rectifying regular texture, where the goal is to recover a low-rank matrix with a large fraction of corrupted entries in the presence of nonlinear domain transformation. We consider a natural convex heuristic for this problem, which is a variant of the recently proposed Principal Component Pursuit. We prove that under suitable conditions, this convex program is guaranteed to recover the correct low-rank and sparse components despite reduced measurements. Our analysis covers both random and deterministic measurement models.
Nonlinear principal component analysis and its applications
Mori, Yuichi; Makino, Naomichi
2016-01-01
This book expounds the principle and related applications of nonlinear principal component analysis (PCA), which is a useful method for analyzing data with mixed measurement levels. In the part dealing with the principle, after a brief introduction of ordinary PCA, a PCA for categorical data (nominal and ordinal) is introduced as nonlinear PCA, in which an optimal scaling technique is used to quantify the categorical variables. Alternating least squares (ALS) is the main algorithm in the method. Multiple correspondence analysis (MCA), a special case of nonlinear PCA, is also introduced. All formulations in these methods are integrated in the same manner as matrix operations. Because data of any measurement level can be treated consistently as numerical data and ALS is a very powerful tool for estimation, the methods can be utilized in a variety of fields such as biometrics, econometrics, psychometrics, and sociology. In the applications part of the book, four applications are introduced: variable selection for mixed...
Face Recognition Based on Principal Component Analysis
Directory of Open Access Journals (Sweden)
Ali Javed
2013-02-01
The purpose of the proposed research work is to develop a computer system that can recognize a person by comparing the characteristics of the face to those of known individuals. The main focus is on frontal two-dimensional images that are taken in a controlled environment, i.e. the illumination and the background will be constant. Other methods of identifying and verifying a person, such as iris scans or fingerprint scans, require high-quality and costly equipment, whereas face recognition requires only a normal camera giving a 2-D frontal image of the person to be recognized. The Principal Component Analysis technique has been used in the proposed face recognition system. The purpose is to compare the results of the technique under different conditions and to find the most efficient approach for developing a facial recognition system.
Principal Components Analysis In Medical Imaging
Weaver, J. B.; Huddleston, A. L.
1986-06-01
Principal components analysis, PCA, is basically a data reduction technique. PCA has been used in several problems in diagnostic radiology: processing radioisotope brain scans (Ref.1), automatic alignment of radionuclide images (Ref. 2), processing MRI images (Ref. 3,4), analyzing first-pass cardiac studies (Ref. 5) correcting for attenuation in bone mineral measurements (Ref. 6) and in dual energy x-ray imaging (Ref. 6,7). This paper will progress as follows; a brief introduction to the mathematics of PCA will be followed by two brief examples of how PCA has been used in the literature. Finally my own experience with PCA in dual-energy x-ray imaging will be given.
Integrating Data Transformation in Principal Components Analysis
Maadooliat, Mehdi
2015-01-02
Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples.
Principal components analysis of population admixture.
Directory of Open Access Journals (Sweden)
Jianzhong Ma
With the availability of high-density genotype information, principal components analysis (PCA) is now routinely used to detect and quantify the genetic structure of populations in both population genetics and genetic epidemiology. An important issue is how to make appropriate and correct inferences about population relationships from the results of PCA, especially when admixed individuals are included in the analysis. We extend our recently developed theoretical formulation of PCA to allow for admixed populations. Because the sampled individuals are treated as features, our generalized formulation of PCA directly relates the pattern of the scatter plot of the top eigenvectors to the admixture proportions and parameters reflecting the population relationships, and thus can provide valuable guidance on how to properly interpret the results of PCA in practice. Using our formulation, we theoretically justify the diagnostic of two-way admixture. More importantly, our theoretical investigations based on the proposed formulation yield a diagnostic of multi-way admixture. For instance, we found that admixed individuals with three parental populations are distributed inside the triangle formed by their parental populations and divide the triangle into three smaller triangles whose areas have the same proportions in the big triangle as the corresponding admixture proportions. We tested and illustrated these findings using simulated data and data from HapMap III and the Human Genome Diversity Project.
Mapping ash properties using principal components analysis
Pereira, Paulo; Brevik, Eric; Cerda, Artemi; Ubeda, Xavier; Novara, Agata; Francos, Marcos; Rodrigo-Comino, Jesus; Bogunovic, Igor; Khaledian, Yones
2017-04-01
In post-fire environments, ash has important benefits for soils, such as protection and a source of nutrients, crucial for vegetation recuperation (Jordan et al., 2016; Pereira et al., 2015a; 2016a,b). The thickness and distribution of ash are fundamental aspects for soil protection (Cerdà and Doerr, 2008; Pereira et al., 2015b), and the severity at which it was produced is important for the type and amount of elements released into the soil solution (Bodi et al., 2014). Ash is a very mobile material, and it is important where it will be deposited. Until the first rainfalls it is very mobile; afterwards it binds to the soil surface and is harder to erode. Mapping ash properties in the immediate period after fire is complex, since the ash is constantly moving (Pereira et al., 2015b). However, it is an important task, since from the amount and type of ash produced we can identify the degree of soil protection and the nutrients that will be dissolved. The objective of this work is to map ash properties (CaCO3, pH, and selected extractable elements) using a principal component analysis (PCA) in the immediate period after the fire. Four days after the fire we established a grid in a 9x27 m area and took ash samples every 3 meters for a total of 40 sampling points (Pereira et al., 2017). The PCA identified 5 different factors. Factor 1 had high loadings in electrical conductivity, calcium, and magnesium and negative loadings in aluminum and iron, while Factor 3 had high positive loadings in total phosphorous and silica. Factor 3 showed high positive loadings in sodium and potassium, factor 4 high negative loadings in CaCO3 and pH, and factor 5 high loadings in sodium and potassium. The experimental variograms of the extracted factors showed that the Gaussian model was the most precise to model factor 1, the linear model to model factor 2, and the wave hole effect model to model factors 3, 4 and 5. The maps produced confirm the patterns observed in the experimental variograms. Factor 1 and 2
Incremental Tensor Principal Component Analysis for Handwritten Digit Recognition
Directory of Open Access Journals (Sweden)
Chang Liu
2014-01-01
To overcome the shortcomings of traditional dimensionality reduction algorithms, an incremental tensor principal component analysis (ITPCA) algorithm based on the updated-SVD technique is proposed in this paper. This paper proves the relationship between PCA, 2DPCA, MPCA, and the graph embedding framework theoretically and derives the incremental learning procedure to add single sample and multiple samples in detail. The experiments on handwritten digit recognition have demonstrated that ITPCA has achieved better recognition performance than that of vector-based principal component analysis (PCA), incremental principal component analysis (IPCA), and multilinear principal component analysis (MPCA) algorithms. At the same time, ITPCA also has lower time and space complexity.
FUZZY PRINCIPAL COMPONENT ANALYSIS AND ITS KERNEL BASED MODEL
Institute of Scientific and Technical Information of China (English)
(no author listed)
2007-01-01
Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input datum may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension model, i.e., Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
(NDSI) and Normalised Difference Principal Component Snow Index
African Journals Online (AJOL)
Phila Sibandze
Snow is a common global meteorological phenomenon known to be a critical component .... This procedure generated eight spectrally independent principal components. ... therefore chosen for calculation of the NDPCSI as they showed high ...
A principal component analysis of transmission spectra of wine distillates
Rogovaya, M. V.; Sinitsyn, G. V.; Khodasevich, M. A.
2014-11-01
A chemometric method of decomposing multidimensional data into a small-sized space, the principal component method, has been applied to the transmission spectra of vintage Moldovan wine distillates. A sample of 42 distillates aged from four to seven years from six producers has been used to show the possibility of identifying a producer in a two-dimensional space of principal components describing 94.5% of the data-matrix dispersion. Analysis of the loadings on the first two principal components has shown that, in order to measure the optical characteristics of the samples under study using only two wavelengths, it is necessary to select 380 and 540 nm, instead of the standard 420 and 520 nm, to describe the variability of the distillates by one principal component, or 370 and 520 nm to describe the variability by two principal components.
Monte Carlo Algorithm for Least Dependent Non-Negative Mixture Decomposition
Astakhov, Sergey A.; Stögbauer, Harald; Kraskov, Alexander; Grassberger, Peter
2006-01-01
We propose a simulated annealing algorithm (called SNICA for "stochastic non-negative independent component analysis") for blind decomposition of linear mixtures of non-negative sources with non-negative coefficients. The de-mixing is based on a Metropolis type Monte Carlo search for least dependent components, with the mutual information between recovered components as a cost function and their non-negativity as a hard constraint. Elementary moves are shears in two-dimensional subspaces and rotations in three-dimensional subspaces. The algorithm is geared at decomposing signals whose probability densities peak at zero, the case typical in analytical spectroscopy and multivariate curve resolution. The decomposition performance on large samples of synthetic mixtures and experimental data is much better than that of traditional blind source separation methods based on principal component analysis (MILCA, FastICA, RADICAL) and chemometrics techniques (SIMPLISMA, ALS, BTEM). The source codes of SNICA, MILCA and th...
Dong, Fengxia; Mitchell, Paul D; Colquhoun, Jed
2015-01-01
Measuring farm sustainability performance is a crucial component for improving agricultural sustainability. While extensive assessments and indicators exist that reflect the different facets of agricultural sustainability, because of the relatively large number of measures and interactions among them, a composite indicator that integrates and aggregates over all variables is particularly useful. This paper describes and empirically evaluates a method for constructing a composite sustainability indicator that individually scores and ranks farm sustainability performance. The method first uses non-negative polychoric principal component analysis to reduce the number of variables, to remove correlation among variables and to transform categorical variables to continuous variables. Next the method applies common-weight data envelope analysis to these principal components to individually score each farm. The method solves weights endogenously and allows identifying important practices in sustainability evaluation. An empirical application to Wisconsin cranberry farms finds heterogeneity in sustainability practice adoption, implying that some farms could adopt relevant practices to improve the overall sustainability performance of the industry. Copyright © 2014 Elsevier Ltd. All rights reserved.
Principal Component Analysis In Radar Polarimetry
Directory of Open Access Journals (Sweden)
A. Danklmayer
2005-01-01
Second order moments of multivariate (often Gaussian) joint probability density functions can be described by the covariance or normalised correlation matrices or by the Kennaugh matrix (Kronecker matrix). In Radar Polarimetry the application of the covariance matrix is known as target decomposition theory, which is a special application of the extremely versatile Principal Component Analysis (PCA). The basic idea of PCA is to convert a data set consisting of correlated random variables into a new set of uncorrelated variables and to order the new variables according to the value of their variances. It is important to stress that uncorrelatedness does not necessarily mean independence, which is the much stronger concept used in Independent Component Analysis (ICA). Both concepts agree for multivariate Gaussian distribution functions, representing the most random and least structured distribution. In this contribution, we propose a new approach in applying the concept of PCA to Radar Polarimetry. New uncorrelated random variables are introduced by means of linear transformations with well determined loading coefficients. This, in turn, allows the decomposition of the original random backscattering target variables into three point targets with new random uncorrelated variables whose variances agree with the eigenvalues of the covariance matrix. This allows a new interpretation of existing decomposition theorems.
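The basic idea stated in this abstract (rotate correlated variables into uncorrelated ones whose variances equal the eigenvalues of the covariance matrix) can be checked numerically. The following is a minimal two-variable sketch in plain Python, not the polarimetric decomposition of the paper; the synthetic data and the function names `make_correlated_points`, `covariance_2d` and `pca_2d` are assumptions introduced for illustration.

```python
import math
import random

def make_correlated_points(n, seed=0):
    """Hypothetical test data: two strongly correlated variables."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        points.append((x, 0.8 * x + 0.3 * rng.gauss(0.0, 1.0)))
    return points

def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def pca_2d(points):
    """Return eigenvalues (descending), unit eigenvectors and centred scores."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx, sxy, syy = covariance_2d(points)
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    centre = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = centre + radius, centre - radius
    # Angle of the first principal axis (direction of largest variance).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    # Project the centred data onto the two orthogonal axes.
    scores = [((x - mx) * v1[0] + (y - my) * v1[1],
               (x - mx) * v2[0] + (y - my) * v2[1]) for x, y in points]
    return (lam1, lam2), (v1, v2), scores

points = make_correlated_points(200)
(lam1, lam2), axes, scores = pca_2d(points)
var1, cov12, var2 = covariance_2d(scores)
print("eigenvalues:", lam1, lam2)
print("score variances:", var1, var2, "score covariance:", cov12)
```

The printed score variances should match the eigenvalues and the score covariance should vanish up to rounding, which is the decorrelation property the abstract relies on. Uncorrelated scores are not necessarily independent, as the abstract stresses.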
Water quality assessment using SVD-based principal component ...
African Journals Online (AJOL)
Water quality assessment using SVD-based principal component analysis of hydrological data. ... value decomposition (SVD) of hydrological data was tested for water quality assessment. ...
PRINCIPAL COMPONENT ANALYSIS - A POWERFUL TOOL IN COMPUTING MARKETING INFORMATION
National Research Council Canada - National Science Library
Cristinel Constantin
2014-01-01
... that need to solve the marketing problem a company face with. The literature stresses the need to avoid the multicollinearity phenomenon in multivariate analysis and the features of Principal Component Analysis (PCA...
Weighted principal component analysis: a weighted covariance eigendecomposition approach
Delchambre, Ludovic
2014-01-01
We present a new straightforward principal component analysis (PCA) method based on the diagonalization of the weighted variance-covariance matrix through two spectral decomposition methods: power iteration and Rayleigh quotient iteration. This method allows one to retrieve a given number of orthogonal principal components amongst the most meaningful ones for the case of problems with weighted and/or missing data. Principal coefficients are then retrieved by fitting principal components to the data while providing the final decomposition. Tests performed on real and simulated cases show that our method is optimal in the identification of the most significant patterns within data sets. We illustrate the usefulness of this method by assessing its quality on the extrapolation of Sloan Digital Sky Survey quasar spectra from measured wavelengths to shorter and longer wavelengths. Our new algorithm also benefits from a fast and flexible implementation.
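As a rough illustration of the approach described here, the sketch below builds a weighted covariance matrix and extracts its dominant eigenvector by power iteration. It is a simplified reading, not the authors' implementation: it assumes one weight per observation (the paper also handles missing data), it omits Rayleigh quotient iteration and the extraction of further components, and the names `weighted_covariance` and `power_iteration` are hypothetical.

```python
import math

def weighted_covariance(rows, weights):
    """Weighted covariance matrix of observations, one weight per row."""
    total = sum(weights)
    dim = len(rows[0])
    mean = [sum(w * r[j] for r, w in zip(rows, weights)) / total
            for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for r, w in zip(rows, weights):
        d = [r[j] - mean[j] for j in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += w * d[i] * d[j] / total
    return cov

def power_iteration(matrix, iters=500, tol=1e-12):
    """Dominant eigenpair of a symmetric positive semi-definite matrix."""
    dim = len(matrix)
    v = [1.0 / math.sqrt(dim)] * dim   # start vector; must not be orthogonal to the answer
    eigval = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        # Rayleigh quotient of the normalised vector estimates the eigenvalue.
        new_eigval = sum(w[i] * sum(matrix[i][j] * w[j] for j in range(dim))
                         for i in range(dim))
        converged = abs(new_eigval - eigval) < tol
        v, eigval = w, new_eigval
        if converged:
            break
    return eigval, v

# Four points along the direction (2, 1) plus one outlier with zero weight.
rows = [(2.0, 1.0), (-2.0, -1.0), (1.0, 0.5), (-1.0, -0.5), (0.0, 5.0)]
weights = [1.0, 1.0, 1.0, 1.0, 0.0]
eigval, direction = power_iteration(weighted_covariance(rows, weights))
print(eigval, direction)
```

Because the outlier carries zero weight, the recovered direction is proportional to (2, 1); giving it a nonzero weight would pull the first component toward the second axis.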
Principal Component-Discrimination Model and Its Application
Institute of Scientific and Technical Information of China (English)
韩天锡; 魏雪丽; 蒋淳; 张玉琍
2004-01-01
After many years of research, seismologists in China have proposed about 80 earthquake prediction factors that reflect precursory information about earthquakes. How to concentrate the information carried by these 80 factors and how to choose the main factors for predicting earthquakes precisely has become one of the topics in seismology. The principal component-discrimination model consists of principal component analysis, correlation analysis, the weighted method of principal factor coefficients, and Mahalanobis distance discrimination analysis. This model combines the method of maximizing earthquake prediction factor information with the weighted method of principal factor coefficients and correlation analysis to choose earthquake prediction variables, and applies Mahalanobis distance discrimination to establish an earthquake prediction discrimination model. The model was applied to earthquake data from Northern China and obtained good prediction results.
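The last stage of the model, Mahalanobis distance discrimination, assigns an observation to the group whose mean is nearest once each group's covariance is taken into account. The sketch below shows only that step for two variables; the group names, means and covariances are invented for illustration and are not taken from the paper.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a group."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))   # inverse of the 2x2 covariance
    dx = (x[0] - mean[0], x[1] - mean[1])
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))

def classify(x, classes):
    """Return the name of the group with the smallest Mahalanobis distance."""
    return min(classes, key=lambda name: mahalanobis_sq(x, *classes[name]))

# Hypothetical groups: (mean, covariance) per group.
classes = {
    "group_a": ((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "group_b": ((4.0, 0.0), ((9.0, 0.0), (0.0, 1.0))),
}
point = (1.5, 0.0)
print(classify(point, classes))
```

The point is closer to the mean of group_a in Euclidean terms (1.5 versus 2.5), yet it is assigned to group_b because that group is much more spread out along the first axis. In the model described above, the inputs would be the selected principal component variables rather than raw factors.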
Longitudinal functional principal component modelling via Stochastic Approximation Monte Carlo
Martinez, Josue G.
2010-06-01
The authors consider the analysis of hierarchical longitudinal functional data based upon a functional principal components approach. In contrast to standard frequentist approaches to selecting the number of principal components, the authors do model averaging using a Bayesian formulation. A relatively straightforward reversible jump Markov Chain Monte Carlo formulation has poor mixing properties and in simulated data often becomes trapped at the wrong number of principal components. In order to overcome this, the authors show how to apply Stochastic Approximation Monte Carlo (SAMC) to this problem, a method that has the potential to explore the entire space and does not become trapped in local extrema. The combination of reversible jump methods and SAMC in hierarchical longitudinal functional data is simplified by a polar coordinate representation of the principal components. The approach is easy to implement and does well in simulated data in determining the distribution of the number of principal components, and in terms of its frequentist estimation properties. Empirical applications are also presented.
Shen, Chenhua
2017-02-01
We applied traditional principal component analysis (TPCA) and nonstationary principal component analysis (NSPCA) to determine principal components in the six daily air-pollutant concentration series (SO2, NO2, CO, O3, PM2.5 and PM10) in Nanjing from January 2013 to March 2016. The results show that using TPCA, two principal components can reflect the variance of these series: primary pollutants (SO2, NO2, CO, PM2.5 and PM10) and secondary pollutants (e.g., O3). However, using NSPCA, three principal components can be determined to reflect the detrended variance of these series: 1) a mixture of primary and secondary pollutants, 2) primary pollutants and 3) secondary pollutants. Different approaches can thus yield different principal components. This phenomenon is closely related to the methods for calculating the cross-correlation between each of the air pollutants. NSPCA is a more applicable and reliable method than TPCA for analyzing the principal components of series in the presence of nonstationarity and long-range correlation. Moreover, using detrended cross-correlation analysis (DCCA), the cross-correlation between O3 and NO2 is negative at a short timescale and positive at a long timescale. In hourly timescales, O3 is negatively correlated with NO2 due to a photochemical interaction, and in daily timescales, O3 is positively correlated with NO2 because of the decomposition of O3. In monthly timescales, the cross-correlation between O3 and NO2 behaves similarly to that between O3 and meteorological elements. DCCA is again shown to be more appropriate than Pearson's method for disclosing the cross-correlation between series in the presence of nonstationarity. DCCA can improve our understanding of their interaction mechanisms.
Sparse logistic principal components analysis for binary data
Lee, Seokho
2010-09-01
We develop a new principal components analysis (PCA) type dimension reduction method for binary data. Different from the standard PCA which is defined on the observed data, the proposed PCA is defined on the logit transform of the success probabilities of the binary observations. Sparsity is introduced to the principal component (PC) loading vectors for enhanced interpretability and more stable extraction of the principal components. Our sparse PCA is formulated as solving an optimization problem with a criterion function motivated from a penalized Bernoulli likelihood. A Majorization-Minimization algorithm is developed to efficiently solve the optimization problem. The effectiveness of the proposed sparse logistic PCA method is illustrated by application to a single nucleotide polymorphism data set and a simulation study. © Institute of Mathematical Statistics, 2010.
Using Kernel Principal Components for Color Image Segmentation
Wesolkowski, Slawo
2002-11-01
Distinguishing objects on the basis of color is fundamental to humans. In this paper, a clustering approach is used to segment color images. Clustering is usually done using a single point or vector as a cluster prototype. The data can be clustered in the input or feature space where the feature space is some nonlinear transformation of the input space. The idea of kernel principal component analysis (KPCA) was introduced to align data along principal components in the kernel or feature space. KPCA is a nonlinear transformation of the input data that finds the eigenvectors along which this data has maximum information content (or variation). The principal components resulting from KPCA are nonlinear in the input space and represent principal curves. This is a necessary step as colors in RGB are not linearly correlated especially considering illumination effects such as shading or highlights. The performance of the k-means (Euclidean distance-based) and Mixture of Principal Components (vector angle-based) algorithms are analyzed in the context of the input space and the feature space obtained using KPCA. Results are presented on a color image segmentation task. The results are discussed and further extensions are suggested.
Strictly nonnegative tensors and nonnegative tensor partition
Institute of Scientific and Technical Information of China (English)
HU ShengLong; HUANG ZhengHai; QI LiQun
2014-01-01
We introduce a new class of nonnegative tensors: strictly nonnegative tensors. A weakly irreducible nonnegative tensor is a strictly nonnegative tensor but not vice versa. We show that the spectral radius of a strictly nonnegative tensor is always positive. We give some necessary and sufficient conditions for the six well-conditional classes of nonnegative tensors introduced in the literature, and a full relationship picture about strictly nonnegative tensors with these six classes of nonnegative tensors. We then establish global R-linear convergence of a power method for finding the spectral radius of a nonnegative tensor under the condition of weak irreducibility. We show that for a nonnegative tensor T, there always exists a partition of the index set such that every tensor induced by the partition is weakly irreducible; and the spectral radius of T can be obtained from those spectral radii of the induced tensors. In this way, we develop a convergent algorithm for finding the spectral radius of a general nonnegative tensor without any additional assumption. Some preliminary numerical results show the feasibility and effectiveness of the algorithm.
Non-negative matrix factorization and term structure of interest rates
Takada, Hellinton H.; Stern, Julio M.
2015-01-01
Non-Negative Matrix Factorization (NNMF) is a technique for dimensionality reduction with a wide variety of applications from text mining to identification of concentrations in chemistry. NNMF deals with non-negative data and results in non-negative factors and factor loadings. Consequently, it is a natural choice when studying the term structure of interest rates. In this paper, NNMF is applied to obtain factors from the term structure of interest rates and the procedure is compared with other very popular techniques: principal component analysis and Nelson-Siegel model. The NNMF approximation for the term structure of interest rates is better in terms of fitting. From a practitioner point of view, the NNMF factors and factor loadings obtained possess straightforward financial interpretations due to their non-negativeness.
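To make the non-negativity property concrete, here is a small sketch of NNMF using the widely used multiplicative update rules of Lee and Seung for the squared-error objective. The abstract does not say which algorithm the authors used, so this choice is an assumption, and the matrix `rates` is a made-up toy example (rows as dates, columns as maturities) rather than market data.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor non-negative V (m x n) as W (m x rank) times H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors

# Toy "term structure": every row mixes a sloped curve (1, 2, 3) and a flat one (1, 1, 1).
rates = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [2.0, 3.0, 4.0]]
w, h, errors = nmf(rates, rank=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors and loadings stay non-negative throughout, which is the property the abstract credits for their straightforward interpretation. PCA loadings, by contrast, generally mix signs.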
The application of Principal Component Analysis to materials science data
Directory of Open Access Journals (Sweden)
Changwon Suh
2006-01-01
The relationship between apparently disparate sets of data is a critical component of interpreting materials' behavior, especially in terms of assessing the impact of the microscopic characteristics of materials on their macroscopic or engineering behavior. In this paper we demonstrate the value of principal component analysis of property data associated with high temperature superconductivity to examine the statistical impact of the materials' intrinsic characteristics on high temperature superconducting behavior.
PRINCIPAL COMPONENT ANALYSIS IN APPLICATION TO OBJECT ORIENTATION
Institute of Scientific and Technical Information of China (English)
(no author listed)
2000-01-01
This paper proposes a new method based on principal component analysis to find the direction of an object in any pose. Experiments show that this method is fast, can be applied to objects with any pixel distribution, and keeps the original properties of objects invariant. It is a new application of PCA in image analysis.
Principal component analysis of image gradient orientations for face recognition
Tzimiropoulos, Georgios; Zafeiriou, Stefanos; Pantic, Maja
We introduce the notion of Principal Component Analysis (PCA) of image gradient orientations. As image data is typically noisy, but noise is substantially different from Gaussian, traditional PCA of pixel intensities very often fails to estimate reliably the low-dimensional subspace of a given data
Sparse Principal Component Analysis in Medical Shape Modeling
DEFF Research Database (Denmark)
Sjöstrand, Karl; Stegmann, Mikkel Bille; Larsen, Rasmus
2006-01-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims...
Incremental principal component pursuit for video background modeling
Energy Technology Data Exchange (ETDEWEB)
Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt
2017-03-14
An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
Sparse principal component analysis in hyperspectral change detection
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Larsen, Rasmus; Vestergaard, Jacob Schack
2011-01-01
This contribution deals with change detection by means of sparse principal component analysis (PCA) of simple differences of calibrated, bi-temporal HyMap data. Results show that if we retain only 15 nonzero loadings (out of 126) in the sparse PCA the resulting change scores appear visually very ...
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
The dynamics of on-line principal component analysis
Biehl, M.; Schlösser, E.
1998-01-01
The learning dynamics of an on-line algorithm for principal component analysis is described exactly in the thermodynamic limit by means of coupled ordinary differential equations for a set of order parameters. It is demonstrated that learning is delayed significantly because existing symmetries amon
Convergence of algorithms used for principal component analysis
Institute of Scientific and Technical Information of China (English)
张俊华; 陈翰馥
1997-01-01
The convergence of algorithms used for principal component analysis is analyzed. The algorithms are proved to converge to eigenvectors and eigenvalues of a matrix A which is the expectation of observed random samples. The conditions required here are considerably weaker than those used in previous work.
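The abstract does not name the algorithms it analyzes. As one well-known example of a sample-by-sample scheme of this kind, the sketch below uses Oja's rule, which updates a weight vector from each observed sample and tends toward the leading eigenvector of the expected matrix A = E[x x^T]. The learning rate, the synthetic data and the names `oja_leading_direction` and `make_samples` are assumptions made for illustration.

```python
import math
import random

def make_samples(n, direction, seed=0):
    """Hypothetical data: strong signal along `direction` plus isotropic noise."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a = rng.gauss(0.0, 3.0)
        samples.append([a * d + rng.gauss(0.0, 0.5) for d in direction])
    return samples

def oja_leading_direction(samples, rate=0.002):
    """Estimate the leading eigenvector of E[x x^T] one sample at a time."""
    dim = len(samples[0])
    w = [1.0] + [0.0] * (dim - 1)          # arbitrary unit-length start
    for x in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))                      # projection w . x
        w = [wi + rate * y * (xi - y * wi) for wi, xi in zip(w, x)]   # Oja update
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]

true_direction = (math.cos(math.pi / 6), math.sin(math.pi / 6))
estimate = oja_leading_direction(make_samples(5000, true_direction))
alignment = abs(sum(e * t for e, t in zip(estimate, true_direction)))
print("estimated direction:", estimate, "alignment with truth:", alignment)
```

For this data the expected matrix is 9 u u^T + 0.25 I with u the true direction, so its leading eigenvector is u and the alignment should be close to 1. A constant learning rate leaves a small residual fluctuation; convergence proofs of the kind mentioned in the abstract typically assume decreasing step sizes.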
Principal Component Analysis: Most Favourite Tool in Chemometrics
Indian Academy of Sciences (India)
Keshav Kumar
2017-08-01
Principal component analysis (PCA) is the most commonly used chemometric technique. It is an unsupervised pattern recognition technique. PCA has found applications in chemistry, biology, medicine and economics. The present work attempts to understand how PCA works and how we can interpret its results.
Principal Component Surface (2011) for Fish Bay, St. John
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas inside Fish Bay, St. John in the U.S. Virgin Islands (USVI). It was...
Principal Component Surface (2011) for Coral Bay, St. John
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas inside Coral Bay, St. John in the U.S. Virgin Islands (USVI). It was...
PEMBUATAN PERANGKAT LUNAK PENGENALAN WAJAH MENGGUNAKAN PRINCIPAL COMPONENTS ANALYSIS
Directory of Open Access Journals (Sweden)
Kartika Gunadi
2001-01-01
Face recognition is an important research area, and many applications now implement it. Through the development of techniques such as Principal Components Analysis (PCA), computers can now outperform humans in many face recognition tasks, particularly those in which a large database of faces must be searched. Principal Components Analysis was used to reduce the dimension of the facial images to fewer variables, which are easier to observe and handle. Those variables were then fed into an artificial neural network trained with the backpropagation method to recognise the given facial image. The test results show that PCA can provide high face recognition accuracy. For the training faces, 100% correct identification could be obtained. Among the network combinations that were tested, the best average correct identification for the test faces was 91.11%, while the worst average result was 46.67%.
Principal Component Analysis - A Powerful Tool in Computing Marketing Information
Directory of Open Access Journals (Sweden)
Constantin C.
2014-12-01
This paper presents instrumental research on a powerful multivariate data analysis method that researchers can use to obtain valuable information for decision makers who need to solve the marketing problems a company faces. The literature stresses the need to avoid multicollinearity in multivariate analysis, and highlights the ability of Principal Component Analysis (PCA) to reduce a set of variables that may be correlated with each other to a small number of uncorrelated principal components. Accordingly, the paper presents, step by step, the process of applying PCA in marketing research when a large number of naturally collinear variables is used.
QUALITY CONTROL OF SEMICONDUCTOR PACKAGING BASED ON PRINCIPAL COMPONENTS ANALYSIS
Institute of Scientific and Technical Information of China (English)
[No author listed]
2007-01-01
Five critical quality characteristics must be controlled in the surface-mount and wire-bond processes in semiconductor packaging, and these characteristics are correlated with each other. Principal components analysis (PCA) is therefore first applied to the sample data. The process is then controlled with a Hotelling T² control chart for the first several principal components, which contain sufficient information. Furthermore, a software tool is developed for this kind of problem. With sample data from a surface mounting device (SMD) process, it is demonstrated that the T² control chart with PCA reaches the same conclusion as the chart without PCA, while the problem is transformed from a high-dimensional one to a lower-dimensional one, i.e., from 5 to 2 dimensions in this demonstration.
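As a rough illustration of the PCA-plus-T² idea in the abstract above, the following Python sketch computes a Hotelling T² value per sample on the leading principal components. The data are synthetic and hypothetical (5 correlated characteristics driven by 2 latent factors), the function name `pca_hotelling_t2` is an assumption, and no control limit is computed; it is not the software tool described in the paper.

```python
import numpy as np

def pca_hotelling_t2(X, n_components):
    """Return Hotelling T^2 per sample, computed on the leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each characteristic
    cov = np.cov(Xc, rowvar=False)               # sample covariance (ddof=1)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]                         # variances of the retained components
    scores = Xc @ eigvecs[:, order]              # principal component scores
    t2 = np.sum(scores ** 2 / lam, axis=1)       # T^2 = sum_j score_j^2 / lambda_j
    return t2, lam

# Hypothetical data: 200 samples of 5 correlated characteristics driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

t2, lam = pca_hotelling_t2(X, n_components=2)
print("three largest T^2 values:", np.sort(t2)[-3:])
```

In a control chart, each T² value would be compared against an upper control limit (commonly derived from an F distribution); that step is left out here.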
SAS program for quantitative stratigraphic correlation by principal components
Hohn, M.E.
1985-01-01
A SAS program is presented which constructs a composite section of stratigraphic events through principal components analysis. The variables in the analysis are stratigraphic sections and the observational units are range limits of taxa. The program standardizes data in each section, extracts eigenvectors, estimates missing range limits, and computes the composite section from scores of events on the first principal component. Several types of diagnostic plots are provided as an option; these help one to determine conservative range limits or unrealistic estimates of missing values. Inspection of the graphs and eigenvalues allows one to evaluate goodness of fit between the composite and measured data. The program is extended easily to the creation of a rank-order composite. © 1985.
Distribution of the residual roots in principal components analysis.
Directory of Open Access Journals (Sweden)
A. M. Kshirsagar
1964-10-01
The distribution of the latent roots of the covariance matrix of normal variables, when a hypothetical linear function of the variables is eliminated, is derived in this paper. The relation between the original roots and the residual roots after elimination is also derived by an analytical method. An exact test for the goodness of fit of a single nonisotropic hypothetical principal component, using the residual roots, is then obtained.
Selecting the Number of Principal Components in Functional Data.
Li, Yehua; Wang, Naisyin; Carroll, Raymond J
2013-12-19
Functional principal component analysis (FPCA) has become the most widely used dimension reduction tool for functional data analysis. We consider functional data measured at random, subject-specific time points, contaminated with measurement error, allowing for both sparse and dense functional data, and propose novel information criteria to select the number of principal components in such data. We propose a Bayesian information criterion based on marginal modeling that can consistently select the number of principal components for both sparse and dense functional data. For dense functional data, we also developed an Akaike information criterion (AIC) based on the expected Kullback-Leibler information under a Gaussian assumption. In connecting with factor analysis in multivariate time series data, we also consider the information criteria by Bai & Ng (2002) and show that they are still consistent for dense functional data, if a prescribed undersmoothing scheme is undertaken in the FPCA algorithm. We perform intensive simulation studies and show that the proposed information criteria vastly outperform existing methods for this type of data. Surprisingly, our empirical evidence shows that our information criteria proposed for dense functional data also perform well for sparse functional data. An empirical example using colon carcinogenesis data is also provided to illustrate the results.
Selecting the Number of Principal Components in Functional Data
Li, Yehua
2013-12-01
Functional principal component analysis (FPCA) has become the most widely used dimension reduction tool for functional data analysis. We consider functional data measured at random, subject-specific time points, contaminated with measurement error, allowing for both sparse and dense functional data, and propose novel information criteria to select the number of principal components in such data. We propose a Bayesian information criterion based on marginal modeling that can consistently select the number of principal components for both sparse and dense functional data. For dense functional data, we also develop an Akaike information criterion based on the expected Kullback-Leibler information under a Gaussian assumption. In connecting with the time series literature, we also consider a class of information criteria proposed for factor analysis of multivariate time series and show that they are still consistent for dense functional data, if a prescribed undersmoothing scheme is undertaken in the FPCA algorithm. We perform intensive simulation studies and show that the proposed information criteria vastly outperform existing methods for this type of data. Surprisingly, our empirical evidence shows that our information criteria proposed for dense functional data also perform well for sparse functional data. An empirical example using colon carcinogenesis data is also provided to illustrate the results. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Functional Principal Components Analysis of Shanghai Stock Exchange 50 Index
Directory of Open Access Journals (Sweden)
Zhiliang Wang
2014-01-01
The main purpose of this paper is to explore the principal components of the Shanghai Stock Exchange 50 index by means of functional principal component analysis (FPCA). Functional data analysis (FDA) deals with random variables (or processes) with realizations in a smooth functional space. One of the most popular FDA techniques is functional principal component analysis, which was introduced for the statistical analysis of a set of financial time series from an explorative point of view. FPCA is the functional analogue of the well-known dimension reduction technique in multivariate statistical analysis, which searches for linear transformations of the random vector with maximal variance. In this paper, we studied the monthly return volatility of the Shanghai Stock Exchange 50 index (SSE50). Using FPCA to reduce the dimension to a finite level, we extracted the most significant components of the data and some relevant statistical features of such related datasets. The calculated results show that regarding the samples as random functions is rational. Compared with ordinary principal component analysis, FPCA can solve the problem of different dimensions in the samples, and it is a convenient approach for extracting the main variance factors.
Principal components analysis in the space of phylogenetic trees
Nye, Tom M W
2012-01-01
Phylogenetic analysis of DNA or other data commonly gives rise to a collection or sample of inferred evolutionary trees. Principal Components Analysis (PCA) cannot be applied directly to collections of trees since the space of evolutionary trees on a fixed set of taxa is not a vector space. This paper describes a novel geometrical approach to PCA in tree-space that constructs the first principal path in an analogous way to standard linear Euclidean PCA. Given a data set of phylogenetic trees, a geodesic principal path is sought that maximizes the variance of the data under a form of projection onto the path. Due to the high dimensionality of tree-space and the nonlinear nature of this problem, the computational complexity is potentially very high, so approximate optimization algorithms are used to search for the optimal path. Principal paths identified in this way reveal and quantify the main sources of variation in the original collection of trees in terms of both topology and branch lengths. The approach is...
Principal component analysis of minimal excitatory postsynaptic potentials.
Astrelin, A V; Sokolov, M V; Behnisch, T; Reymann, K G; Voronin, L L
1998-02-20
'Minimal' excitatory postsynaptic potentials (EPSPs) are often recorded from central neurones, specifically for quantal analysis. However the EPSPs may emerge from activation of several fibres or transmission sites so that formal quantal analysis may give false results. Here we extended application of the principal component analysis (PCA) to minimal EPSPs. We tested a PCA algorithm and a new graphical 'alignment' procedure against both simulated data and hippocampal EPSPs. Minimal EPSPs were recorded before and up to 3.5 h following induction of long-term potentiation (LTP) in CA1 neurones. In 29 out of 45 EPSPs, two (N=22) or three (N=7) components were detected which differed in latencies, rise time (Trise) or both. The detected differences ranged from 0.6 to 7.8 ms for the latency and from 1.6-9 ms for Trise. Different components behaved differently following LTP induction. Cases were found when one component was potentiated immediately after tetanus whereas the other with a delay of 15-60 min. The immediately potentiated component could decline in 1-2 h so that the two components contributed differently into early (reflections of synchronized quantal releases. In general, the results demonstrate PCA applicability to separate EPSPs into different components and its usefulness for precise analysis of synaptic transmission.
Quality Aware Compression of Electrocardiogram Using Principal Component Analysis.
Gupta, Rajarshi
2016-05-01
Electrocardiogram (ECG) compression finds wide application in various patient monitoring purposes. Quality control in ECG compression ensures reconstruction quality and its clinical acceptance for diagnostic decision making. In this paper, a quality aware compression method of single lead ECG is described using principal component analysis (PCA). After pre-processing, beat extraction and PCA decomposition, two independent quality criteria, namely, bit rate control (BRC) or error control (EC) criteria were set to select optimal principal components, eigenvectors and their quantization level to achieve desired bit rate or error measure. The selected principal components and eigenvectors were finally compressed using a modified delta and Huffman encoder. The algorithms were validated with 32 sets of MIT Arrhythmia data and 60 normal and 30 sets of diagnostic ECG data from PTB Diagnostic ECG data ptbdb, all at 1 kHz sampling. For BRC with a CR threshold of 40, an average Compression Ratio (CR), percentage root mean squared difference normalized (PRDN) and maximum absolute error (MAE) of 50.74, 16.22 and 0.243 mV respectively were obtained. For EC with an upper limit of 5 % PRDN and 0.1 mV MAE, the average CR, PRDN and MAE of 9.48, 4.13 and 0.049 mV respectively were obtained. For mitdb data 117, the reconstruction quality could be preserved up to CR of 68.96 by extending the BRC threshold. The proposed method yields better results than recently published works on quality controlled ECG compression.
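The error-control idea in the abstract above (keep only as many principal components as needed to stay under an error limit) can be sketched as follows. This is a minimal illustration on synthetic, hypothetical "beats"; the function names are assumptions, the PRDN formula used is the commonly cited mean-removed definition, and the paper's quantization, delta and Huffman encoding stages are not reproduced.

```python
import numpy as np

def pca_compress(beats, k):
    """Keep the first k principal components of a (n_beats, n_samples) matrix."""
    mean = beats.mean(axis=0)
    U, s, Vt = np.linalg.svd(beats - mean, full_matrices=False)
    scores = U[:, :k] * s[:k]        # per-beat coefficients on the retained components
    basis = Vt[:k]                   # retained eigenvectors (one per row)
    return mean, scores, basis

def pca_reconstruct(mean, scores, basis):
    """Rebuild the beats from the retained scores and eigenvectors."""
    return scores @ basis + mean

def prdn(x, x_hat):
    """Normalized percentage root-mean-squared difference (assumed definition)."""
    return 100.0 * np.sqrt(np.sum((x - x_hat) ** 2) / np.sum((x - x.mean()) ** 2))

def smallest_k_within_error(beats, prdn_limit):
    """Smallest number of components whose reconstruction meets the PRDN limit."""
    max_k = min(beats.shape)
    for k in range(1, max_k + 1):
        if prdn(beats, pca_reconstruct(*pca_compress(beats, k))) <= prdn_limit:
            return k
    return max_k

# Hypothetical beats: 40 Gaussian-shaped pulses with varying amplitude and width plus noise.
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 100)
amp = 1.0 + 0.2 * rng.normal(size=(40, 1))
width = 0.05 + 0.01 * rng.random(size=(40, 1))
beats = amp * np.exp(-t ** 2 / (2 * width ** 2)) + 0.005 * rng.normal(size=(40, 100))

k = smallest_k_within_error(beats, prdn_limit=5.0)
recon = pca_reconstruct(*pca_compress(beats, k))
print("components kept:", k, "PRDN (%):", prdn(beats, recon))
```

A bit-rate-controlled variant would instead fix the number of components (and their quantization) from a target rate and report the resulting error.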
Principal Component Analysis on Semen Quality among Chinese Young Men
Institute of Scientific and Technical Information of China (English)
Jun-qing WU; Er-sheng GAO; Jian-guo TAO; Cui-ling LIANG; Wenying LI; Qiu-ying YANG; Kang-shou YAO; Wei-qun LU; Lu CHEN
2003-01-01
Objective: To understand the current semen quality status among Chinese young men and its influential factors, and to explore an evaluation index. Methods: A total of 562 healthy male volunteers were recruited during their premarital examinations at MCH centers in seven provincial and municipal regions; descriptive and principal component analyses were used to analyze the data. Results: The findings were: semen volume 2.61±1.10 mL, sperm density (64.47±34.59)×10^6/mL, percentage of sperm forward progression 59.89%±17.11%, percentage of sperm viability 77.19%±11.87%, and percentage of normal sperm morphology 78.23%±9.15%. The first principal component function is Z1 = -8.51254 + 0.00136X1' + 0.03192X2' + 0.04352X3' + 0.03984X4', which is closely related to the percentage of sperm viability (X3), percentage of sperm forward progression (X2), and percentage of normal sperm morphology (X4). The second principal component function is Z2 = 0.49192 + 0.08080X1 - 0.00058X2 - 0.00510X3 - 0.01807X4, which depends on the total sperm count (X1). Conclusion: Only 42.3% of subjects met all the common WHO standards of semen quality. Multiple analysis of Z1 showed that the highest Z1 values were among subjects from Guizhou, workers, or town residents. Multiple analysis of Z2 showed that the older the subjects were at their first sexual impulse and the longer the period of sexual abstinence, the greater the quantity of sperm; the more sexual activity the subjects had, the smaller the amount of sperm.
Self-aggregation in scaled principal component space
Energy Technology Data Exchange (ETDEWEB)
Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan; Simon, Horst D.
2001-10-05
Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.
PRINCIPAL COMPONENT ANALYSIS (PCA) DAN APLIKASINYA DENGAN SPSS [Principal Component Analysis (PCA) and Its Application with SPSS]
Directory of Open Access Journals (Sweden)
Hermita Bus Umar
2009-03-01
PCA (Principal Component Analysis) is a statistical technique applied to a single set of variables when the researcher is interested in discovering which variables in the set form coherent subsets that are relatively independent of one another. Variables that are correlated with one another but largely independent of other subsets of variables are combined into factors. The goals of PCA include determining the extent to which each variable is explained by each dimension. The steps in PCA include selecting and measuring a set of variables, preparing the correlation matrix, extracting a set of factors from the correlation matrix, rotating the factors to increase interpretability, and interpreting the results.
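The steps listed in the abstract above can be sketched in Python (rather than SPSS, which the paper uses) as follows. This is a minimal sketch on synthetic, hypothetical data; the function name `pca_from_correlation` is an assumption, and the rotation step (e.g. varimax) is deliberately omitted.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA on the correlation matrix: eigenvalues, loadings, and component scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # standardize each variable
    R = np.corrcoef(X, rowvar=False)                     # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)                 # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                    # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings are the correlations between variables (rows) and components (columns).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    scores = Z @ eigvecs
    return eigvals, loadings, scores

# Hypothetical data: two variables sharing a common factor, two independent variables.
rng = np.random.default_rng(0)
common = rng.normal(size=(300, 1))
X = np.hstack([common + 0.5 * rng.normal(size=(300, 2)), rng.normal(size=(300, 2))])

eigvals, loadings, scores = pca_from_correlation(X)
explained = loadings ** 2    # share of each variable's variance explained by each component
print(np.round(explained, 3))
```

Each row of `explained` sums to 1 when all components are kept, which is one way to read "the extent to which each variable is explained by each dimension".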
Nonlinear Principal Component Analysis Using Strong Tracking Filter
Institute of Scientific and Technical Information of China (English)
无
2007-01-01
The paper analyzes the problem of blind source separation (BSS) based on the nonlinear principal component analysis (NPCA) criterion. An adaptive strong tracking filter (STF) based algorithm was developed, which is immune to system model mismatches. Simulations demonstrate that the algorithm converges quickly and has satisfactory steady-state accuracy. The Kalman filtering algorithm and the recursive least-squares type algorithm are shown to be special cases of the STF algorithm. Since the forgetting factor is adaptively updated by adjustment of the Kalman gain, the STF scheme provides more powerful tracking capability than the Kalman filtering algorithm and the recursive least-squares algorithm.
ANOVA-principal component analysis and ANOVA-simultaneous component analysis: a comparison.
Zwanenburg, G.; Hoefsloot, H.C.J.; Westerhuis, J.A.; Jansen, J.J.; Smilde, A.K.
2011-01-01
ANOVA-simultaneous component analysis (ASCA) is a recently developed tool to analyze multivariate data. In this paper, we enhance the explorative capability of ASCA by introducing a projection of the observations on the principal component subspace to visualize the variation among the measurements.
Manisera, M.; Kooij, A.J. van der; Dusseldorp, E.
2010-01-01
The component structure of 14 Likert-type items measuring different aspects of job satisfaction was investigated using nonlinear Principal Components Analysis (NLPCA). NLPCA allows for analyzing these items at an ordinal or interval level. The participants were 2066 workers from five types of social
Robust Principal Component Test in Gross Error Detection and Identification
Institute of Scientific and Technical Information of China (English)
无
2007-01-01
The principal component analysis (PCA) based chi-square test is more sensitive to subtle gross errors and has greater power to correctly detect gross errors than the classical chi-square test. However, the classical principal component test (PCT) is non-robust and can be very sensitive to one or more outliers. In this paper, a Huber-function-like robust weight factor was added to the collective chi-square test to eliminate the influence of gross errors on the PCT. Meanwhile, the robust chi-square test was applied to the modified simultaneous estimation of gross error (MSEGE) strategy to detect and identify multiple gross errors. Simulation results show that the proposed robust test can effectively reduce the possibility of type Ⅱ errors. Although adding the robust chi-square test to MSEGE does not obviously improve the power of multiple gross error identification, the proposed approach considers the influence of outliers on the hypothesis test statistic and is more reasonable.
Principal Component Analysis of Thermal Dorsal Hand Vein Pattern Architecture
Directory of Open Access Journals (Sweden)
V. Krishna Sree
2012-12-01
The quest to provide more secure identification systems has led to a rise in the development of biometric systems. Biometrics such as face, fingerprint and iris have been developed extensively over the past few decades for human identification and to provide authentic input to many security systems. The dorsal hand vein pattern is an emerging biometric that is unique to every individual. In this study, principal component analysis is used to obtain eigen vein patterns, which are low-dimensional representations of vein pattern features. The vein patterns were extracted by morphological techniques, and noise reduction filters were used to enhance them. Principal component analysis is able to reduce the 2-dimensional image database into 1-dimensional eigenvectors and to identify all the dorsal hand pattern images.
Improve Survival Prediction Using Principal Components of Gene Expression Data
Institute of Scientific and Technical Information of China (English)
Yi-Jing Shen; Shu-Guang Huang
2006-01-01
The purpose of many microarray studies is to find the association between gene expression and sample characteristics such as treatment type or sample phenotype. There has been a surge of efforts to develop different methods for delineating the association. Aside from the high dimensionality of microarray data, one well-recognized challenge is that genes can be inter-related in complicated ways, which makes many statistical methods inappropriate for direct use on the expression data. Multivariate methods such as principal component analysis (PCA) and clustering are often used as part of the effort to capture the gene correlation, and the derived components or clusters are used to describe the association between gene expression and sample phenotype. We propose a method for patient population dichotomization using maximally selected test statistics in combination with the PCA method, which shows favorable results. The proposed method is compared with a currently well-recognized method.
A principal components analysis of Rorschach aggression and hostility variables.
Katko, Nicholas J; Meyer, Gregory J; Mihura, Joni L; Bombel, George
2010-11-01
We examined the structure of 9 Rorschach variables related to hostility and aggression (Aggressive Movement, Morbid, Primary Process Aggression, Secondary Process Aggression, Aggressive Content, Aggressive Past, Strong Hostility, Lesser Hostility) in a sample of medical students (N= 225) from the Johns Hopkins Precursors Study (The Johns Hopkins University, 1999). Principal components analysis revealed 2 dimensions accounting for 58% of the total variance. These dimensions extended previous findings for a 2-component model of Rorschach aggressive imagery that had been identified using just 5 or 6 marker variables (Baity & Hilsenroth, 1999; Liebman, Porcerelli, & Abell, 2005). In light of this evidence, we draw an empirical link between the historical research literature and current studies of Rorschach aggression and hostility that helps organize their findings. We also offer suggestions for condensing the array of aggression-related measures to simplify Rorschach aggression scoring.
Nonlinear Process Fault Diagnosis Based on Serial Principal Component Analysis.
Deng, Xiaogang; Tian, Xuemin; Chen, Sheng; Harris, Chris J
2016-12-22
Many industrial processes contain both linear and nonlinear parts, and kernel principal component analysis (KPCA), widely used in nonlinear process monitoring, may not offer the most effective means for dealing with these nonlinear processes. This paper proposes a new hybrid linear-nonlinear statistical modeling approach for nonlinear process monitoring by closely integrating linear principal component analysis (PCA) and nonlinear KPCA using a serial model structure, which we refer to as serial PCA (SPCA). Specifically, PCA is first applied to extract PCs as linear features, and to decompose the data into the PC subspace and residual subspace (RS). Then, KPCA is performed in the RS to extract the nonlinear PCs as nonlinear features. Two monitoring statistics are constructed for fault detection, based on both the linear and nonlinear features extracted by the proposed SPCA. To effectively perform fault identification after a fault is detected, an SPCA similarity factor method is built for fault recognition, which fuses both the linear and nonlinear features. Unlike PCA and KPCA, the proposed method takes into account both linear and nonlinear PCs simultaneously, and therefore, it can better exploit the underlying process's structure to enhance fault diagnosis performance. Two case studies involving a simulated nonlinear process and the benchmark Tennessee Eastman process demonstrate that the proposed SPCA approach is more effective than the existing state-of-the-art approach based on KPCA alone, in terms of nonlinear process fault detection and identification.
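The serial structure described above (linear PCA first, then kernel PCA on the residual subspace) can be sketched as follows. This is a minimal sketch under stated assumptions: an RBF kernel with a hypothetical width `gamma`, synthetic data, and hypothetical function names. The monitoring statistics and the similarity-factor method from the paper are not reproduced.

```python
import numpy as np

def rbf_kernel(A, gamma):
    """RBF (Gaussian) kernel matrix between the rows of A."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def serial_pca(X, n_linear, n_nonlinear, gamma):
    """Linear PCs from PCA, then nonlinear PCs from kernel PCA on the PCA residual."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_linear].T                  # loading matrix of the PC subspace
    T = Xc @ P                           # linear features (PC scores)
    E = Xc - T @ P.T                     # residual subspace (RS)

    K = rbf_kernel(E, gamma)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                       # center the kernel matrix in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_nonlinear]
    lam = np.clip(eigvals[idx], 0.0, None)
    N = eigvecs[:, idx] * np.sqrt(lam)   # kernel PC scores of the training samples
    return T, E, N

# Hypothetical process data: two linear directions plus a nonlinear dependency.
rng = np.random.default_rng(0)
u = rng.normal(size=(80, 2))
X = np.column_stack([u[:, 0], u[:, 1], u[:, 0] + u[:, 1], np.sin(2.0 * u[:, 0])])
X = X + 0.05 * rng.normal(size=X.shape)

T, E, N = serial_pca(X, n_linear=2, n_nonlinear=3, gamma=1.0)
print(T.shape, E.shape, N.shape)
```

Both `T` and `N` would then feed the fault detection statistics; how they are scaled and combined is a design choice the abstract does not specify.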
Principal components null space analysis for image and video classification.
Vaswani, Namrata; Chellappa, Rama
2006-07-01
We present a new classification algorithm, principal component null space analysis (PCNSA), which is designed for classification problems like object recognition where different classes have unequal and nonwhite noise covariance matrices. PCNSA first obtains a principal components subspace (PCA space) for the entire data. In this PCA space, it finds for each class "i" an Mi-dimensional subspace along which the class' intraclass variance is the smallest. We call this subspace an approximate null space (ANS) since the lowest variance is usually "much smaller" than the highest. A query is classified into class "i" if its distance from the class' mean in the class' ANS is a minimum. We derive upper bounds on classification error probability of PCNSA and use these expressions to compare classification performance of PCNSA with that of subspace linear discriminant analysis (SLDA). We propose a practical modification of PCNSA called progressive-PCNSA that also detects "new" (untrained) classes. Finally, we provide an experimental comparison of PCNSA and progressive-PCNSA with SLDA and PCA, and also with other classification algorithms (linear SVMs, kernel PCA, kernel discriminant analysis, and kernel SLDA), for object recognition and face recognition under large pose/expression variation. We also show applications of PCNSA to two classification problems in video: an action retrieval problem and abnormal activity detection.
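The three steps in the abstract above (global PCA space, per-class approximate null space, nearest class mean within each ANS) can be sketched as follows. This is a simplified reading of the abstract, not the authors' implementation: the function names, the synthetic two-class data, and the use of the same ANS dimension for every class are assumptions, and progressive-PCNSA is not covered.

```python
import numpy as np

def fit_pcnsa(X, y, n_pca, n_ans):
    """Fit a global PCA space, then a per-class approximate null space (ANS)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:n_pca].T                                   # global PCA projection
    Z = (X - mean) @ W
    classes = np.unique(y)
    params = []
    for c in classes:
        Zc = Z[y == c]
        mu = Zc.mean(axis=0)
        # eigh sorts eigenvalues in ascending order, so the first columns are
        # the directions of smallest intraclass variance.
        _, eigvecs = np.linalg.eigh(np.cov(Zc, rowvar=False))
        params.append((mu, eigvecs[:, :n_ans]))
    return mean, W, classes, params

def predict_pcnsa(X, mean, W, classes, params):
    """Assign each query to the class whose mean is nearest within that class's ANS."""
    Z = (X - mean) @ W
    dists = np.stack(
        [np.linalg.norm((Z - mu) @ N, axis=1) for mu, N in params], axis=1
    )
    return classes[np.argmin(dists, axis=1)]

# Hypothetical data: two classes with unequal, nonwhite covariances in 6 dimensions.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 6)) * np.array([5.0, 0.1, 1.0, 1.0, 1.0, 1.0])
X1 = rng.normal(size=(200, 6)) * np.array([0.1, 5.0, 1.0, 1.0, 1.0, 1.0])
X1 = X1 + np.array([4.0, 4.0, 0.0, 0.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 200)

model = fit_pcnsa(X, y, n_pca=3, n_ans=1)
accuracy = np.mean(predict_pcnsa(X, *model) == y)
print("training accuracy:", accuracy)
```

The example is built so that each class has one direction of very small variance, which is the situation the abstract describes as favourable for an ANS-based distance.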
Tensorial Kernel Principal Component Analysis for Action Recognition
Directory of Open Access Journals (Sweden)
Cong Liu
2013-01-01
We propose Tensorial Kernel Principal Component Analysis (TKPCA) for dimensionality reduction and feature extraction from tensor objects, which extends conventional Principal Component Analysis (PCA) in two respects: working directly with multidimensional data (tensors) in their native state, and generalizing an existing linear technique to its nonlinear version by applying the kernel trick. Our method aims to remedy the shortcomings of recently developed multilinear subspace learning (tensorial PCA) in modelling the nonlinear manifold of tensor objects, and brings together the desirable properties of kernel methods and tensor decompositions for significant performance gain when the data are multidimensional and nonlinear dependencies exist. Our approach begins by formulating TKPCA as an optimization problem. Then, we develop a kernel function based on the Grassmann manifold that can directly take tensorial representations as parameters instead of the traditional vectorized representation. Furthermore, a TKPCA-based tensor object recognition method is proposed for action recognition. Experiments with real action datasets show that the proposed method is insensitive to both noise and occlusion and performs well compared with state-of-the-art algorithms.
Using Principal Components as Auxiliary Variables in Missing Data Estimation.
Howard, Waylon J; Rhemtulla, Mijke; Little, Todd D
2015-01-01
To deal with missing data that arise due to participant nonresponse or attrition, methodologists have recommended an "inclusive" strategy where a large set of auxiliary variables are used to inform the missing data process. In practice, the set of possible auxiliary variables is often too large. We propose using principal components analysis (PCA) to reduce the number of possible auxiliary variables to a manageable number. A series of Monte Carlo simulations compared the performance of the inclusive strategy with eight auxiliary variables (inclusive approach) to the PCA strategy using just one principal component derived from the eight original variables (PCA approach). We examined the influence of four independent variables: magnitude of correlations, rate of missing data, missing data mechanism, and sample size on parameter bias, root mean squared error, and confidence interval coverage. Results indicate that the PCA approach results in unbiased parameter estimates and potentially more accuracy than the inclusive approach. We conclude that using the PCA strategy to reduce the number of auxiliary variables is an effective and practical way to reap the benefits of the inclusive strategy in the presence of many possible auxiliary variables.
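The PCA step of the strategy above, collapsing eight auxiliary variables into one principal component, can be sketched as follows. The data are synthetic and hypothetical, the auxiliary variables are assumed to be completely observed, and the function name is an assumption; the missing-data estimation that would then use the component is not shown.

```python
import numpy as np

def first_principal_component(aux):
    """First principal component of standardized auxiliary variables."""
    Z = (aux - aux.mean(axis=0)) / aux.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    weights = Vt[0]                              # unit-length component weights
    score = Z @ weights                          # one auxiliary score per participant
    explained = s[0] ** 2 / np.sum(s ** 2)       # share of total variance captured
    return score, weights, explained

# Hypothetical data: eight auxiliary variables sharing one common source of variation.
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
aux = 0.7 * common + 0.7 * rng.normal(size=(500, 8))

score, weights, explained = first_principal_component(aux)
print("variance explained by the first component:", round(float(explained), 3))
```

The single `score` column would replace the eight original variables as the auxiliary variable in whatever missing-data procedure is used.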
Direct Numerical Simulation of Combustion Using Principal Component Analysis
Owoyele, Opeoluwa; Echekki, Tarek
2016-11-01
We investigate the potential of accelerating chemistry integration during the direct numerical simulation (DNS) of complex fuels based on the transport equations of representative scalars that span the desired composition space using principal component analysis (PCA). The transported principal components (PCs) offer significant potential to reduce the computational cost of DNS through a reduction in the number of transported scalars, as well as the spatial and temporal resolution requirements. The strategy is demonstrated using DNS of a premixed methane-air flame in a 2D vortical flow and is extended to the 3D geometry to further demonstrate the computational efficiency of PC transport. The PCs are derived from a priori PCA of a subset of the full thermo-chemical scalars' vector. The PCs' chemical source terms and transport properties are constructed and tabulated in terms of the PCs using artificial neural networks (ANN). Comparison of DNS based on a full thermo-chemical state and DNS based on PC transport based on 6 PCs shows excellent agreement even for species that are not included in the PCA reduction. The transported PCs reproduce some of the salient features of strongly curved and strongly strained flames. The 2D DNS results also show a significant reduction of two orders of magnitude in the computational cost of the simulations, which enables an extension of the PCA approach to 3D DNS under similar computational requirements. This work was supported by the National Science Foundation Grant DMS-1217200.
Acceleration of dynamic fluorescence molecular tomography with principal component analysis.
Zhang, Guanglei; He, Wei; Pu, Huangsheng; Liu, Fei; Chen, Maomao; Bai, Jing; Luo, Jianwen
2015-06-01
Dynamic fluorescence molecular tomography (FMT) is an attractive imaging technique for three-dimensionally resolving the metabolic process of fluorescent biomarkers in small animals. When combined with compartmental modeling, dynamic FMT can be used to obtain parametric images which can provide quantitative pharmacokinetic information for drug development and metabolic research. However, the computational burden of dynamic FMT is very heavy because of the large data sets arising from the long measurement process and the dense sampling of the device. In this work, we propose to accelerate the reconstruction process of dynamic FMT based on principal component analysis (PCA). Taking advantage of the compression property of PCA, the dimension of the sub weight matrix used for solving the inverse problem is reduced by retaining only a few principal components, which preserve most of the effective information of the sub weight matrix. Therefore, the reconstruction process of dynamic FMT can be accelerated by solving a smaller-scale inverse problem. Numerical simulation and a mouse experiment are performed to validate the performance of the proposed method. Results show that the proposed method can greatly accelerate the reconstruction of parametric images in dynamic FMT with almost no degradation in image quality.
Principal component and factor analytic models in international sire evaluation
Directory of Open Access Journals (Sweden)
Jakobsen Jette
2011-09-01
Background: Interbull is a non-profit organization that provides internationally comparable breeding values for globalized dairy cattle breeding programmes. Due to different trait definitions and models for genetic evaluation between countries, each biological trait is treated as a different trait in each of the participating countries. This yields a genetic covariance matrix of dimension equal to the number of countries, which typically involves high genetic correlations between countries. This gives rise to several problems, such as over-parameterized models and increased sampling variances, if genetic (co)variance matrices are considered to be unstructured. Methods: Principal component (PC) and factor analytic (FA) models allow highly parsimonious representations of the (co)variance matrix compared to the standard multi-trait model and have, therefore, attracted considerable interest for their potential to ease the burden of the estimation process for multiple-trait across country evaluation (MACE). This study evaluated the utility of PC and FA models to estimate variance components and to predict breeding values for MACE for protein yield. This was tested using a dataset comprising Holstein bull evaluations obtained in 2007 from 25 countries. Results: In total, 19 principal components or nine factors were needed to explain the genetic variation in the test dataset. Estimates of the genetic parameters under the optimal fit were almost identical for the two approaches. Furthermore, the results were in good agreement with those obtained from the full rank model and with those provided by Interbull. The estimation time was shortest for models fitting the optimal number of parameters and prolonged when under- or over-parameterized models were applied. Correlations between estimated breeding values (EBV) from the PC19 and PC25 models were unity. With few exceptions, correlations between EBV obtained using FA and PC approaches under the optimal fit were
Principal component analysis of FDG PET in amnestic MCI
Energy Technology Data Exchange (ETDEWEB)
Nobili, Flavio; Girtler, Nicola; Brugnolo, Andrea; Dessi, Barbara; Rodriguez, Guido [University of Genoa, Clinical Neurophysiology, Department of Endocrinological and Medical Sciences, Genoa (Italy); S. Martino Hospital, Alzheimer Evaluation Unit, Genoa (Italy); S. Martino Hospital, Head-Neck Department, Genoa (Italy); Salmaso, Dario [CNR, Institute of Cognitive Sciences and Technologies, Rome (Italy); CNR, Institute of Cognitive Sciences and Technologies, Padua (Italy); Morbelli, Silvia [University of Genoa, Nuclear Medicine Unit, Department of Internal Medicine, Genoa (Italy); Piccardo, Arnoldo [Galliera Hospital, Nuclear Medicine Unit, Department of Imaging Diagnostics, Genoa (Italy); Larsson, Stig A. [Karolinska Hospital, Department of Nuclear Medicine, Stockholm (Sweden); Pagani, Marco [CNR, Institute of Cognitive Sciences and Technologies, Rome (Italy); CNR, Institute of Cognitive Sciences and Technologies, Padua (Italy); Karolinska Hospital, Department of Nuclear Medicine, Stockholm (Sweden)
2008-12-15
The purpose of the study is to evaluate the combined accuracy of episodic memory performance and $^{18}$F-FDG PET in identifying patients with amnestic mild cognitive impairment (aMCI) converting to Alzheimer's disease (AD), aMCI non-converters, and controls. Thirty-three patients with aMCI and 15 controls (CTR) were followed up for a mean of 21 months. Eleven patients developed AD (MCI/AD) and 22 remained with aMCI (MCI/MCI). $^{18}$F-FDG PET volumetric regions of interest underwent principal component analysis (PCA) that identified 12 principal components (PC), expressed by coarse component scores (CCS). Discriminant analysis was performed using the significant PCs and episodic memory scores. PCA highlighted relative hypometabolism in PC5, including bilateral posterior cingulate and left temporal pole, and in PC7, including the bilateral orbitofrontal cortex, both in MCI/MCI and MCI/AD vs CTR. PC5 itself plus PC12, including the left lateral frontal cortex (LFC: BAs 44, 45, 46, 47), were significantly different between MCI/AD and MCI/MCI. By a three-group discriminant analysis, CTR were more accurately identified by PET-CCS + delayed recall score (100%), MCI/MCI by PET-CCS + either immediate or delayed recall scores (91%), while MCI/AD was identified by PET-CCS alone (82%). PET increased by 25% the correct allocations achieved by memory scores, while memory scores increased by 15% the correct allocations achieved by PET. Combining memory performance and $^{18}$F-FDG PET yielded a higher accuracy than each single tool in identifying CTR and MCI/MCI. The PC containing bilateral posterior cingulate and left temporal pole was the hallmark of MCI/MCI patients, while the PC including the left LFC was the hallmark of conversion to AD.
Principal semantic components of language and the measurement of meaning.
Directory of Open Access Journals (Sweden)
Alexei V Samsonovich
Metric systems for semantics, or semantic cognitive maps, are allocations of words or other representations in a metric space based on their meaning. Existing methods for semantic mapping, such as Latent Semantic Analysis and Latent Dirichlet Allocation, are based on paradigms involving dissimilarity metrics. They typically do not take into account relations of antonymy and yield a large number of domain-specific semantic dimensions. Here, using a novel self-organization approach, we construct a low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy. Emergent semantics of the map principal components are clearly identifiable: the first three correspond to the meanings of "good/bad" (valence), "calm/excited" (arousal), and "open/closed" (freedom), respectively. The semantic map is sufficiently robust to allow the automated extraction of synonyms and antonyms not originally in the dictionaries used to construct the map and to predict connotation from their coordinates. The map geometric characteristics include a limited number (approximately 4) of statistically significant dimensions, a bimodal distribution of the first component, increasing kurtosis of subsequent (unimodal) components, and a U-shaped maximum-spread planar projection. Both the semantic content and the main geometric features of the map are consistent between dictionaries (Microsoft Word and Princeton's WordNet), among Western languages (English, French, German, and Spanish), and with previously established psychometric measures. By defining the semantics of its dimensions, the constructed map provides a foundational metric system for the quantitative analysis of word meaning. Language can be viewed as a cumulative product of human experiences. Therefore, the extracted principal semantic dimensions may be useful to characterize the general semantic dimensions of the content of mental states. This is a fundamental step
Principal Components Analysis of Triaxial Vibration Data From Helicopter Transmissions
Tumer, Irem Y.; Huff, Edward M.
2001-01-01
Research on the nature of the vibration data collected from helicopter transmissions during flight experiments has led to several crucial observations believed to be responsible for the high rates of false alarms and missed detections in aircraft vibration monitoring systems. This work focuses on one such finding, namely, the need to consider additional sources of information about system vibrations. In this light, helicopter transmission vibration data, collected using triaxial accelerometers, were explored in three different directions, analyzed for content, and then combined using Principal Components Analysis (PCA) to analyze changes in directionality. In this paper, the PCA transformation is applied to 176 test conditions/data sets collected from an OH58C helicopter to derive the overall experiment-wide covariance matrix and its principal eigenvectors. The experiment-wide eigenvectors are then projected onto the individual test conditions to evaluate changes and similarities in their directionality based on the various experimental factors. The paper will present the foundations of the proposed approach, addressing the question of whether experiment-wide eigenvectors accurately model the vibration modes in individual test conditions. The results will further determine the value of using directionality and triaxial accelerometers for vibration monitoring and anomaly detection.
Nonlinear fault diagnosis method based on kernel principal component analysis
Institute of Scientific and Technical Information of China (English)
Yan Weiwu; Zhang Chunkai; Shao Huihe
2005-01-01
Fault detection and diagnosis play an important role in keeping an industrial process in working order. This paper proposes a nonlinear fault diagnosis method based on kernel principal component analysis (KPCA). In the proposed method, the essential information of the nonlinear system extracted by KPCA is used to construct a KPCA model of the system under normal working conditions. New data are then projected onto the KPCA model. When new data are incompatible with the KPCA model, it can be concluded that the nonlinear system is out of its normal working condition. The proposed method was applied to fault diagnosis on rolling bearings. Simulation results show that the proposed method is effective for fault detection and diagnosis of nonlinear systems.
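The monitoring idea in this abstract (fit a KPCA model on normal-operation data, project new data onto it, flag data that do not fit) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the squared prediction error (SPE) in feature space as the incompatibility measure, the class name `KernelPCAMonitor`, the toy parabola data and all parameter values are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

class KernelPCAMonitor:
    """Hypothetical helper: KPCA model of normal data plus an SPE score."""

    def __init__(self, train, n_components, gamma):
        self.train, self.gamma = train, gamma
        k = rbf_kernel(train, train, gamma)
        self.col_mean = k.mean(axis=0)
        self.all_mean = k.mean()
        # Center the kernel matrix, i.e. center the data in feature space.
        k_centered = (k - self.col_mean[None, :]
                      - self.col_mean[:, None] + self.all_mean)
        eigvals, eigvecs = np.linalg.eigh(k_centered)
        top = np.argsort(eigvals)[::-1][:n_components]
        # Scale so that each retained feature-space direction has unit norm.
        self.alphas = eigvecs[:, top] / np.sqrt(eigvals[top])

    def spe(self, x):
        """Squared prediction error of one sample in feature space."""
        k_x = rbf_kernel(x[None, :], self.train, self.gamma)[0]
        k_x_centered = k_x - k_x.mean() - self.col_mean + self.all_mean
        scores = self.alphas.T @ k_x_centered
        # For the RBF kernel k(x, x) = 1, so this is the squared norm of
        # the centered feature vector of x.
        centered_norm_sq = 1.0 - 2.0 * k_x.mean() + self.all_mean
        return float(centered_norm_sq - scores @ scores)

# Toy "normal operation" data lying close to a parabola (assumed example).
rng = np.random.default_rng(2)
x1 = rng.uniform(-1.0, 1.0, size=100)
train = np.column_stack([x1, x1 ** 2 + rng.normal(0.0, 0.01, size=100)])

monitor = KernelPCAMonitor(train, n_components=8, gamma=1.0)
spe_normal = monitor.spe(np.array([0.5, 0.25]))  # lies on the parabola
spe_fault = monitor.spe(np.array([0.0, 3.0]))    # far from the normal data
```

In practice a detection threshold on the score would be chosen from the distribution of SPE values on held-out normal data; the abstract does not say which statistic or threshold the authors used.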
NONLINEAR DATA RECONCILIATION METHOD BASED ON KERNEL PRINCIPAL COMPONENT ANALYSIS
Institute of Scientific and Technical Information of China (English)
(No author listed)
2003-01-01
In industrial processes, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuitable for nonlinear feature analysis and is limited in its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA and can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function, and PCA is implemented in the feature space. Then nonlinear feature analysis is implemented and the data are reconstructed by using the kernel. The data reconciliation method based on KPCA is applied to a ternary distillation column. Simulation results show that this method can filter the noise in measurements of a nonlinear process and that the reconciled data can represent the true information of the nonlinear process.
Support vector classifier based on principal component analysis
Institute of Scientific and Technical Information of China (English)
(No author listed)
2008-01-01
Support vector classifier (SVC) has advantages for small-sample learning problems with high dimensions, in particular better generalization ability. However, there is some redundancy among the high dimensions of the original samples, and the main features of the samples may be extracted first to improve the performance of SVC. Principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and to pre-select the main features efficiently, and an SVC is constructed in the selected feature space to improve the learning speed and identification rate of SVC. Furthermore, a heuristic genetic algorithm-based automatic model selection is proposed to determine the hyperparameters of SVC and to evaluate the performance of the learning machines. Experiments performed on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically, but also improves the identification rates effectively.
Recursive Principal Components Analysis Using Eigenvector Matrix Perturbation
Directory of Open Access Journals (Sweden)
Deniz Erdogmus
2004-10-01
Principal components analysis is an important and well-studied subject in statistics and signal processing. The literature has an abundance of algorithms for solving this problem, where most of these algorithms could be grouped into one of the following three approaches: adaptation based on Hebbian updates and deflation, optimization of a second-order statistical criterion (like reconstruction error or output variance), and fixed point update rules with deflation. In this paper, we take a completely different approach that avoids deflation and the optimization of a cost function using gradients. The proposed method updates the eigenvector and eigenvalue matrices simultaneously with every new sample such that the estimates approximately track their true values as would be calculated from the current sample estimate of the data covariance matrix. The performance of this algorithm is compared with that of traditional methods like Sanger's rule and APEX, as well as a structurally similar matrix perturbation-based method.
Principal Component Analysis with Contaminated Data: The High Dimensional Case
Xu, Huan; Mannor, Shie
2010-01-01
We consider the dimensionality-reduction problem (finding a subspace approximation of observed data) for contaminated data in the high dimensional regime, where the number of observations is of the same magnitude as the number of variables of each observation, and the data set contains some (arbitrarily) corrupted observations. We propose a High-dimensional Robust Principal Component Analysis (HR-PCA) algorithm that is tractable, robust to contaminated points, and easily kernelizable. The resulting subspace has a bounded deviation from the desired one, achieves maximal robustness -- a breakdown point of 50% while all existing algorithms have a breakdown point of zero, and unlike ordinary PCA algorithms, achieves optimality in the limit case where the proportion of corrupted points goes to zero.
Method of Real-Time Principal-Component Analysis
Duong, Tuan; Duong, Vu
2005-01-01
Dominant-element-based gradient descent and dynamic initial learning rate (DOGEDYN) is a method of sequential principal-component analysis (PCA) that is well suited for such applications as data compression and extraction of features from sets of data. In comparison with a prior method of gradient-descent-based sequential PCA, this method offers a greater rate of learning convergence. Like the prior method, DOGEDYN can be implemented in software. However, the main advantage of DOGEDYN over the prior method lies in the facts that it requires less computation and can be implemented in simpler hardware. It should be possible to implement DOGEDYN in compact, low-power, very-large-scale integrated (VLSI) circuitry that could process data in real time.
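For readers unfamiliar with sequential (sample-by-sample) PCA, the sketch below shows the general flavor of such methods using Oja's rule, a classic Hebbian update for the leading principal direction. It is explicitly not DOGEDYN: the abstract gives no update equations, so Oja's rule is swapped in as a stand-in, and the function name, learning rate and toy data are assumptions.

```python
import numpy as np

def oja_first_component(samples, learning_rate=0.002, w0=(1.0, 0.0)):
    """Estimate the leading principal direction one sample at a time."""
    w = np.asarray(w0, dtype=float)
    w = w / np.linalg.norm(w)
    for x in samples:
        y = w @ x  # projection of the sample onto the current estimate
        # Hebbian step with a decay term that keeps the norm of w near 1.
        w += learning_rate * y * (x - y * w)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(3)
angle = np.pi / 6
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle), np.cos(angle)]])
# Zero-mean toy data with standard deviations 3 and 1 along rotated axes.
samples = (rng.normal(size=(5000, 2)) * np.array([3.0, 1.0])) @ rotation.T

w = oja_first_component(samples)
true_direction = rotation[:, 0]        # axis with the larger variance
alignment = abs(w @ true_direction)    # close to 1 when the estimate has converged
```

Each update touches only the current sample and the current estimate, which is what makes this family of methods attractive for streaming data and simple hardware.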
Correlation and principal component analysis in ceramic tiles characterization
Directory of Open Access Journals (Sweden)
Podunavac-Kuzmanović Sanja O.
2015-01-01
The present study deals with the analysis of the characteristics of ceramic wall and floor tiles on the basis of their quality parameters: breaking force, flexural strength, absorption and shrinking. Principal component analysis was applied in order to detect potential similarities and dissimilarities among the analyzed tile samples, as well as the firing regimes. Correlation analysis was applied in order to find correlations among the studied quality parameters of the tiles. The obtained results indicate particular differences between the samples on the basis of the firing regimes. However, the correlation analysis points out that there is no statistically significant correlation among the quality parameters of the studied samples of the wall and floor ceramic tiles. [Project of the Ministry of Science of the Republic of Serbia, No. 172012 and No. III 45008
Suppressing Background Radiation Using Poisson Principal Component Analysis
Tandon, P; Dubrawski, A; Labov, S; Nelson, K
2016-01-01
Performance of nuclear threat detection systems based on gamma-ray spectrometry often strongly depends on the ability to identify the part of measured signal that can be attributed to background radiation. We have successfully applied a method based on Principal Component Analysis (PCA) to obtain a compact null-space model of background spectra using PCA projection residuals to derive a source detection score. We have shown the method's utility in a threat detection system using mobile spectrometers in urban scenes (Tandon et al 2012). While it is commonly assumed that measured photon counts follow a Poisson process, standard PCA makes a Gaussian assumption about the data distribution, which may be a poor approximation when photon counts are low. This paper studies whether and in what conditions PCA with a Poisson-based loss function (Poisson PCA) can outperform standard Gaussian PCA in modeling background radiation to enable more sensitive and specific nuclear threat detection.
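The residual-based scoring described in this abstract can be illustrated with standard (Gaussian) PCA: fit a low-dimensional basis to background spectra, then score a new spectrum by the norm of the part the basis cannot explain. The sketch below is a generic illustration under stated assumptions, not the authors' code; the synthetic spectra, the two background shapes, the injected peak and the function names are all hypothetical, and the Poisson-loss variant studied in the paper is not implemented here.

```python
import numpy as np

def fit_background_pca(spectra, n_components):
    """Fit a PCA basis to background spectra (rows are measurements)."""
    mean = spectra.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def residual_score(spectrum, mean, components):
    """Norm of the part of a spectrum that the background basis cannot explain."""
    centered = spectrum - mean
    projected = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - projected))

rng = np.random.default_rng(0)
channels = 32
grid = np.linspace(0.0, 1.0, channels)
# Hypothetical background: random mixtures of two smooth shapes plus noise.
shapes = np.stack([np.exp(-3.0 * grid), np.exp(-1.0 * grid)])
background = rng.uniform(50.0, 100.0, size=(200, 2)) @ shapes
background += rng.normal(0.0, 0.1, size=background.shape)

mean, components = fit_background_pca(background, n_components=2)

clean = np.array([70.0, 80.0]) @ shapes   # pure background mixture
peak = np.zeros(channels)
peak[20] = 5.0                            # injected narrow "source" line
with_source = clean + peak

score_clean = residual_score(clean, mean, components)
score_source = residual_score(with_source, mean, components)
```

The spectrum with the injected peak receives the larger score because the peak lies mostly outside the subspace spanned by the background components.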
Spatial control of groundwater contamination, using principal component analysis
Indian Academy of Sciences (India)
N Subba Rao
2014-06-01
A study on the geochemistry of groundwater was carried out in a river basin of Andhra Pradesh to probe into the spatial controlling processes of groundwater contamination, using principal component analysis (PCA). The PCA transforms the chemical variables, pH, EC, Ca$^{2+}$, Mg$^{2+}$, Na$^{+}$, K$^{+}$, HCO$_{3}^{-}$, Cl$^{-}$, SO$_{4}^{2-}$, NO$_{3}^{-}$ and F$^{-}$, into two orthogonal principal components (PC1 and PC2), accounting for 75% of the total variance of the data matrix. PC1 has high positive loadings of EC, Na$^{+}$, Cl$^{-}$, SO$_{4}^{2-}$, Mg$^{2+}$ and Ca$^{2+}$, representing a salinity controlled process of geogenic (mineral dissolution, ion exchange, and evaporation), anthropogenic (agricultural activities and domestic wastewaters), and marine (marine clay) origin. The PC2 loadings are highly positive for HCO$_{3}^{-}$, F$^{-}$, pH and NO$_{3}^{-}$, attributing to the alkalinity and pollution controlled processes of geogenic and anthropogenic origins. The PC scores reflect the change of groundwater quality of geogenic origin from upstream to downstream area with an increase in concentration of chemical variables, which is due to anthropogenic and marine origins with varying topography, soil type, depth of water levels, and water usage. Thus, the groundwater quality shows a variation of chemical facies from Na$^{+}$ > Ca$^{2+}$ > Mg$^{2+}$ > K$^{+}$: HCO$_{3}^{-}$ > Cl$^{-}$ > SO$_{4}^{2-}$ > NO$_{3}^{-}$ > F$^{-}$ at high topography to Na$^{+}$ > Mg$^{2+}$ > Ca$^{2+}$ > K$^{+}$: Cl$^{-}$ > HCO$_{3}^{-}$ > SO$_{4}^{2-}$ > NO$_{3}^{-}$ > F$^{-}$ at low topography. Using PCA as an effective tool for identifying the spatial controlling processes of groundwater contamination, a subset of the explored wells is indexed for continuous monitoring to optimize the expensive monitoring effort.
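The "75% of the total variance" figure can be read through the usual explained-variance formula. The block below is a sketch that assumes PCA was run on the correlation matrix of the $p = 11$ standardized variables (a common choice for mixed-unit hydrochemical data, but not stated in the abstract); the symbols $\lambda_i$, $v_{ij}$ and $\ell_{ij}$ are introduced here for illustration only.

```latex
% Eigenvalues of the correlation matrix, sorted in descending order:
%   \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p, \qquad \sum_{i=1}^{p} \lambda_i = p = 11 .
\[
  \text{explained variance of the first } m \text{ PCs}
  \;=\; \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i},
  \qquad
  \frac{\lambda_1 + \lambda_2}{11} \approx 0.75
  \;\Longrightarrow\;
  \lambda_1 + \lambda_2 \approx 8.25 .
\]
% Loading of variable i on component j (v_{ij} is the i-th entry of the
% j-th unit eigenvector), i.e. the correlation between variable i and PC j:
\[
  \ell_{ij} \;=\; v_{ij}\,\sqrt{\lambda_j} .
\]
```

Under that assumption, a "high positive loading" of a variable such as EC on PC1 means that the variable is strongly and positively correlated with the PC1 scores.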
Authentication Scheme Based on Principal Component Analysis for Satellite Images
Directory of Open Access Journals (Sweden)
Ashraf. K. Helmy
2009-09-01
This paper presents a multi-band wavelet image content authentication scheme for satellite images that incorporates principal component analysis (PCA). The proposed scheme achieves higher perceptual transparency and stronger robustness. Specifically, the developed watermarking scheme can successfully resist common signal processing such as JPEG compression and geometric distortions such as cropping. In addition, the proposed scheme can be parameterized, thus resulting in more security. That is, an attacker may not be able to extract the embedded watermark if the attacker does not know the parameter. In order to meet these requirements, the host image is transformed to YIQ to decrease the correlation between different bands. Then the multi-band wavelet transform (M-WT) is applied to each channel separately, yielding one approximation sub-band and fifteen detail sub-bands. PCA is then applied to the coefficients corresponding to the same spatial location in all detail sub-bands. The last principal component band represents an excellent domain for inserting the watermark, since it represents the least correlated features in the high-frequency area of the host image. One of the most important aspects of satellite images is the spectral signature, the behavior of different features in different spectral bands. The results of the proposed algorithm show that the spectral signature of different features is not tainted after inserting the watermark.
Principal Component Analysis for pattern recognition in volcano seismic spectra
Unglert, Katharina; Jellinek, A. Mark
2016-04-01
Variations in the spectral content of volcano seismicity can relate to changes in volcanic activity. Low-frequency seismic signals often precede or accompany volcanic eruptions. However, they are commonly manually identified in spectra or spectrograms, and their definition in spectral space differs from one volcanic setting to the next. Increasingly long time series of monitoring data at volcano observatories require automated tools to facilitate rapid processing and aid with pattern identification related to impending eruptions. Furthermore, knowledge transfer between volcanic settings is difficult if the methods to identify and analyze the characteristics of seismic signals differ. To address these challenges we have developed a pattern recognition technique based on a combination of Principal Component Analysis and hierarchical clustering applied to volcano seismic spectra. This technique can be used to characterize the dominant spectral components of volcano seismicity without the need for any a priori knowledge of different signal classes. Preliminary results from applying our method to volcanic tremor from a range of volcanoes including Kīlauea, Okmok, Pavlof, and Redoubt suggest that spectral patterns from Kīlauea and Okmok are similar, whereas at Pavlof and Redoubt spectra have their own, distinct patterns.
Demixed principal component analysis of neural population data
Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K
2016-01-01
Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure. DOI: http://dx.doi.org/10.7554/eLife.10989.001 PMID:27067378
Derivation of Boundary Manikins: A Principal Component Analysis
Young, Karen; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar
2008-01-01
When designing any human-system interface, it is critical to provide realistic anthropometry to properly represent how a person fits within a given space. This study aimed to identify a minimum number of boundary manikins, or representative models of subjects' anthropometry from a target population, which would realistically represent the population. The boundary manikin anthropometry was derived using Principal Component Analysis (PCA). PCA is a statistical approach to reduce a multi-dimensional dataset using eigenvectors and eigenvalues. The measurements used in the PCA were identified as those measurements critical for suit and cockpit design. The PCA yielded a total of 26 manikins per gender, as well as their anthropometry from the target population. Reduction techniques were implemented to reduce this number further, with a final result of 20 female and 22 male subjects. The anthropometry of the boundary manikins was then used to create 3D digital models (to be discussed in subsequent papers) intended for use by designers to test components of their space suit design, to verify that the requirements specified in the Human Systems Integration Requirements (HSIR) document are met. The end goal is to allow designers to generate suits which accommodate the diverse anthropometry of the user population.
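The statement that PCA reduces a multi-dimensional dataset "using eigenvectors and eigenvalues" can be made concrete with a few lines of NumPy. This is a generic sketch of covariance-based PCA, not the study's procedure; the three body measurements, their distributions and the function name `pca_eig` are hypothetical.

```python
import numpy as np

def pca_eig(data):
    """Return eigenvalues (descending), eigenvectors (columns) and scores."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, centered @ eigvecs

# Hypothetical measurements in cm: stature, sitting height, shoulder breadth.
rng = np.random.default_rng(1)
stature = rng.normal(170.0, 8.0, size=300)
data = np.column_stack([
    stature,
    0.52 * stature + rng.normal(0.0, 1.5, size=300),
    0.25 * stature + rng.normal(0.0, 2.0, size=300),
])

eigvals, eigvecs, scores = pca_eig(data)
explained = eigvals / eigvals.sum()   # share of total variance per component
```

Because the three measurements are strongly correlated in this toy data, most of the variance falls on the first component, which is what allows a population to be summarized by a small number of representative cases placed in the space of the leading components.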
Research on Rural Consumer Demand in Hebei Province Based on Principal Component Analysis
MA Hui-zi; Zhao, Bang-hong; Xuan, Yong-sheng
2011-01-01
By selecting the time series data concerning influencing factors of rural consumer demand in Hebei Province from 2000 to 2010, this paper uses the principal component analysis method in multiplex econometric statistical analysis, constructs the principal component of consumer demand in Hebei Province, conducts regression on the dependent variable of consumer spending per capita in Hebei Province and the principal component of consumer demand so as to get principal component regression, and t...
A Principal Component Analysis of the Diffuse Interstellar Bands
Ensor, T.; Cami, J.; Bhatt, N. H.; Soddu, A.
2017-02-01
We present a principal component (PC) analysis of 23 line-of-sight parameters (including the strengths of 16 diffuse interstellar bands, DIBs) for a well-chosen sample of single-cloud sightlines representing a broad range of environmental conditions. Our analysis indicates that the majority (~93%) of the variations in the measurements can be captured by only four parameters. The main driver (i.e., the first PC) is the amount of DIB-producing material in the line of sight, a quantity that is extremely well traced by the equivalent width of the λ5797 DIB. The second PC is the amount of UV radiation, which correlates well with the λ5797/λ5780 DIB strength ratio. The remaining two PCs are more difficult to interpret, but are likely related to the properties of dust in the line of sight (e.g., the gas-to-dust ratio). With our PCA results, the DIBs can then be used to estimate these line-of-sight parameters.
Principal Component Analysis of Process Datasets with Missing Values
Directory of Open Access Journals (Sweden)
Kristen A. Severson
2017-07-01
Datasets with missing values arising from causes such as sensor failure, inconsistent sampling rates, and merging data from different systems are common in the process industry. Methods for handling missing data typically operate during data pre-processing, but can also occur during model building. This article considers missing data within the context of principal component analysis (PCA), which is a method originally developed for complete data that has widespread industrial application in multivariate statistical process control. Due to the prevalence of missing data and the success of PCA for handling complete data, several PCA algorithms that can act on incomplete data have been proposed. Here, algorithms for applying PCA to datasets with missing values are reviewed. A case study is presented to demonstrate the performance of the algorithms and suggestions are made with respect to choosing which algorithm is most appropriate for particular settings. An alternating algorithm based on the singular value decomposition achieved the best results in the majority of test cases involving process datasets.
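One generic way to alternate between a singular value decomposition and re-imputation is sketched below: fill the gaps with column means, fit a low-rank model by SVD, overwrite only the missing entries with the model's predictions, and repeat. This is an illustrative scheme under stated assumptions and may differ in detail from the alternating algorithm evaluated in the article; the function name `pca_impute`, the rank-1 toy data and the missingness pattern are hypothetical.

```python
import numpy as np

def pca_impute(data, rank, n_iter=200, tol=1e-9):
    """Fill NaNs by alternating a rank-`rank` SVD fit with re-imputation."""
    missing = np.isnan(data)
    # Start from column means computed over the observed entries only.
    filled = np.where(missing, np.nanmean(data, axis=0), data)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank] + mean
        # Observed entries are never changed; only the gaps are updated.
        updated = np.where(missing, low_rank, data)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    return filled

# Toy process data: four sensors driven by one latent variable (assumed).
t = np.linspace(-1.0, 1.0, 50)
truth = np.outer(t, [1.0, 2.0, -1.0, 0.5]) + np.array([10.0, 20.0, 30.0, 40.0])
data = truth.copy()
for row in range(0, 50, 7):            # knock out one entry in every 7th row
    data[row, (row // 7) % 4] = np.nan

missing = np.isnan(data)
imputed = pca_impute(data, rank=1)
mean_filled = np.where(missing, np.nanmean(data, axis=0), data)

err_pca = np.max(np.abs(imputed - truth))
err_mean = np.max(np.abs(mean_filled - truth))
```

On this noise-free rank-1 example the iterative estimate recovers the missing values much more closely than plain mean imputation, because it exploits the correlation between sensors.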
Principal Component Analysis studies of turbulence in optically thick gas
Correia, Caio; Burkhart, Blakesley; Pogosyan, Dmitri; De Medeiros, José Renan
2015-01-01
In this work we investigate the Principal Component Analysis (PCA) sensitivity to the velocity power spectrum in high opacity regimes of the interstellar medium (ISM). For our analysis we use synthetic Position-Position-Velocity (PPV) cubes of fractional Brownian motion (fBm) and magnetohydrodynamics (MHD) simulations, post processed to include radiative transfer effects from CO. We find that PCA analysis is very different from the tools based on the traditional power spectrum of PPV data cubes. Our major finding is that PCA is also sensitive to the phase information of PPV cubes and this allows PCA to detect the changes of the underlying velocity and density spectra at high opacities, where the spectral analysis of the maps provides the universal -3 spectrum in accordance with the predictions of the Lazarian & Pogosyan (2004) theory. This makes PCA potentially a valuable tool for studies of turbulence at high opacities provided that the proper gauging of the PCA index is made. The latter, however, we found t...
Comparison of analytical eddy current models using principal components analysis
Contant, S.; Luloff, M.; Morelli, J.; Krause, T. W.
2017-02-01
Monitoring the gap between the pressure tube (PT) and the calandria tube (CT) in CANDU® fuel channels is essential, as contact between the two tubes can lead to delayed hydride cracking of the pressure tube. Multifrequency transmit-receive eddy current non-destructive evaluation is used to determine this gap, as this method has different depths of penetration and variable sensitivity to noise, unlike single frequency eddy current non-destructive evaluation. An analytical model based on the Dodd and Deeds solutions, and a second model that accounts for normal and lossy self-inductances and a non-coaxial pickup coil, are examined for representing the response of an eddy current transmit-receive probe when considering factors that affect the gap response, such as pressure tube wall thickness and pressure tube resistivity. The multifrequency model data was analyzed using principal components analysis (PCA), a statistical method used to reduce the data set into a data set of fewer variables. The results of the PCA of the analytical models were then compared to PCA performed on a previously obtained experimental data set. The models gave similar results under variable PT wall thickness conditions, but the non-coaxial coil model, which accounts for self-inductive losses, performed significantly better than the Dodd and Deeds model under variable resistivity conditions.
A principal component analysis of 39 scientific impact measures
Bollen, Johan; Hagberg, Aric; Chute, Ryan
2009-01-01
The impact of scientific publications has traditionally been expressed in terms of citation counts. However, scientific activity has moved online over the past decade. To better capture scientific impact in the digital era, a variety of new impact measures has been proposed on the basis of social network analysis and usage log data. Here we investigate how these new measures relate to each other, and how accurately and completely they express scientific impact. We performed a principal component analysis of the rankings produced by 39 existing and proposed measures of scholarly impact that were calculated on the basis of both citation and usage log data. Our results indicate that the notion of scientific impact is a multi-dimensional construct that cannot be adequately measured by any single indicator, although some measures are more suitable than others. The commonly used citation Impact Factor is not positioned at the core of this construct, but at its periphery, and should thus be used with caution.
Transfer Learning via Multi-View Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
Yang-Sheng Ji; Jia-Jun Chen; Gang Niu; Lin Shang; Xin-Yu Dai
2011-01-01
Transfer learning aims at leveraging the knowledge in labeled source domains to predict the unlabeled data in a target domain, where the distributions differ between domains. Among various methods for transfer learning, one class of algorithms focuses on the correspondence between bridge features and all the other specific features from different domains, and then conducts transfer learning via this single-view correspondence. However, the single-view correspondence may prevent these algorithms from further improvement due to the problem of incorrect correlation discovery. To tackle this problem, we propose a new method for transfer learning from a multi-view correspondence perspective, which is called the Multi-View Principal Component Analysis (MVPCA) approach. MVPCA discovers the correspondence between bridge features representative across all domains and specific features from different domains respectively, and conducts the transfer learning by dimensionality reduction in a multi-view way, which can better depict the knowledge transfer. Experiments show that MVPCA can significantly reduce the cross domain prediction error of a baseline non-transfer method. With multi-view correspondence information incorporated into the single-view transfer learning method, MVPCA can further improve the performance of one state-of-the-art single-view method.
PRINCIPAL COMPONENT ANALYSIS STUDIES OF TURBULENCE IN OPTICALLY THICK GAS
Energy Technology Data Exchange (ETDEWEB)
Correia, C.; Medeiros, J. R. De [Departamento de Física Teórica e Experimental, Universidade Federal do Rio Grande do Norte, 59072-970, Natal (Brazil); Lazarian, A. [Astronomy Department, University of Wisconsin, Madison, 475 N. Charter St., WI 53711 (United States); Burkhart, B. [Harvard-Smithsonian Center for Astrophysics, 60 Garden St, MS-20, Cambridge, MA 02138 (United States); Pogosyan, D., E-mail: caioftc@dfte.ufrn.br [Canadian Institute for Theoretical Astrophysics, University of Toronto, Toronto, ON (Canada)
2016-02-20
In this work we investigate the sensitivity of principal component analysis (PCA) to the velocity power spectrum in high-opacity regimes of the interstellar medium (ISM). For our analysis we use synthetic position–position–velocity (PPV) cubes of fractional Brownian motion and magnetohydrodynamics (MHD) simulations, post-processed to include radiative transfer effects from CO. We find that PCA analysis is very different from the tools based on the traditional power spectrum of PPV data cubes. Our major finding is that PCA is also sensitive to the phase information of PPV cubes and this allows PCA to detect the changes of the underlying velocity and density spectra at high opacities, where the spectral analysis of the maps provides the universal −3 spectrum in accordance with the predictions of the Lazarian and Pogosyan theory. This makes PCA a potentially valuable tool for studies of turbulence at high opacities, provided that proper gauging of the PCA index is made. However, we found the latter to not be easy, as the PCA results change in an irregular way for data with high sonic Mach numbers. This is in contrast to synthetic Brownian noise data used for velocity and density fields that show monotonic PCA behavior. We attribute this difference to the PCA's sensitivity to Fourier phase information.
Principal component analysis of gene frequencies of Chinese populations
Institute of Scientific and Technical Information of China (English)
肖春杰; L.L.Cavalli-Sforza; E.Minch; 杜若甫
2000-01-01
Principal components (PCs) were calculated based on gene frequencies of 130 alleles at 38 loci in Chinese populations, and geographic PC maps were constructed. The first PC map of the Han shows the genetic difference between Southern and Northern Mongoloids, while the second PC indicates the gene flow between Caucasoid and Mongoloids. The first PC map of the Chinese ethnic minorities is similar to that of the second PC map of the Han, while their second PC map is similar to the first PC map of the Han. When calculating PC with the gene frequency data from both the Han and ethnic minorities, the first and second PC maps most resemble those of the ethnic minorities alone. The third and fourth PC maps of Chinese populations may reflect historical events that allowed the expansion of the populations in the highly civilized regions. A clear-cut boundary between Southern and Northern Mongoloids in the synthetic map of the Chinese populations was observed in the zone of the Yangtze River. We suggest that the a
Construction Formula of Biological Age Using the Principal Component Analysis
Directory of Open Access Journals (Sweden)
Linpei Jia
2016-01-01
The biological age (BA) equation is a prediction model that utilizes an algorithm to combine various biological markers of ageing. Different from traditional concepts, the BA equation does not emphasize the importance of a golden index but focuses on using indices of vital organs to represent the senescence of the whole body. This model has been used to assess the ageing process in a more precise way and may predict possible diseases better than the chronological age (CA). Principal component analysis (PCA) is one of the most common and frequently used methods in the construction of the BA formula. Compared with other methods, PCA has its own study procedures and features. Herein we summarize the up-to-date knowledge about BA formula construction and discuss the influential factors, so as to give an overview of BA estimation by PCA, including composition of samples, choices of test items, and selection of ageing biomarkers. We also discuss the advantages and disadvantages of PCA with reference to the construction mechanism, accuracy, and practicability of several common methods in the construction of the BA formula.
Principal Component Analysis and Automatic Relevance Determination in Damage Identification
Mdlazi, L; Stander, C J; Scheffer, C; Heyns, P S
2007-01-01
This paper compares two neural network input selection schemes, the Principal Component Analysis (PCA) and the Automatic Relevance Determination (ARD) based on MacKay's evidence framework. The PCA takes all the input data and projects it onto a lower dimension space, thereby reducing the dimension of the input space. This input reduction method often results in parameters that have significant influence on the dynamics of the data being diluted by those that do not influence the dynamics of the data. The ARD selects the most relevant input parameters and discards those that do not contribute significantly to the dynamics of the data being modelled. The ARD sometimes results in important input parameters being discarded, thereby compromising the dynamics of the data. The PCA and ARD methods are implemented together with a Multi-Layer-Perceptron (MLP) network for fault identification in structures and the performance of the two methods is assessed. It is observed that ARD and PCA give similar accuracy le...
Biological agent detection based on principal component analysis
Mudigonda, Naga R.; Kacelenga, Ray
2006-05-01
This paper presents an algorithm, based on principal component analysis for the detection of biological threats using General Dynamics Canada's 4WARN Sentry 3000 biodetection system. The proposed method employs a statistical method for estimating background biological activity so as to make the algorithm adaptive to varying background situations. The method attempts to characterize the pattern of change that occurs in the fluorescent particle counts distribution and uses the information to suppress false-alarms. The performance of the method was evaluated using a total of 68 tests including 51 releases of Bacillus Globigii (BG), six releases of BG in the presence of obscurants, six releases of obscurants only, and five releases of ovalbumin at the Ambient Breeze Tunnel Test facility, Battelle, OH. The peak one-minute average concentration of BG used in the tests ranged from 10 - 65 Agent Containing Particles per Liter of Air (ACPLA). The obscurants used in the tests included diesel smoke, white grenade smoke, and salt solution. The method successfully detected BG at a sensitivity of 10 ACPLA and resulted in an overall probability of detection of 94% for BG without generating any false-alarms for obscurants at a detection threshold of 0.6 on a scale of 0 to 1. Also, the method successfully detected BG in the presence of diesel smoke and salt water fumes. The system successfully responded to all the five ovalbumin releases with noticeable trends in algorithm output and alarmed for two releases at the selected detection threshold.
A principal component analysis of 39 scientific impact measures.
Directory of Open Access Journals (Sweden)
Johan Bollen
BACKGROUND: The impact of scientific publications has traditionally been expressed in terms of citation counts. However, scientific activity has moved online over the past decade. To better capture scientific impact in the digital era, a variety of new impact measures has been proposed on the basis of social network analysis and usage log data. Here we investigate how these new measures relate to each other, and how accurately and completely they express scientific impact. METHODOLOGY: We performed a principal component analysis of the rankings produced by 39 existing and proposed measures of scholarly impact that were calculated on the basis of both citation and usage log data. CONCLUSIONS: Our results indicate that the notion of scientific impact is a multi-dimensional construct that cannot be adequately measured by any single indicator, although some measures are more suitable than others. The commonly used citation Impact Factor is not positioned at the core of this construct, but at its periphery, and should thus be used with caution.
Principal component approach in variance component estimation for international sire evaluation
Directory of Open Access Journals (Sweden)
Jakobsen Jette
2011-05-01
Background: The dairy cattle breeding industry is a highly globalized business, which needs internationally comparable and reliable breeding values of sires. The international Bull Evaluation Service, Interbull, was established in 1983 to respond to this need. Currently, Interbull performs multiple-trait across country evaluations (MACE) for several traits and breeds in dairy cattle and provides international breeding values to its member countries. Estimating parameters for MACE is challenging since the structure of datasets and conventional use of multiple-trait models easily result in over-parameterized genetic covariance matrices. The number of parameters to be estimated can be reduced by taking into account only the leading principal components of the traits considered. For MACE, this is readily implemented in a random regression model. Methods: This article compares two principal component approaches to estimate variance components for MACE using real datasets. The methods tested were a REML approach that directly estimates the genetic principal components (direct PC) and the so-called bottom-up REML approach (bottom-up PC), in which traits are sequentially added to the analysis and the statistically significant genetic principal components are retained. Furthermore, this article evaluates the utility of the bottom-up PC approach to determine the appropriate rank of the (co)variance matrix. Results: Our study demonstrates the usefulness of both approaches and shows that they can be applied to large multi-country models considering all concerned countries simultaneously. These strategies can thus replace the current practice of estimating the covariance components required through a series of analyses involving selected subsets of traits. Our results support the importance of using the appropriate rank in the genetic (co)variance matrix. Using too low a rank resulted in biased parameter estimates, whereas too high a rank did not result in
Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E
2017-04-15
Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (pcoding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (pcoding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
Tracing cattle breeds with principal components analysis ancestry informative SNPs.
Directory of Open Access Journals (Sweden)
Jamey Lewis
The recent release of the Bovine HapMap dataset represents the most detailed survey of bovine genetic diversity to date, providing an important resource for the design and development of livestock production. We studied this dataset, comprising more than 30,000 Single Nucleotide Polymorphisms (SNPs) for 19 breeds (13 taurine, three zebu, and three hybrid breeds), seeking to identify small panels of genetic markers that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal Components Analysis and algorithms that we have recently described for the selection of Ancestry Informative Markers from genomewide datasets, we present a decision-tree which can be used to accurately infer the origin of individual cattle. In doing so, we present a thorough examination of population genetic structure in modern bovine breeds. Performing extensive cross-validation experiments, we demonstrate that 250-500 carefully selected SNPs suffice in order to achieve close to 100% prediction accuracy of individual ancestry, when this particular set of 19 breeds is considered. Our methods, coupled with the dense genotypic data that is becoming increasingly available, have the potential to become a valuable tool and have considerable impact in worldwide livestock production. They can be used to inform the design of studies of the genetic basis of economically important traits in cattle, as well as breeding programs and efforts to conserve biodiversity. Furthermore, the SNPs that we have identified can provide a reliable solution for the traceability of breed-specific branded products.
MEASURING THE LEANNESS OF SUPPLIERS USING PRINCIPAL COMPONENT ANALYSIS TECHNIQUE
Directory of Open Access Journals (Sweden)
Y. Zare Mehrjerdi
2012-01-01
ENGLISH ABSTRACT: A technique that helps management to reduce costs and improve quality is ‘lean supply chain management’, which focuses on the elimination of all wastes in every stage of the supply chain and is derived from ‘agile production’. This research aims to assess and rank the suppliers in an auto industry, based upon the concept of ‘production leanness’. The focus of this research is on the suppliers of a company called Touse-Omron Naein. We have examined the literature about leanness, and classified its criteria into ten dimensions and 76 factors. A questionnaire was used to collect the data, and the suppliers were ranked using the principal component analysis (PCA) technique.
AFRIKAANS SUMMARY: Lean supply chain management is a technique that enables management to reduce costs and improve quality. It focuses on reducing waste at every stage of the supply chain and is derived from agile production. This research seeks to assess suppliers in an automotive industry on the basis of the concept of production leanness. The research focuses on suppliers of a company named Touse-Omron Naein. A literature study on leanness led to the classification of criteria into ten dimensions and 76 factors. A questionnaire was used to collect the data, and the suppliers were ranked using the PCA technique.
WEB SERVICE SELECTION ALGORITHM BASED ON PRINCIPAL COMPONENT ANALYSIS
Institute of Scientific and Technical Information of China (English)
Kang Guosheng; Liu Jianxun; Tang Mingdong; Cao Buqing
2013-01-01
Existing Web service selection approaches usually assume that users have provided their preferences in a quantitative form. However, due to the subjectivity and vagueness of preferences, it may be impractical for users to specify quantitative and exact preferences. Moreover, because Quality of Service (QoS) attributes are often interrelated, existing Web service selection approaches which employ weighted summation of QoS attribute values to compute the overall QoS of Web services may produce inaccurate results, since they do not take correlations among QoS attributes into account. To resolve these problems, a Web service selection framework considering the user's preference priority is proposed, which incorporates a searching mechanism with QoS range setting to identify services satisfying the user's QoS constraints. With the identified service candidates, based on the idea of Principal Component Analysis (PCA), an algorithm of Web service selection named PCA-WSS (Web Service Selection based on PCA) is proposed, which can eliminate the correlations among QoS attributes and compute the overall QoS of Web services accurately. After computing the overall QoS for each service, the algorithm ranks the Web service candidates based on their overall QoS and recommends services with top QoS values to users. Finally, the effectiveness and feasibility of our approach are validated by experiments, i.e. the Web service selected by our approach receives a higher average evaluation from users than the others, and the time cost of the PCA-WSS algorithm is not strongly affected by the number of service candidates.
Principal component analysis of gene frequencies of Chinese populations
Institute of Scientific and Technical Information of China (English)
(none listed)
2000-01-01
Principal components (PCs) were calculated based on gene frequencies of 130 alleles at 38 loci in Chinese populations, and geographic PC maps were constructed. The first PC map of the Han shows the genetic difference between Southern and Northern Mongoloids, while the second PC indicates the gene flow between Caucasoid and Mongoloids. The first PC map of the Chinese ethnic minorities is similar to that of the second PC map of the Han, while their second PC map is similar to the first PC map of the Han. When calculating PC with the gene frequency data from both the Han and ethnic minorities, the first and second PC maps most resemble those of the ethnic minorities alone. The third and fourth PC maps of Chinese populations may reflect historical events that allowed the expansion of the populations in the highly civilized regions. A clear-cut boundary between Southern and Northern Mongoloids in the synthetic map of the Chinese populations was observed in the zone of the Yangtze River. We suggest that the ancestors of Southern and Northern Mongoloids had already separated before reaching Asia. The ancestors of the Southern Mongoloids may result from the initial expansion from Africa or the Middle East, via the south coast of Asia, toward Southeast Asia, and ultimately South China. Upon reaching the Yangtze River, they might even have crossed the river to occupy the nearby regions for a period of time. The ancestors of the Northern Mongoloids probably expanded from Africa via the Northern Pamirs, first went eastward, then towards the south to reach the Yangtze River. The expansion of the Northern Mongoloids toward the south of the Yangtze River happened only in the last 2 or 3 thousand years.
Inverse spatial principal component analysis for geophysical survey data interpolation
Li, Qingmou; Dehler, Sonya A.
2015-04-01
The starting point for data processing, visualization, and overlay with other data sources in geological applications often involves building a regular grid by interpolation of geophysical measurements. Typically, the sampling interval along survey lines is much higher than the spacing between survey lines because the geophysical recording system is able to operate with a high sampling rate, while the costs and slower speeds associated with operational platforms limit line spacing. However, currently available interpolating methods often smooth data observed with higher sampling rate along a survey line to accommodate the lower spacing across lines, and much of the higher resolution information is not captured in the interpolation process. In this approach, a method termed as the inverse spatial principal component analysis (isPCA) is developed to address this problem. In the isPCA method, a whole profile observation as well as its line position is handled as an entity and a survey collection of line entities is analyzed for interpolation. To test its performance, the developed isPCA method is used to process a simulated airborne magnetic survey from an existing magnetic grid offshore the Atlantic coast of Canada. The interpolation results using the isPCA method and other methods are compared with the original survey grid. It is demonstrated that the isPCA method outperforms the Inverse Distance Weighting (IDW), Kriging (Geostatistical), and MINimum Curvature (MINC) interpolation methods in retaining detailed anomaly structures and restoring original values. In a second test, a high resolution magnetic survey offshore Cape Breton, Nova Scotia, Canada, was processed and the results are compared with other geological information. This example demonstrates the effective performance of the isPCA method in basin structure identification.
Saccenti, E.; Camacho, J.
2015-01-01
Principal component analysis is one of the most commonly used multivariate tools to describe and summarize data. Determining the optimal number of components in a principal component model is a fundamental problem in many fields of application. In this paper we compare the performance of several met
Mining gene expression data by interpreting principal components
Directory of Open Access Journals (Sweden)
Mortazavi Ali
2006-04-01
Background: There are many methods for analyzing microarray data that group together genes having similar patterns of expression over all conditions tested. However, in many instances the biologically important goal is to identify relatively small sets of genes that share coherent expression across only some conditions, rather than all or most conditions as required in traditional clustering; e.g. genes that are highly up-regulated and/or down-regulated similarly across only a subset of conditions. Equally important is the need to learn which conditions are the decisive ones in forming such gene sets of interest, and how they relate to diverse conditional covariates, such as disease diagnosis or prognosis. Results: We present a method for automatically identifying such candidate sets of biologically relevant genes using a combination of principal components analysis and information theoretic metrics. To enable easy use of our methods, we have developed a data analysis package that facilitates visualization and subsequent data mining of the independent sources of significant variation present in gene microarray expression datasets (or in any other similarly structured high-dimensional dataset). We applied these tools to two public datasets, and highlight sets of genes most affected by specific subsets of conditions (e.g. tissues, treatments, samples, etc.). Statistically significant associations for highlighted gene sets were shown via global analysis for Gene Ontology term enrichment. Together with covariate associations, the tool provides a basis for building testable hypotheses about the biological or experimental causes of observed variation. Conclusion: We provide an unsupervised data mining technique for diverse microarray expression datasets that is distinct from major methods now in routine use. In test uses, this method, based on publicly available gene annotations, appears to identify numerous sets of biologically relevant genes. It
Dafu, Shen; Leihong, Zhang; Dong, Liang; Bei, Li; Yi, Kang
2017-07-01
The purpose of this study is to improve the reconstruction precision and better copy the color of spectral image surfaces. A new spectral reflectance reconstruction algorithm based on an iterative threshold combined with weighted principal component space is presented in this paper, and the principal component with weighted visual features is the sparse basis. Different numbers of color cards are selected as the training samples, a multispectral image is the testing sample, and the color differences in the reconstructions are compared. The channel response value is obtained by a Mega Vision high-accuracy, multi-channel imaging system. The results show that spectral reconstruction based on weighted principal component space is superior in performance to that based on traditional principal component space. Therefore, the color difference obtained using the compressive-sensing algorithm with weighted principal component analysis is less than that obtained using the algorithm with traditional principal component analysis, and better reconstructed color consistency with human eye vision is achieved.
Principal Component Analysis of Long-Lag, Wide-Pulse Gamma-Ray Burst Data
Indian Academy of Sciences (India)
Zhao-Yang Peng; Wen-Shuai Liu
2014-09-01
We have carried out a Principal Component Analysis (PCA) of the temporal and spectral variables of 24 long-lag, wide-pulse gamma-ray bursts (GRBs) presented by Norris et al. (2005). Taking all eight temporal and spectral parameters into account, our analysis shows that four principal components are enough to describe the variation of the temporal and spectral data of long-lag bursts. In addition, the first two principal components are dominated by the temporal variables, while the third and fourth principal components are dominated by the spectral parameters.
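As a hedged illustration of how a statement such as "four principal components are enough" is typically checked, the sketch below computes the cumulative explained variance of standardized data and counts the leading components needed to reach a threshold. The data are synthetic stand-ins (24 rows, 8 variables), and the 90% threshold and the function names are assumptions made for this example, not values or code from the paper.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    # Standardize each column, since the variables may have different units.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values of the standardized data give component variances.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s**2 / (Z.shape[0] - 1)
    return var / var.sum()

def components_needed(X, threshold=0.9):
    """Smallest number of leading components whose cumulative variance reaches threshold."""
    cum = np.cumsum(explained_variance_ratio(X))
    return int(np.searchsorted(cum, threshold) + 1)

# Synthetic stand-in: 24 observations of 8 variables driven by 3 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(24, 3))
X = factors @ rng.normal(size=(3, 8)) + 0.05 * rng.normal(size=(24, 8))

ratio = explained_variance_ratio(X)
k = components_needed(X, threshold=0.9)
print("explained variance ratio:", np.round(ratio, 3))
print("components needed for 90%:", k)
```

Inspecting the loadings (the right singular vectors) would then show which variables dominate each retained component.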
Oplatka, Izhar
2017-01-01
Purpose: In order to fill the gap in theoretical and empirical knowledge about the characteristics of principal workload, the purpose of this paper is to explore the components of principal workload as well as its determinants and the coping strategies commonly used by principals to face this personal state. Design/methodology/approach:…
Application of principal-component analysis to the interpretation of brown coal properties
Energy Technology Data Exchange (ETDEWEB)
Tesch, S.; Otto, M. [TU Bergakademie, Freiberg (Germany). Institute for Analytical Chemistry
1995-07-01
The characterization of coal properties using principal-component analysis is described. The aim is to obtain correlations between a large number of chemical and technological parameters as well as FT-i.r. spectroscopic data. A database on 44 brown coals from different deposits was interpreted. After computation of the principal components, scatterplots and component-weight plots are presented for the first two or three principal components. The overlap of the component-weights plot and the scatterplot (biplot) shows how it is possible to classify brown coals by means of selected characteristics. 14 refs., 6 figs., 1 tab.
Identifying apple surface defects using principal components analysis and artificial neural networks
Artificial neural networks and principal components were used to detect surface defects on apples in near-infrared images. Neural networks were trained and tested on sets of principal components derived from columns of pixels from images of apples acquired at two wavelengths (740 nm and 950 nm). I...
A visual basic program for principal components transformation of digital images
Carr, James R.
1998-04-01
Principal components transformation of multispectral and hyperspectral digital imagery is useful for: (1) reducing the number of useful bands, a distinct advantage when using hyperspectral imagery; (2) obtaining image bands that are orthogonal (statistically independent); (3) improving supervised and unsupervised classification; and (4) image compression. A Visual Basic program is presented for principal components transformation of digital images using principal components analysis or correspondence analysis. Principal components analysis is well known for this application. Correspondence analysis is only recently applied for such transformation. The program can import raw digital images, with or without header records; or, the program can accept Windows bitmap (BMP) files. After transformation, output, transformed images can be exported in raw or BMP format. The program can be used as a simple file format conversion program (raw to BMP or BMP to raw) without performing principal components transformation. An application demonstrates the use of the program.
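The transformation described above can be sketched in a few lines. This is a minimal Python illustration, not the Visual Basic program from the paper: the function name `pc_transform`, the `(rows, cols, bands)` array layout, and the synthetic four-band image are all assumptions made for the example. It uses the covariance-matrix form of principal components analysis and does not cover the correspondence-analysis variant.

```python
import numpy as np

def pc_transform(image):
    """Principal components transformation of an array shaped (rows, cols, bands)."""
    rows, cols, bands = image.shape
    # Treat every pixel as one observation with `bands` variables.
    pixels = image.reshape(-1, bands).astype(float)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    # Band-to-band covariance matrix and its eigen-decomposition.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; reorder so PC1 has the largest variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Project the centered pixels onto the eigenvectors to get the PC bands.
    pcs = centered @ eigvecs
    return pcs.reshape(rows, cols, bands), eigvals, eigvecs, mean

# Synthetic 16x16 image with four strongly correlated bands plus noise.
rng = np.random.default_rng(1)
base = rng.normal(size=(16, 16, 1))
image = np.concatenate([base * w for w in (1.0, 0.8, 0.5, 0.2)], axis=2)
image = image + 0.1 * rng.normal(size=image.shape)

pcs, eigvals, eigvecs, mean = pc_transform(image)
print("variance per PC band:", np.round(eigvals, 4))
```

The output bands are mutually uncorrelated, and keeping only the leading bands gives the band reduction and compression uses listed in the abstract.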
Foch, Eric; Milner, Clare E
2014-01-03
Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided.
A robust polynomial principal component analysis for seismic noise attenuation
Wang, Yuchen; Lu, Wenkai; Wang, Benfeng; Liu, Lei
2016-12-01
Random and coherent noise attenuation is a significant aspect of seismic data processing, especially for pre-stack seismic data flattened by normal moveout correction or migration. Signal extraction is widely used for pre-stack seismic noise attenuation. Principal component analysis (PCA), one of the multi-channel filters, is a common tool to extract seismic signals, which can be realized by singular value decomposition (SVD). However, when the traditional PCA filter is applied to seismic signal extraction, the result is unsatisfactory and contains artifacts if the seismic data are contaminated by random and coherent noise. In order to directly extract the desired signal and fix those artifacts at the same time, we take into consideration the amplitude variation with offset (AVO) property and thus propose a robust polynomial PCA algorithm. In this algorithm, a polynomial constraint is used to optimize the coefficient matrix. In order to simplify this complicated problem, a series of sub-optimal problems are designed and solved iteratively. After that, the random and coherent noise can be effectively attenuated simultaneously. Applications on synthetic and real data sets show that our proposed algorithm suppresses random and coherent noise better and protects the desired signals better than local polynomial fitting, conventional PCA, and an L1-norm based PCA method.
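For orientation, the conventional PCA filter that the abstract uses as its baseline can be sketched as a truncated SVD of a flattened gather. This is only that baseline under simple assumptions (a synthetic flat event with constant amplitude, Gaussian random noise, rank 1). It is not the robust polynomial PCA proposed in the paper, and the function name `svd_filter` and all parameter values are hypothetical.

```python
import numpy as np

def svd_filter(gather, rank):
    """Keep the `rank` largest singular components of a (samples, traces) gather."""
    U, s, Vt = np.linalg.svd(gather, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Synthetic flattened gather: the same Ricker-like wavelet on every trace.
n_samples, n_traces = 64, 32
t = np.arange(n_samples) - n_samples // 2
a = (np.pi * 0.08 * t) ** 2
wavelet = (1.0 - 2.0 * a) * np.exp(-a)
clean = np.tile(wavelet[:, None], (1, n_traces))

# Add random noise, then extract the laterally coherent part.
rng = np.random.default_rng(2)
noisy = clean + 0.3 * rng.normal(size=clean.shape)
filtered = svd_filter(noisy, rank=1)

err_noisy = np.linalg.norm(noisy - clean)
err_filtered = np.linalg.norm(filtered - clean)
print("error before filtering:", round(float(err_noisy), 3))
print("error after filtering: ", round(float(err_filtered), 3))
```

A rank-1 model assumes identical amplitude on every trace, which is the limitation that motivates accounting for amplitude variation with offset in the abstract.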
APPLICATION OF PRINCIPAL COMPONENTS ANALYSIS IN THE STUDY OF ADSORPTIVE VOLTAMMETRY OF METALICS IONS
Directory of Open Access Journals (Sweden)
Leandra de Oliveira Cruz da Silva
2010-01-01
Differential-pulse cathodic adsorptive stripping voltammetry, using a mixture of the complexing agents dimethylglyoxime and oxine, was used for a simultaneous exploratory study of cadmium, cobalt, copper, nickel, lead and zinc ions. Voltammograms were obtained for the 64 individual solutions used in the experimental design, and the current data were submitted to principal component analysis (PCA), which made it possible to characterize the trends of the metal-ion solutions studied. The system can be described by eight principal components that explain 98.32% of the variance, with the first three principal components accounting for approximately 85.46% of the total variance.
On the Performance of Principal Component Liu-Type Estimator under the Mean Square Error Criterion
Directory of Open Access Journals (Sweden)
Jibo Wu
2013-01-01
Wu (2013) proposed an estimator, the principal component Liu-type estimator, to overcome multicollinearity. This is a general estimator which includes the ordinary least squares estimator, principal component regression estimator, ridge estimator, Liu estimator, Liu-type estimator, r-k class estimator, and r-d class estimator. In this paper, we first use a new method to derive the principal component Liu-type estimator; we then study the superiority of the new estimator under the scalar mean squared error criterion. Finally, we give a numerical example to illustrate the theoretical results.
Wavelet decomposition based principal component analysis for face recognition using MATLAB
Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish
2016-03-01
Algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, neural networks and genetic algorithms have been used for decades to realize face recognition systems in both static and real-time settings. This paper discusses an approach to face recognition based on wavelet decomposition combined with principal component analysis. Principal component analysis is chosen over other algorithms because of its relative simplicity, efficiency, and robustness. The term face recognition stands for identifying a person from his or her facial gestures, and it resembles factor analysis in some sense, i.e. extraction of the principal components of an image. Principal component analysis has some drawbacks, mainly poor discriminatory power and, in particular, the large computational load of finding eigenvectors. These drawbacks can be greatly reduced by combining wavelet transform decomposition for feature extraction with principal component analysis for pattern representation and classification, analyzing the facial gestures in the space and time domains, where frequency and time are used interchangeably. The experimental results indicate that this face recognition method achieves a significant percentage improvement in recognition rate as well as better computational efficiency.
Introduction to uses and interpretation of principal component analyses in forest biology.
J. G. Isebrands; Thomas R. Crow
1975-01-01
The application of principal component analysis for interpretation of multivariate data sets is reviewed with emphasis on (1) reduction of the number of variables, (2) ordination of variables, and (3) applications in conjunction with multiple regression.
PROJECTION-PURSUIT BASED PRINCIPAL COMPONENT ANALYSIS: A LARGE SAMPLE THEORY
Institute of Scientific and Technical Information of China (English)
Jian ZHANG
2006-01-01
Principal component analysis (PCA) is one of the most celebrated methods for analysing multivariate data. One effort to extend PCA is projection pursuit (PP), a more general class of dimension-reduction techniques. However, the application of this extended procedure is often hampered by its computational complexity and by the lack of appropriate theory. In this paper, using empirical processes, we establish a large sample theory for the robust PP estimators of the principal components and dispersion matrix.
Directory of Open Access Journals (Sweden)
Calviño Aida
2017-03-01
In this article we propose a simple and versatile method for limiting disclosure in continuous microdata based on Principal Component Analysis (PCA). Instead of perturbing the original variables, we propose to alter the principal components, as they contain the same information but are uncorrelated, which permits working on each component separately, reducing processing times. The number and weight of the perturbed components determine the level of protection and distortion of the masked data. The method provides preservation of the mean vector and the variance-covariance matrix. Furthermore, depending on the technique chosen to perturb the principal components, the proposed method can provide masked, hybrid or fully synthetic data sets. Some examples of application and comparison with other methods previously proposed in the literature (in terms of disclosure risk and data utility) are also included.
A unified self-stabilizing neural network algorithm for principal and minor components extraction.
Kong, Xiangyu; Hu, Changhua; Ma, Hongguang; Han, Chongzhao
2012-02-01
Recently, many unified learning algorithms have been developed for principal component analysis and minor component analysis. These unified algorithms can be used to extract principal components and, if altered simply by the sign, can also serve as a minor component extractor. This is of practical significance in the implementation of algorithms. This paper proposes a unified self-stabilizing neural network learning algorithm for principal and minor components extraction, and studies the stability of the proposed unified algorithm via the fixed-point analysis method. The proposed unified self-stabilizing algorithm for principal and minor components extraction is extended for tracking the principal subspace (PS) and minor subspace (MS). The averaging differential equation and the energy function associated with the unified algorithm for tracking PS and MS are given. It is shown that the averaging differential equation will globally asymptotically converge to an invariance set, and the corresponding energy function exhibits a unique global minimum attained if and only if its state matrices span the PS or MS of the autocorrelation matrix of a vector data stream. It is concluded that the proposed unified algorithm for tracking PS and MS can efficiently track an orthonormal basis of the PS or MS. Simulations are carried out to further illustrate the theoretical results achieved.
Timmerman, Marieke E.; Kiers, Henk A.L.; Smilde, Age K.
2007-01-01
Confidence intervals (CIs) in principal component analysis (PCA) can be based on asymptotic standard errors and on the bootstrap methodology. The present paper offers an overview of possible strategies for bootstrapping in PCA. A motivating example shows that CI estimates for the component loadings
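One possible bootstrap strategy of the kind surveyed above can be sketched as follows. This is a minimal illustration, not the procedure of the paper: it assumes a percentile interval, takes "loadings" to mean the unit-norm first eigenvector, and resolves the arbitrary sign of each bootstrap eigenvector by aligning it with the full-sample solution. The function names and the toy data are hypothetical.

```python
import numpy as np

def first_loading(X):
    """Unit-norm first principal axis of X (rows are observations)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

def bootstrap_loading_ci(X, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for each entry of the first loading vector."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    ref = first_loading(X)
    boot = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        v = first_loading(X[idx])
        # The sign of an eigenvector is arbitrary; align it with the reference.
        if v @ ref < 0:
            v = -v
        boot[b] = v
    lo = np.quantile(boot, alpha / 2, axis=0)
    hi = np.quantile(boot, 1 - alpha / 2, axis=0)
    return ref, lo, hi

# Toy data (hypothetical): three variables, the first two strongly correlated.
rng = np.random.default_rng(1)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(200, 1)),
               z + 0.1 * rng.normal(size=(200, 1)),
               rng.normal(size=(200, 1))])
ref, lo, hi = bootstrap_loading_ci(X)
```

Without the sign alignment step, the bootstrap distribution of a loading can become bimodal around zero and the interval loses its meaning, which is one reason the choice of strategy matters.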
Combined principal component preprocessing and n-tuple neural networks for improved classification
DEFF Research Database (Denmark)
Høskuldsson, Agnar; Linneberg, Christian
2000-01-01
We present a combined principal component analysis/neural network scheme for classification. The data used to illustrate the method consist of spectral fluorescence recordings from seven different production facilities, and the task is to relate an unknown sample to one of these seven factories....... The data are first preprocessed by performing an individual principal component analysis on each of the seven groups of data. The components found are then used for classifying the data, but instead of making a single multiclass classifier, we follow the ideas of turning a multiclass problem into a number...
Research on Rural Consumer Demand in Hebei Province Based on Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
2011-01-01
Using time series data on the factors influencing rural consumer demand in Hebei Province from 2000 to 2010, this paper applies principal component analysis, a method of multivariate econometric statistical analysis, to construct the principal components of consumer demand in Hebei Province, regresses per capita consumer spending in Hebei Province on these principal components to obtain a principal component regression, and then analyzes the principal components quantitatively and qualitatively. The results show that total output value per capita (yuan), employment rate, and income gap are positively correlated with rural residents' consumer demand in Hebei Province; consumer price index, upbringing ratio of children, and one-year interest rate are negatively correlated with it; and the ratio of supporting the elderly and medical care spending per capita are positively correlated with it. The following countermeasures and suggestions are put forward to promote residents' consumer demand in Hebei Province: develop the county economy in Hebei Province and increase rural residents' consumer demand; use industry to support agriculture and coordinate urban-rural development; improve the rural medical care and health system and resolve the actual difficulties of the masses.
Optimized principal component analysis on coronagraphic images of the Fomalhaut system
Energy Technology Data Exchange (ETDEWEB)
Meshkat, Tiffany; Kenworthy, Matthew A. [Sterrewacht Leiden, P.O. Box 9513, Niels Bohrweg 2, 2300-RA Leiden (Netherlands); Quanz, Sascha P.; Amara, Adam [Institute for Astronomy, ETH Zurich, Wolfgang-Pauli-Strasse 27, 8093-CH Zurich (Switzerland)
2014-01-01
We present the results of a study to optimize the principal component analysis (PCA) algorithm for planet detection, a new algorithm complementing angular differential imaging and locally optimized combination of images (LOCI) for increasing the contrast achievable next to a bright star. The stellar point spread function (PSF) is constructed by removing linear combinations of principal components, allowing the flux from an extrasolar planet to shine through. The number of principal components used determines how well the stellar PSF is globally modeled. Using more principal components may decrease the number of speckles in the final image, but also increases the background noise. We apply PCA to Fomalhaut Very Large Telescope NaCo images acquired at 4.05 μm with an apodized phase plate. We do not detect any companions, with a model dependent upper mass limit of 13-18 M_Jup from 4-10 AU. PCA achieves greater sensitivity than the LOCI algorithm for the Fomalhaut coronagraphic data by up to 1 mag. We make several adaptations to the PCA code and determine which of these prove the most effective at maximizing the signal-to-noise from a planet very close to its parent star. We demonstrate that optimizing the number of principal components used in PCA proves most effective for pulling out a planet signal.
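The idea of modeling the stellar PSF with a few principal components and subtracting it can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the authors' pipeline: images are assumed to be flattened to vectors, the basis is built from a separate stack of reference frames, and the function name, array shapes and synthetic "planet" are all hypothetical.

```python
import numpy as np

def pca_psf_subtract(refs, target, k):
    """Subtract a k-component PCA model of the stellar PSF from `target`.

    refs:   (n_frames, n_pixels) stack of flattened reference images
    target: (n_pixels,) flattened science image
    """
    mean = refs.mean(axis=0)
    # Rows of Vt form an orthonormal basis for the reference-frame variations.
    _, _, Vt = np.linalg.svd(refs - mean, full_matrices=False)
    basis = Vt[:k]
    centered = target - mean
    psf_model = basis.T @ (basis @ centered)   # projection onto k components
    return centered - psf_model                # residual left after PSF removal

# Synthetic example: speckles live in a 3-dimensional pattern space,
# and the target frame additionally contains one bright pixel.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(3, 400))
refs = rng.normal(size=(30, 3)) @ patterns
planet = np.zeros(400)
planet[123] = 5.0
target = rng.normal(size=3) @ patterns + planet

res0 = pca_psf_subtract(refs, target, 0)   # mean subtraction only
res3 = pca_psf_subtract(refs, target, 3)   # three components removed
```

In this toy setting the point source dominates the residual once the three speckle patterns are projected out. With real data the choice of `k` is a trade-off, since the projection also removes part of the planet's flux.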
Hemant Pathak; S. N. Limaye
2011-01-01
Groundwater is one of the major sources of drinking water in Sagar city (India). In this study, 15 sampling stations were selected for investigation of 14 chemical parameters. The work was carried out during different months of the pre-monsoon, monsoon and post-monsoon seasons from June 2009 to June 2010. Multivariate statistics such as principal component and cluster analysis were applied to the datasets to investigate seasonal variations in groundwater quality. Principal axis fa...
Gu, Fei; Wu, Hao
2016-09-01
The specifications of state space model for some principal component-related models are described, including the independent-group common principal component (CPC) model, the dependent-group CPC model, and principal component-based multivariate analysis of variance. Some derivations are provided to show the equivalence of the state space approach and the existing Wishart-likelihood approach. For each model, a numeric example is used to illustrate the state space approach. In addition, a simulation study is conducted to evaluate the standard error estimates under the normality and nonnormality conditions. In order to cope with the nonnormality conditions, the robust standard errors are also computed. Finally, other possible applications of the state space approach are discussed at the end.
Structure Analysis of Network Traffic Matrix Based on Relaxed Principal Component Pursuit
Wang, Zhe; Xu, Ke; Yin, Baolin
2011-01-01
The network traffic matrix is a kind of flow-level Internet traffic data and is widely applied to network operation and management. Analyzing the composition and structure of the traffic matrix is a crucial problem, and mathematical approaches such as Principal Component Analysis (PCA) have been used to handle it. In this paper, we first argue that PCA performs poorly for analyzing traffic matrices polluted by large volume anomalies, then propose a new composition model of the network traffic matrix. According to our model, structure analysis can be formally defined as decomposing a traffic matrix into low-rank, sparse, and noise sub-matrices, which is equivalent to the Robust Principal Component Analysis (RPCA) problem defined in [13]. Based on the Relaxed Principal Component Pursuit (Relaxed PCP) method and the Accelerated Proximal Gradient (APG) algorithm, an iterative algorithm for decomposing a traffic matrix is presented, and our experimental results demonstrate its efficiency and flexibility. At last, f...
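A low-rank plus sparse plus noise decomposition of this kind can be illustrated with a small sketch. It assumes a penalized objective of the form 0.5*||X - A - E||_F^2 + mu*(||A||_* + lam*||E||_1), which is one common way to write a relaxed pursuit problem, and it uses plain alternating minimization rather than the APG algorithm named in the abstract. The function names, parameter values and synthetic matrix are hypothetical.

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(X, A, E, mu, lam):
    nuc = np.linalg.svd(A, compute_uv=False).sum()
    fit = 0.5 * np.linalg.norm(X - A - E) ** 2   # Frobenius norm of the noise part
    return fit + mu * (nuc + lam * np.abs(E).sum())

def relaxed_pcp(X, mu, lam, n_iter=50):
    """Alternate exact minimization over A (low-rank) and E (sparse)."""
    A = np.zeros_like(X)
    E = np.zeros_like(X)
    history = []
    for _ in range(n_iter):
        A = singular_value_threshold(X - E, mu)
        E = soft_threshold(X - A, mu * lam)
        history.append(objective(X, A, E, mu, lam))
    return A, E, history

# Synthetic "traffic matrix": rank-2 structure, 30 large anomalies, small noise.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))
sparse = np.zeros((40, 30))
spikes = rng.choice(40 * 30, size=30, replace=False)
sparse.flat[spikes] = 10.0
X = low_rank + sparse + 0.01 * rng.normal(size=(40, 30))

A, E, history = relaxed_pcp(X, mu=0.1, lam=1 / np.sqrt(40))
```

Each update solves its subproblem exactly, so the recorded objective values never increase. An accelerated proximal gradient scheme would use the same two shrinkage operators but add a momentum step.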
The Application of Kernel Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
2013-01-01
In this paper, principal component analysis and kernel principal component analysis are used to study tourism development in thirteen cities of Jiangsu Province in 2010. The comparison shows that the kernel principal component analysis result is more reasonable, and the reasons are analyzed. Lastly, based on the statistical analysis, some suggestions about the future tourism development of Jiangsu Province are put forward for the reference of the relevant departments.
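For readers comparing the two methods, kernel principal component analysis can be sketched in a few lines. The sketch assumes a Gaussian (RBF) kernel with a hand-picked width `gamma`; the abstract does not say which kernel was used, and the random 13 x 5 matrix below merely stands in for city-level indicators.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Kernel PCA scores for the rows of X using a Gaussian (RBF) kernel."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    K = np.exp(-gamma * np.maximum(d2, 0.0))
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                    # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)                   # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Projection of each training point onto the leading kernel components.
    scores = vecs * np.sqrt(np.maximum(vals, 0.0))
    return scores, vals

# Hypothetical data: 13 observations (e.g. cities) by 5 indicators.
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 5))
scores, vals = rbf_kernel_pca(X, n_components=2, gamma=0.1)
```

Ordinary PCA is recovered if the kernel matrix is replaced by the linear kernel `X @ X.T`, which makes the two analyses directly comparable on the same data.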
Directory of Open Access Journals (Sweden)
Stefania Salvatore
2016-07-01
Background: Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA), which is more flexible temporally. Methods: We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA using both Fourier and B-spline basis functions with three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. Results: The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6 % of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. Conclusion: FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.
Cao, Xiuming; Song, Jinjie; Zhang, Caipo
This work uses principal component analysis and the Choquet integral to construct a model for diagnosing Parkinson's disease. A proper value of the Sugeno measure is vital to a diagnostic model. This paper aims at providing a method of using principal component analysis to obtain the Sugeno measure. In this diagnostic model, there are two key elements. One is the goodness of fit, i.e., the degree of evidential support for each attribute. The other is the importance of the attribute itself. Instances of Parkinson's disease illustrate that the method is effective.
Directory of Open Access Journals (Sweden)
Khuat Thanh Tung
2016-11-01
Optical Character Recognition plays an important role in data storage and data mining as the number of documents stored as images increases. Effective ways are needed to convert images of typewritten or printed text into machine-encoded text in order to support information handling. In this paper, therefore, techniques used to convert images into editable text on the computer, such as principal component analysis, the multilayer perceptron network, self-organizing maps, and an improved multilayer neural network using principal component analysis, are tested experimentally. The results obtained indicate the effectiveness and feasibility of the proposed methods.
Classifying sEMG-based Hand Movements by Means of Principal Component Analysis
Directory of Open Access Journals (Sweden)
M. S. Isaković
2015-06-01
In order to improve surface electromyography (sEMG) based control of hand prosthesis, we applied Principal Component Analysis (PCA) for feature extraction. The sEMG data from a group of healthy subjects (downloaded from the free Ninapro database) comprised the following sets: three grasping, eight wrist, and eleven finger movements. We tested the accuracy of a simple quadratic classifier for two sets of features derived from PCA. Preliminary results suggest that the first two principal components do not guarantee successful hand movement classification. The hand movement classification accuracy significantly increased with using three instead of two features, in all three sets of movements and throughout all subjects.
Modeling and prediction of children’s growth data via functional principal component analysis
Institute of Scientific and Technical Information of China (English)
2009-01-01
We use functional principal component analysis (FPCA) to model and predict the weight growth in children. In particular, we examine how the approach can help discern growth patterns of underweight children relative to their normal counterparts, and whether a commonly used transformation to normality plays any constructive role in a predictive model based on the FPCA. Our work supplements the conditional growth charts developed by Wei and He (2006) by constructing a predictive growth model based on a small number of principal component scores on an individual's past.
A Stochastic Restricted Principal Components Regression Estimator in the Linear Model
Directory of Open Access Journals (Sweden)
Daojiang He
2014-01-01
We propose a new estimator to combat the multicollinearity in the linear model when there are stochastic linear restrictions on the regression coefficients. The new estimator is constructed by combining the ordinary mixed estimator (OME) and the principal components regression (PCR) estimator, which is called the stochastic restricted principal components (SRPC) regression estimator. Necessary and sufficient conditions for the superiority of the SRPC estimator over the OME and the PCR estimator are derived in the sense of the mean squared error matrix criterion. Finally, we give a numerical example and a Monte Carlo study to illustrate the performance of the proposed estimator.
Neural Network Learning for Principal Component Analysis: A Multistage Decomposition Approach
Institute of Scientific and Technical Information of China (English)
FENG Dazheng; ZHANG Xianda; BAO Zheng
2004-01-01
This paper presents a novel neural network model for finding the principal components of an N-dimensional data stream. This neural network consists of r (≤ N) neurons, where the i-th neuron has only N-i+1 weights and an (N-i+1)-dimensional input vector, while each neuron in most related classical neural networks has N weights and an N-dimensional input vector. All the neurons are trained by the NIC algorithm under the single component case [7] so as to get a series of dimension-reducing principal components in which the dimension of the i-th principal component is N-i+1. In multistage dimension-reducing processing, the weight vector of the i-th neuron is always orthogonal to the subspace constructed from the weight vectors of the first i-1 neurons. By a systematic reconstruction technique, we can recover all the principal components from the series of dimension-reducing ones. Its remarkable advantage is that the computational efficiency of neural network learning based on the novel information criterion (NIC) is improved and the weight storage is reduced by the multistage dimension-reducing processing (multistage decomposition) of the covariance matrix or the input vector sequence. In addition, we study several important properties of the NIC learning algorithm.
Institute of Scientific and Technical Information of China (English)
WANG Cong-lu; WU Chao; WANG Wei-jun
2008-01-01
With reference to GB5618-1995 on heavy metal pollution, and using the SPSS statistical package, the major pollutants of heavy metal pollution in mine area farmland were identified by variable clustering analysis. The heavy metal pollution situation of the mine area farmland was assessed and classified by synthetic principal components analysis (PCA). The results show that variable clustering analysis is efficient at identifying the principal components of mine area farmland heavy metal pollution. The synthetic principal component scores of the soil samples, given by synthetic principal components analysis, were sorted and clustered. The data structure of soil heavy metal contamination, and the relationships among and pollution levels of different soil samples, were revealed. The assessment and classification of mine area farmland heavy metal pollution based on synthetic component scores reflect the influence of both the major and compound heavy metal pollutants. The identification and assessment results can provide reference and guidance for proposing control measures for mine area farmland heavy metal pollution and for focusing on the key treatment regions.
Institute of Scientific and Technical Information of China (English)
Zheng Zhonglong; Yang Jie
2005-01-01
Many problems in image representation and classification involve some form of dimensionality reduction. Non-negative matrix factorization (NMF) is a recently proposed unsupervised procedure for learning spatially localized, parts-based subspace representations of objects. An improvement of the classical NMF, which combines it with Log-Gabor wavelets to enhance its parts-based learning ability, is presented. The new method is compared with principal component analysis (PCA) and with locally linear embedding (LLE), proposed recently in Science. Finally, the new method is applied to several real-world datasets and achieves good performance in representation and classification.
Principal component analysis for neural electron/jet discrimination in highly segmented calorimeters
Vassali, M R
2001-01-01
A neural electron/jet discriminator based on calorimetry is developed for the second-level trigger system of the ATLAS detector. As preprocessing of the calorimeter information, a principal component analysis is performed on each segment of the two sections (electromagnetic and hadronic) of the calorimeter system, in order to reduce significantly the dimension of the input data space and fully explore the detailed energy deposition profile, which is provided by the highly-segmented calorimeter system. It is shown that projecting calorimeter data onto 33 segmented principal components, the discrimination efficiency of the neural classifier reaches 98.9% for electrons (with only 1% of false alarm probability). Furthermore, restricting data projection onto only 9 components, an electron efficiency of 99.1% is achieved (with 3% of false alarm), which confirms that a fast triggering system may be designed using few components. (6 refs).
Sensor Fault Detection, Isolation and Reconstruction Using Nonlinear Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
Mohamed-Faouzi Harkat; Salah Djelel; Noureddine Doghmane; Mohamed Benouaret
2007-01-01
The state reconstruction approach is very useful for sensor fault isolation, reconstruction of faulty measurements, and determination of the number of components retained in the principal component analysis (PCA) model. An extension of this approach based on a nonlinear PCA (NLPCA) model is described in this paper. The NLPCA model is obtained using a five-layer neural network. A simulation example is given to show the performance of the proposed approach.
Khodasevich, Mikhail A.; Trofimova, Darya V.; Nezalzova, Elena I.
2011-02-01
Principal component analysis of UV-VIS-NIR transmission spectra of matured wine distillates (aged 1-40 years) produced by three Moldavian manufacturers makes it possible to characterize with sufficient certainty eleven chemical parameters of the alcoholic beverages considered: contents of acetaldehyde, ethyl acetate, furfural, vanillin, syringic aldehyde and acid, etc.
Class separation of buildings with high and low prevalence of SBS by principal component analysis
DEFF Research Database (Denmark)
Pommer, L.; Fick, J.; Andersson, B.
2002-01-01
This method was able to separate buildings with high and low prevalence of SBS in two different classes using principal component analysis (PCA). Data from the Northern Swedish Office Illness Study describing the presence and level of chemical compounds in outdoor, supply and room air, respective...
Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.
2017-07-01
A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.
Learning Principal Component Analysis by Using Data from Air Quality Networks
Perez-Arribas, Luis Vicente; Leon-González, María Eugenia; Rosales-Conrado, Noelia
2017-01-01
With the final objective of using computational and chemometrics tools in the chemistry studies, this paper shows the methodology and interpretation of the Principal Component Analysis (PCA) using pollution data from different cities. This paper describes how students can obtain data on air quality and process such data for additional information…
Directory of Open Access Journals (Sweden)
Mohebodini Mehdi
2017-08-01
Landraces of spinach in Iran have not been sufficiently characterised for their morpho-agronomic traits. Such characterisation would be helpful in the development of new genetically improved cultivars. In this study, 54 spinach accessions collected from the major spinach growing areas of Iran were evaluated to determine their phenotypic diversity profile on the basis of 10 quantitative and 9 qualitative morpho-agronomic traits. High coefficients of variation were recorded for some quantitative traits (dry yield and leaf area) and all of the qualitative traits. Using principal component analysis, the first four principal components with eigenvalues greater than 1 contributed 87% of the variability among accessions for quantitative traits, whereas the first four principal components with eigenvalues greater than 0.8 contributed 79% of the variability among accessions for qualitative traits. The most important relations observed on the first two principal components were a strong positive association between leaf width and petiole length; between leaf length and leaf numbers at flowering; and among fresh yield, dry yield and petiole diameter; and a near zero correlation of days to flowering with leaf width and petiole length. Prickly seeds, high percentage of female plants, smooth leaf texture, high numbers of leaves at flowering, grey-green leaves, erect petiole attitude and long petiole length are important characters for spinach breeding programmes.
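The eigenvalue cut-off used above to decide how many components to keep can be sketched as follows. The sketch assumes the analysis is run on the correlation matrix of standardized traits, which is the setting in which a threshold of 1 is usually applied; the random 54 x 10 matrix only mimics the shape of the quantitative data set and is not the study's data.

```python
import numpy as np

def retained_components(X, threshold=1.0):
    """Eigenvalues of the correlation matrix and how many exceed `threshold`."""
    R = np.corrcoef(X, rowvar=False)
    vals = np.linalg.eigvalsh(R)[::-1]           # descending order
    explained = np.cumsum(vals) / vals.sum()     # cumulative share of variance
    k = int(np.sum(vals > threshold))
    return vals, explained, k

# Hypothetical data: 54 accessions by 10 quantitative traits.
rng = np.random.default_rng(0)
X = rng.normal(size=(54, 10))
vals, explained, k = retained_components(X)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 marks a component that carries more variance than a single standardized trait. Lowering `threshold` to 0.8, as was done for the qualitative traits, simply admits more components.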
Directory of Open Access Journals (Sweden)
Anish Nair
2013-08-01
In order to improve quality and productivity, the present study highlights the optimization of CNC end milling process parameters to provide a good surface finish. Surface finish has been identified as one of the main quality attributes and is directly related to the productivity of a machine. In this paper an attempt has been made to optimize the process such that the best surface roughness value can be obtained. This gives rise to a multi-objective optimization problem, which can be solved by a hybrid Taguchi method comprising principal components analysis as well as utility theory. In this work, individual response correlation is first eliminated by means of Principal Component Analysis (PCA) to meet the basic assumption of the Taguchi method. Correlated responses are transformed into uncorrelated quality indices called principal components. Quality loss estimates are calculated from the principal components and the corresponding utility values are determined. Then the overall utility index is calculated. Finally, the Taguchi method is used to solve the optimization problem.
Zbilut, J P; Webber, C L
1998-01-01
Recurrence plots were introduced to aid the detection of signals in complicated data series. This effort was furthered by the quantification of recurrence plot elements. We now demonstrate the utility of combining recurrence quantification analysis with principal components analysis to allow for a probabilistic evaluation of the presence of deterministic signals in relatively short data lengths.
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
Statistical Monitoring of Chemical Processes Based on Sensitive Kernel Principal Components
Institute of Scientific and Technical Information of China (English)
JIANG Qingchao; YAN Xuefeng
2013-01-01
The kernel principal component analysis (KPCA) method employs the first several kernel principal components (KPCs), which indicate the most variance information of normal observations for process monitoring, but may not reflect the fault information. In this study, sensitive kernel principal component analysis (SKPCA) is proposed to improve process monitoring performance, i.e., to deal with the discordance of the T² statistic and the squared prediction error (δ_SPE) statistic and reduce missed detection rates. The T² statistic can be used to measure the variation directly along each KPC and analyze the detection performance as well as capture the most useful information in a process. With the calculation of the change rate of the T² statistic along each KPC, SKPCA selects the sensitive kernel principal components for process monitoring. A simulated simple system and the Tennessee Eastman process are employed to demonstrate the efficiency of SKPCA on online monitoring. The results indicate that the monitoring performance is improved significantly.
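The two monitoring statistics mentioned above can be illustrated with their ordinary linear-PCA counterparts. This sketch deliberately omits both the kernel mapping and the sensitive-component selection that define SKPCA; it only shows how T² and the squared prediction error are computed from a fitted model. The function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_pca_monitor(X_train, k):
    """Fit a k-component PCA model on standardized normal-operation data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:k].T                              # loadings, shape (n_vars, k)
    lam = s[:k] ** 2 / (Z.shape[0] - 1)       # variance of each retained score
    return mean, std, P, lam

def t2_and_spe(X, model):
    """Hotelling T^2 and squared prediction error for each row of X."""
    mean, std, P, lam = model
    Z = (X - mean) / std
    T = Z @ P
    t2 = np.sum(T ** 2 / lam, axis=1)         # variation inside the PC subspace
    resid = Z - T @ P.T
    spe = np.sum(resid ** 2, axis=1)          # variation outside the PC subspace
    return t2, spe

# Simulated normal-operation data: 100 samples of 6 correlated variables.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 6))
X_train = X_train + 0.1 * rng.normal(size=(100, 6))

model = fit_pca_monitor(X_train, k=2)
t2, spe = t2_and_spe(X_train, model)
```

T² accumulates scaled variation along each retained component, so it can also be inspected one component at a time, which is the quantity whose change rate the abstract describes using for component selection.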
Hunley-Jenkins, Keisha Janine
2012-01-01
This qualitative study explores large, urban, mid-western principal perspectives about cyberbullying and the policy components and practices that they have found effective and ineffective at reducing its occurrence and/or negative effect on their schools' learning environments. More specifically, the researcher was interested in learning more…
DEFF Research Database (Denmark)
Tian, Fang; Rades, Thomas; Sandler, Niklas
2008-01-01
The purpose of this research is to gain a greater insight into the hydrate formation processes of different carbamazepine (CBZ) anhydrate forms in aqueous suspension, where principal component analysis (PCA) was applied for data analysis. The capability of PCA to visualize and to reveal simplifie...
A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions...
Self-Organized Robust Principal Component Analysis by Back-Propagation Learning
樋口, 勇夫
2004-01-01
This study proposes a self-organized back-propagation algorithm for robust principal component analysis. The self-organizing algorithm, which automatically discriminates the influence of data, is applied to the learning of a sandglass-type neural network.
Fall detection in walking robots by multi-way principal component analysis
Karssen, J.G.; Wisse, M.
2008-01-01
Large disturbances can cause a biped to fall. If an upcoming fall can be detected, damage can be minimized or the fall can be prevented. We introduce the multi-way principal component analysis (MPCA) method for the detection of upcoming falls. We study the detection capability of the MPCA method in
Tuber proteome comparison of five potato varieties by principal component analysis
Mello, de Carla Souza; Dijk, Van Jeroen P.; Voorhuijzen, Marleen; Kok, Esther J.; Arisi, Ana Carolina Maisonnave
2016-01-01
BACKGROUND: Data analysis of omics data should be performed by multivariate analysis such as principal component analysis (PCA). The way data are clustered in PCA is of major importance to develop some classification systems based on multivariate analysis, such as soft independent modeling of cla
Energy Technology Data Exchange (ETDEWEB)
Kang, Ho Yang [Korea Research Institute of Standards and Science, Daejeon (Korea, Republic of); Kim, Ki Bok [Chungnam National University, Daejeon (Korea, Republic of)
2003-06-15
In this study, acoustic emission (AE) signals due to surface cracking and moisture movement in flat-sawn boards of oak (Quercus variabilis) during drying under ambient conditions were analyzed and classified using principal component analysis. The AE signals corresponding to surface cracking showed higher peak amplitude and peak frequency, and shorter rise time, than those corresponding to moisture movement. To reduce the multicollinearity among AE features and to extract the significant AE parameters, correlation analysis was performed. Over 99% of the variance of the AE parameters could be accounted for by the first four principal components. The classification feasibility and success rate were investigated in terms of two statistical classifiers, one having six independent variables (AE parameters) and the other six principal components. The statistical classifier based on AE parameters showed a success rate of 70.0%. The statistical classifier based on principal components showed a success rate of 87.5%, which was considerably higher than that of the classifier based on AE parameters.
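The "share of variance accounted for by the first few principal components" quoted in abstracts like this one can be computed as in the following minimal sketch. It assumes PCA on the covariance matrix of centered data (the study may have used a different scaling), and the data are synthetic stand-ins with six features, not the AE measurements; the function names are illustrative.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                   # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, in descending order
    var = s ** 2 / (X.shape[0] - 1)           # eigenvalues of the sample covariance
    return var / var.sum()


def components_needed(X, threshold=0.99):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    cumulative = np.cumsum(explained_variance_ratio(X))
    idx = int(np.searchsorted(cumulative, threshold))
    return min(idx + 1, cumulative.size)


# Synthetic stand-in data (an assumption for illustration):
# six correlated features driven by two latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

ratios = explained_variance_ratio(X)
n_needed = components_needed(X, 0.99)
print(np.round(np.cumsum(ratios), 4), n_needed)
```

Because the synthetic features are driven by only two latent factors, at most two components are needed to pass the 99% threshold here; real measurements will generally need more.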
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
Directory of Open Access Journals (Sweden)
Kirkpatrick Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given.
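As a rough illustration of the parameter saving, the sketch below evaluates the two counts, k(k + 1)/2 for a full-rank covariance matrix and m(2k - m + 1)/2 for a fit with m principal components. The function names are hypothetical, and the choice of m = 3 in the comment is an assumption made for the example; only k = 8 traits comes from the abstract.

```python
def full_rank_params(k: int) -> int:
    """Distinct elements of an unstructured k x k covariance matrix: k(k + 1)/2."""
    return k * (k + 1) // 2


def reduced_rank_params(k: int, m: int) -> int:
    """Parameters when only the first m principal components are fitted: m(2k - m + 1)/2."""
    if not 0 < m <= k:
        raise ValueError("m must satisfy 0 < m <= k")
    # m * (2k - m + 1) is always even, so integer division is exact.
    return m * (2 * k - m + 1) // 2


if __name__ == "__main__":
    k = 8  # eight traits, as in the application mentioned in the abstract
    for m in range(1, k + 1):
        # e.g. m = 3 gives 21 parameters against 36 for the full-rank fit
        print(m, reduced_rank_params(k, m), full_rank_params(k))
```

At m = k the two counts coincide, which is a convenient sanity check on the formula.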
Hip fracture risk estimation based on principal component analysis of QCT atlas: a preliminary study
Li, Wenjun; Kornak, John; Harris, Tamara; Lu, Ying; Cheng, Xiaoguang; Lang, Thomas
2009-02-01
We aim to capture and apply 3-dimensional bone fragility features for fracture risk estimation. Using inter-subject image registration, we constructed a hip QCT atlas comprising 37 patients with hip fractures and 38 age-matched controls. In the hip atlas space, we performed principal component analysis to identify the principal components (eigen images) that showed association with hip fracture. To develop and test a hip fracture risk model based on the principal components, we randomly divided the 75 QCT scans into two groups, one serving as the training set and the other as the test set. We applied this model to estimate a fracture risk index for each test subject, and used the fracture risk indices to discriminate the fracture patients and controls. To evaluate the fracture discrimination efficacy, we performed ROC analysis and calculated the AUC (area under curve). When using the first group as the training group and the second as the test group, the AUC was 0.880, compared to conventional fracture risk estimation methods based on bone densitometry, which had AUC values ranging between 0.782 and 0.871. When using the second group as the training group, the AUC was 0.839, compared to densitometric methods with AUC values ranging between 0.767 and 0.807. Our results demonstrate that principal components derived from hip QCT atlas are associated with hip fracture. Use of such features may provide new quantitative measures of interest to osteoporosis.
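The AUC used above to compare risk models can be read as the probability that a randomly chosen fracture case receives a higher risk index than a randomly chosen control. The sketch below computes it by direct pairwise comparison; the scores are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count as one half)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk indices: higher values should indicate higher fracture risk.
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]
auc = roc_auc(cases, controls)  # 8 of the 9 case/control pairs are ordered correctly
print(round(auc, 3))
```

The pairwise form is quadratic in the sample size, which is fine for a test set of a few dozen scans; rank-based formulas give the same value more efficiently on large data.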
Principal Component Surface (2011) for St. Thomas East End Reserve, St. Thomas
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas of the St. Thomas East End Reserve (STEER) in the U.S. Virgin Islands (USVI)....
Hunley-Jenkins, Keisha Janine
2012-01-01
This qualitative study explores large, urban, mid-western principal perspectives about cyberbullying and the policy components and practices that they have found effective and ineffective at reducing its occurrence and/or negative effect on their schools' learning environments. More specifically, the researcher was interested in learning more…
Impact of Autocorrelation on Principal Components and Their Use in Statistical Process Control
DEFF Research Database (Denmark)
Vanhatalo, Erik; Kulahci, Murat
2015-01-01
A basic assumption when using principal component analysis (PCA) for inferential purposes, such as in statistical process control (SPC), is that the data are independent in time. In many industrial processes, frequent sampling and process dynamics make this assumption unrealistic rendering sampled...
Nonnegativity of uncertain polynomials
Directory of Open Access Journals (Sweden)
Šiljak Dragoslav D.
1998-01-01
The purpose of this paper is to derive tests for robust nonnegativity of scalar and matrix polynomials, which are algebraic, recursive, and can be completed in a finite number of steps. Polytopic families of polynomials are considered with various characterizations of parameter uncertainty including affine, multilinear, and polynomic structures. The zero exclusion condition for polynomial positivity is also proposed for general parameter dependencies. By reformulating the robust stability problem of complex polynomials as positivity of real polynomials, we obtain new sufficient conditions for robust stability involving multilinear structures, which can be tested using only real arithmetic. The obtained results are applied to robust matrix factorization, strict positive realness, and absolute stability of multivariable systems involving parameter dependent transfer function matrices.
Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis.
Nguyen, Phuong H
2006-12-01
Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), the complexities of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G, were studied. First, the performance of this NLPCA method was compared with that of standard linear principal component analysis (PCA). In particular, we compared the two methods according to (1) their ability to reduce dimensionality and (2) how efficiently they represent peptide conformations in low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does. For example, to obtain a similar error in representing the original beta-hairpin data in a low-dimensional space, one needs 4 principal components with NLPCA and 21 with PCA. Second, by representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA, we obtained relatively well-structured free energy landscapes. In contrast, the free energy landscapes of NLPCA are much more complicated, exhibiting many states which are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study also showed that many states in the PCA maps are mixtures of several peptide conformations, while those of the NLPCA maps are purer. This finding suggests that NLPCA should be used to capture the essential features of the systems.
Influencing Factors of Catering and Food Service Industry Based on Principal Component Analysis
Directory of Open Access Journals (Sweden)
Zi Tang
2014-02-01
Scientific analysis of influencing factors is of great importance for the healthy development of the catering and food service industry. This study attempts to present a set of critical indicators for evaluating the contribution of influencing factors to the catering and food service industry in the particular context of Harbin City, Northeast China. Ten indicators that correlate closely with the catering and food service industry were identified and analyzed by the principal component analysis method using panel data collected from 2000 to 2011. The results showed that three principal components were extracted from the ten indicators, which can be summarized respectively as the comprehensive strength of the catering and food service industry, social and economic development, and residents' willingness to consume catering services. Additionally, among the ten indicators, five relatively important indicators were prioritized: Revenue from principal business of above designated size, Profits of principal business, Cost of principal business, Total investment in fixed assets in hotel and catering services, and Retail sales of hotel and catering services.
Principal Component Analysis of the physique in young adults of Punjab
Directory of Open Access Journals (Sweden)
S. Kaur
2016-05-01
The present cross-sectional study was conducted on 400 Punjabi subjects (200 females and 200 males) of 20-25 years of age to assess the principal components of physique of young adults. Height, weight, circumferences of the waist and hip, skinfolds of the biceps, triceps and subscapular, biacromial diameter and bi-iliocristal diameter were the anthropometric measurements taken on each subject. Systolic and diastolic blood pressure were also measured for each individual. Principal component analysis was applied to 12 variables to extract the components of physique. The analysis extracted 5 major components of physique in adults of Punjab, and these factors explained 74.20% and 80% of the variance in females and males respectively. Factor 1 in both sexes had high loadings of adiposity. Factor 2 in females represented masculinity, whereas factor 2 in males represented obesity- or bulkiness-like traits. Factor 3 in females reflected android (abdominal) body fat, whereas in males it showed characteristics of masculinity-related traits. Factor 4 in females indicated a high loading of blood pressure characteristics, whereas in males it represented android (abdominal) body fat. Factor 5 showed a high loading of blood pressure in males.
Directory of Open Access Journals (Sweden)
Arthur Schmidt Nanni
2013-01-01
Groundwater with anomalous fluoride content and water mixture patterns were studied in the fractured Serra Geral Aquifer System, a basaltic to rhyolitic geological unit, using a principal component analysis interpretation of groundwater chemical data from 309 deep wells distributed in the Rio Grande do Sul State, Southern Brazil. A four-component model that explains 81% of the total variance in the Principal Component Analysis is suggested. Six hydrochemical groups were identified. δ18O and δ2H were analyzed in 28 Serra Geral Aquifer System samples in order to identify stable isotope patterns and make comparisons with data from the Guarani Aquifer System and meteoric waters. The results demonstrated a complex water mixture between the Serra Geral Aquifer System and the Guarani Aquifer System, with meteoric recharge and ascending water infiltration through intensive tectonic fracturing.
DEFF Research Database (Denmark)
Giesen, EB; Ding, Ming; Dalstra, M
2003-01-01
As several morphological parameters of cancellous bone express more or less the same architectural measure, we applied principal components analysis to group these measures and correlated these to the mechanical properties. Cylindrical specimens (n = 24) were obtained in different orientations from...... embalmed mandibular condyles; the angle of the first principal direction and the axis of the specimen, expressing the orientation of the trabeculae, ranged from 10 degrees to 87 degrees. Morphological parameters were determined by a method based on Archimedes' principle and by micro-CT scanning...... analysis revealed four components: amount of bone, number of trabeculae, trabecular orientation, and miscellaneous. They accounted for about 90% of the variance in the morphological variables. The component loadings indicated that a higher amount of bone was primarily associated with more plate...
Automatic Classification of Staphylococci by Principal-Component Analysis and a Gradient Method
Hill, L. R.; Silvestri, L. G.; Ihm, P.; Farchi, G.; Lanciani, P.
1965-01-01
Hill, L. R. (Università Statale, Milano, Italy), L. G. Silvestri, P. Ihm, G. Farchi, and P. Lanciani. Automatic classification of staphylococci by principal-component analysis and a gradient method. J. Bacteriol. 89:1393–1401. 1965. Forty-nine strains from the species Staphylococcus aureus, S. saprophyticus, S. lactis, S. afermentans, and S. roseus were submitted to different taxometric analyses; clustering was performed by single linkage, by the unweighted pair group method, and by principal-component analysis followed by a gradient method. Results were substantially the same with all methods. All S. aureus clustered together, sharply separated from S. roseus and S. afermentans; S. lactis and S. saprophyticus fell between, with the latter nearer to S. aureus. The main purpose of this study was to introduce a new taxometric technique, based on principal-component analysis followed by a gradient method, and to compare it with some other methods in current use. Advantages of the new method are complete automation and therefore greater objectivity, execution of the clustering in a space of reduced dimensions in which different characters have different weights, easy recognition of taxonomically important characters, and opportunity for representing clusters in three-dimensional models; the principal disadvantage is the need for large computer facilities. PMID:14293013
Effect of noise in principal component analysis with an application to ozone pollution
Tsakiri, Katerina G.
This thesis analyzes the effect of independent noise in principal components of k normally distributed random variables defined by a covariance matrix. We prove that the principal components as well as the canonical variate pairs determined from joint distribution of original sample affected by noise can be essentially different in comparison with those determined from the original sample. However when the differences between the eigenvalues of the original covariance matrix are sufficiently large compared to the level of the noise, the effect of noise in principal components and canonical variate pairs proved to be negligible. The theoretical results are supported by simulation study and examples. Moreover, we compare our results about the eigenvalues and eigenvectors in the two dimensional case with other models examined before. This theory can be applied in any field for the decomposition of the components in multivariate analysis. One application is the detection and prediction of the main atmospheric factor of ozone concentrations on the example of Albany, New York. Using daily ozone, solar radiation, temperature, wind speed and precipitation data, we determine the main atmospheric factor for the explanation and prediction of ozone concentrations. A methodology is described for the decomposition of the time series of ozone and other atmospheric variables into the global term component which describes the long term trend and the seasonal variations, and the synoptic scale component which describes the short term variations. By using the Canonical Correlation Analysis, we show that solar radiation is the only main factor between the atmospheric variables considered here for the explanation and prediction of the global and synoptic scale component of ozone. The global term components are modeled by a linear regression model, while the synoptic scale components by a vector autoregressive model and the Kalman filter. The coefficient of determination, R2, for the
Directory of Open Access Journals (Sweden)
Panazzolo Diogo G
2012-11-01
Background: We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Methods: Data from 189 female subjects (34.0±15.5 years, 30.5±7.1 kg/m²), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables, allowing inferences about the possible biological meaning of associations between them without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCVmax), and the time taken to reach RBCVmax (TRBCVmax). Results: A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCVmax varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCVmax, but in the opposite way. Principal component 3 was
Complex principal component and correlation structure of 16 yeast genomic variables.
Theis, Fabian J; Latif, Nadia; Wong, Philip; Frishman, Dmitrij
2011-09-01
A quickly growing number of characteristics reflecting various aspects of gene function and evolution can be either measured experimentally or computed from DNA and protein sequences. The study of pairwise correlations between such quantitative genomic variables as well as collective analysis of their interrelations by multidimensional methods have delivered crucial insights into the processes of molecular evolution. Here, we present a principal component analysis (PCA) of 16 genomic variables from Saccharomyces cerevisiae, the largest data set analyzed so far. Because many missing values and potential outliers hinder the direct calculation of principal components, we introduce the application of Bayesian PCA. We confirm some of the previously established correlations, such as evolutionary rate versus protein expression, and reveal new correlations such as those between translational efficiency, phosphorylation density, and protein age. Although the first principal component primarily contrasts genomic change and protein expression, the second component separates variables related to gene existence and expressed protein functions. Enrichment analysis on genes affecting variable correlations unveils classes of influential genes. For example, although ribosomal and nuclear transport genes make important contributions to the correlation between protein isoelectric point and molecular weight, protein synthesis and amino acid metabolism genes help cause the lack of significant correlation between propensity for gene loss and protein age. We present the novel Quagmire database (Quantitative Genomics Resource) which allows exploring relationships between more genomic variables in three model organisms-Escherichia coli, S. cerevisiae, and Homo sapiens (http://webclu.bio.wzw.tum.de:18080/quagmire).
Directory of Open Access Journals (Sweden)
Suwicha Jirayucharoensak
2014-01-01
Automatic emotion recognition is one of the most challenging tasks. To detect emotion from nonstationary EEG signals, a sophisticated learning algorithm that can represent high-level abstraction is required. This study proposes the utilization of a deep learning network (DLN) to discover unknown feature correlation between input signals that is crucial for the learning task. The DLN is implemented with a stacked autoencoder (SAE) using a hierarchical feature learning approach. Input features of the network are power spectral densities of 32-channel EEG signals from 32 subjects. To alleviate the overfitting problem, principal component analysis (PCA) is applied to extract the most important components of the initial input features. Furthermore, covariate shift adaptation of the principal components is implemented to minimize the nonstationary effect of EEG signals. Experimental results show that the DLN is capable of classifying three different levels of valence and arousal with accuracy of 49.52% and 46.03%, respectively. Principal component based covariate shift adaptation enhances the respective classification accuracy by 5.55% and 6.53%. Moreover, the DLN provides better performance compared to SVM and naive Bayes classifiers.
Directory of Open Access Journals (Sweden)
Federica Censi
2012-06-01
BACKGROUND: Principal component analysis (PCA) of the T-wave has been demonstrated to quantify the dipolar and non-dipolar components of the ventricular activation, the latter reflecting repolarization heterogeneity. Accordingly, the PCA of the P-wave could help in analyzing the heterogeneous propagation of sinus impulses in the atria, which seems to predispose to fibrillation. AIM: The aim of this study is to perform the PCA of the P-wave in patients prone to atrial fibrillation (AF). METHODS: PCA is performed on P-waves extracted by an averaging technique from ECG recordings acquired using a 32-lead mapping system (2048 Hz, 24 bit, 0-400 Hz bandwidth). We extracted PCA parameters related to the dipolar and non-dipolar components of the P-wave using the first 3 eigenvalues and the cumulative percent of variance explained by the first 3 PCs (explained variance, EV). RESULTS AND CONCLUSIONS: We found that the EV associated with the low risk patients is higher than that associated with the high risk patients, and that, correspondingly, the first eigenvalue is significantly lower while the second one is significantly higher in the high risk patients with respect to the low risk group. Factor loadings showed that on average all leads contribute to the first principal component.
Fast adaptive principal component extraction based on a generalized energy function
Institute of Scientific and Technical Information of China (English)
欧阳缮; 保铮; 廖桂生
2003-01-01
By introducing an arbitrary diagonal matrix, a generalized energy function (GEF) is proposed for searching for the optimum weights of a two layer linear neural network. From the GEF, we derive a recursive least squares (RLS) algorithm to extract in parallel multiple principal components of the input covariance matrix without designing an asymmetrical circuit. The local stability of the GEF algorithm at the equilibrium is analytically verified. Simulation results show that the GEF algorithm for parallel multiple principal components extraction exhibits fast convergence and improved robustness to the eigenvalue spread of the input covariance matrix as compared to the well-known lateral inhibition model (APEX) and least mean square error reconstruction (LMSER) algorithms.
Directory of Open Access Journals (Sweden)
Haorui Liu
2016-01-01
In car control systems, it is hard to measure some key vehicle states directly and accurately while running on the road, and the cost of the measurement is high as well. To address these problems, a vehicle state estimation method based on kernel principal component analysis and the improved Elman neural network is proposed. Combining with a nonlinear vehicle model of three degrees of freedom (3 DOF: longitudinal, lateral, and yaw motion), this paper applies the method to the soft sensor of the vehicle states. The simulation results of the double lane change tested by Matlab/SIMULINK cosimulation prove the KPCA-IENN algorithm (kernel principal component algorithm and improved Elman neural network) to be quick and precise when tracking the vehicle states within the nonlinear area. This method can meet the software performance requirements of vehicle state estimation in precision, tracking speed, noise suppression, and other aspects.
Indian Academy of Sciences (India)
Anita Gharekhan; Ashok N Oza; M B Sureshkumar; Asima Pradhan; Prasanta K Panigrahi
2010-12-01
Fluorescence characteristics of human breast tissues are investigated through wavelet transform and principal component analysis (PCA). Wavelet transform of polarized fluorescence spectra of human breast tissues is found to localize spectral features that can reliably differentiate different tissue types. The emission range in the visible wavelength regime of 500–700 nm is analysed, with the excitation wavelength at 488 nm using laser as an excitation source, where flavin and porphyrin are some of the active fluorophores. A number of global and local parameters from principal component analysis of both high- and low-pass coefficients extracted in the wavelet domain, capturing spectral variations and subtle changes in the diseased tissues are clearly identifiable.
Wang, Shouyu; Jin, Ying; Yan, Keding; Xue, Liang; Liu, Fei; Li, Zhenhua
2014-11-01
Quantitative interferometric microscopy is used in biological and medical fields and a wealth of applications are proposed in order to detect different kinds of biological samples. Here, we develop a phase detecting cytometer based on quantitative interferometric microscopy with expanded principal component analysis phase retrieval method to obtain phase distributions of red blood cells with a spatial resolution ~1.5 μm. Since expanded principal component analysis method is a time-domain phase retrieval algorithm, it could avoid disadvantages of traditional frequency-domain algorithms. Additionally, the phase retrieval method realizes high-speed phase imaging from multiple microscopic interferograms captured by CCD camera when the biological cells are scanned in the field of view. We believe this method can be a powerful tool to quantitatively measure the phase distributions of different biological samples in biological and medical fields.
Directory of Open Access Journals (Sweden)
S. Roy
2013-12-01
The present investigation is an experimental approach to deposit electroless Ni-P-W coating on a mild steel substrate and find the optimum combination of tribological parameters on the basis of minimum friction and wear, using weighted principal component analysis (WPCA). In this study three main tribological parameters are chosen, viz. load (A), speed (B) and time (C). The responses are coefficient of friction and wear depth. Here the WPCA method is adopted to convert the multiple responses into a single performance index called the multiple performance index (MPI), and a Taguchi L27 orthogonal array is used to design the experiment and to find the optimum combination of tribological parameters for minimum coefficient of friction and wear depth. ANOVA is performed to find the significance of each of the tribological process parameters and their interactions. EDX analysis, SEM and XRD are performed to study the composition and structural aspects.
Obtaining a linear combination of the principal components of a matrix on quantum computers
Daskin, Ammar
2016-10-01
Principal component analysis is a multivariate statistical method frequently used in science and engineering to reduce the dimension of a problem or extract the most significant features from a dataset. In this paper, using a notion similar to quantum counting, we show how to apply amplitude amplification together with the phase estimation algorithm to an operator in order to procure the eigenvectors of the operator associated with the eigenvalues defined in the range [a, b], where a and b are real and 0 ≤ a ≤ b ≤ 1. This makes it possible to obtain a combination of the eigenvectors associated with the largest eigenvalues and so can be used to do principal component analysis on quantum computers.
Milan, S. E.; Carter, J. A.; Korth, H.; Anderson, B. J.
2015-12-01
Principal component analysis is performed on Birkeland or field-aligned current (FAC) measurements from the Active Magnetosphere and Planetary Electrodynamics Response Experiment. Principal component analysis (PCA) identifies the patterns in the FACs that respond coherently to different aspects of geomagnetic activity. The regions 1 and 2 current system is shown to be the most reproducible feature of the currents, followed by cusp currents associated with magnetic tension forces on newly reconnected field lines. The cusp currents are strongly modulated by season, indicating that their strength is regulated by the ionospheric conductance at the foot of the field lines. PCA does not identify a pattern that is clearly characteristic of a substorm current wedge. Rather, a superposed epoch analysis of the currents associated with substorms demonstrates that there is not a single mode of response, but a complicated and subtle mixture of different patterns.
Directory of Open Access Journals (Sweden)
Pengyu Gao
2016-03-01
It is difficult to forecast well productivity because of the complexity of vertical and horizontal developments in fluvial facies reservoirs. This paper proposes a method based on Principal Component Analysis and an Artificial Neural Network to predict the well productivity of fluvial facies reservoirs. The method summarizes the statistical reservoir factors and engineering factors that affect well productivity, extracts information by applying principal component analysis, and uses the neural network's ability to approximate arbitrary functions to realize an accurate and efficient prediction of fluvial facies reservoir well productivity. This method provides an effective way of forecasting the productivity of fluvial facies reservoirs, which is affected by multiple factors and complex mechanisms. The study result shows that this method is a practical, effective, accurate and indirect productivity forecast method and is suitable for field application.
Magnetic anomaly detection (MAD) of ferromagnetic pipelines using principal component analysis (PCA)
Sheinker, Arie; Moldwin, Mark B.
2016-04-01
The magnetic anomaly detection (MAD) method is used for detection of visually obscured ferromagnetic objects. The method exploits the magnetic field originating from the ferromagnetic object, which constitutes an anomaly in the ambient earth’s magnetic field. Traditionally, MAD is used to detect objects with a magnetic field of a dipole structure, where far from the object it can be considered as a point source. In the present work, we expand MAD to the case of a non-dipole source, i.e. a ferromagnetic pipeline. We use principal component analysis (PCA) to calculate the principal components, which are then employed to construct an effective detector. Experiments conducted in our lab with real-world data validate the above analysis. The simplicity, low computational complexity, and the high detection rate make the proposed detector attractive for real-time, low power applications.
Institute of Scientific and Technical Information of China (English)
Yawei Yang; Yuxin Ma; Bing Song; Hongbo Shi
2015-01-01
A novel approach named aligned mixture probabilistic principal component analysis (AMPPCA) is proposed in this study for fault detection of multimode chemical processes. In order to exploit within-mode correlations, the AMPPCA algorithm first estimates a statistical description for each operating mode by applying mixture probabilistic principal component analysis (MPPCA). As a comparison, the combined MPPCA is employed where monitoring results are softly integrated according to posterior probabilities of the test sample in each local model. For exploiting the cross-mode correlations, which may be useful but are inadvertently neglected due to separately held monitoring approaches, a global monitoring model is constructed by aligning all local models together. In this way, both within-mode and cross-mode correlations are preserved in this integrated space. Finally, the utility and feasibility of AMPPCA are demonstrated through a non-isothermal continuous stirred tank reactor and the TE benchmark process.
Wetzel, Sebastian J.
2017-08-01
We examine unsupervised machine learning techniques to learn features that best describe configurations of the two-dimensional Ising model and the three-dimensional XY model. The methods range from principal component analysis over manifold and clustering methods to artificial neural-network-based variational autoencoders. They are applied to Monte Carlo-sampled configurations and have, a priori, no knowledge about the Hamiltonian or the order parameter. We find that the most promising algorithms are principal component analysis and variational autoencoders. Their predicted latent parameters correspond to the known order parameters. The latent representations of the models in question are clustered, which makes it possible to identify phases without prior knowledge of their existence. Furthermore, we find that the reconstruction loss function can be used as a universal identifier for phase transitions.
Thai, Le Hoang; Hai, Tran Son
2011-01-01
Facial expression classification has been an interesting research problem in recent years, and many methods have been proposed to solve it. In this research, we propose a novel approach using Canny, Principal Component Analysis (PCA) and Artificial Neural Network. First, in the preprocessing phase, we use Canny for local region detection in facial images. Then the features of each local region are represented using Principal Component Analysis (PCA). Finally, an Artificial Neural Network (ANN) is applied for facial expression classification. We apply our proposed method (Canny_PCA_ANN) to the recognition of six basic facial expressions on the JAFFE database, which consists of 213 images posed by 10 Japanese female models. The experimental results show the feasibility of the proposed method.
Devi, Seema; Panigrahi, Prasanta K.; Pradhan, Asima
2014-12-01
Intrinsic fluorescence spectra of the human normal, cervical intraepithelial neoplasia 1 (CIN1), CIN2, and cervical cancer tissue have been extracted by effectively combining the measured polarized fluorescence and polarized elastic scattering spectra. The efficacy of principal component analysis (PCA) to disentangle the collective behavior from smaller correlated clusters in a dimensionally reduced space in conjunction with the intrinsic fluorescence is examined. This combination unambiguously reveals the biochemical changes occurring with the progression of the disease. The differing activities of the dominant fluorophores, collagen, nicotinamide adenine dinucleotide, flavins, and porphyrin of different grades of precancers are clearly identified through a careful examination of the sectorial behavior of the dominant eigenvectors of PCA. To further classify the different grades, the Mahalanobis distance has been calculated using the scores of selected principal components.
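The final classification step mentioned above, a Mahalanobis distance computed on principal component scores, can be illustrated with a minimal NumPy sketch. The function name `mahalanobis_to_group` and the toy score values are assumptions made for illustration; the abstract does not state which components were selected or how the reference groups were defined.

```python
import numpy as np

def mahalanobis_to_group(scores, group_scores):
    """Mahalanobis distance of each row of `scores` to a reference group.

    Both arguments hold principal component scores (rows = samples,
    columns = selected components). Hypothetical helper, not the authors' code.
    """
    scores = np.atleast_2d(np.asarray(scores, dtype=float))
    mean = group_scores.mean(axis=0)
    cov = np.cov(group_scores, rowvar=False)   # sample covariance of the group
    cov_inv = np.linalg.inv(cov)
    diff = scores - mean
    # d_i^2 = diff_i^T  cov^{-1}  diff_i, evaluated for every row i
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(d2)

# Toy reference group in a 2-component score space (made-up numbers)
group = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
distances = mahalanobis_to_group([[2.0, 0.0], [0.0, 0.0]], group)
```

A sample would then be assigned to the grade whose reference group gives the smallest distance, under the assumption that each grade has enough samples for a stable covariance estimate.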
Kernel principal component and maximum autocorrelation factor analyses for change detection
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Canty, Morton John
2009-01-01
Principal component analysis (PCA) has often been used to detect change over time in remotely sensed images. A commonly used technique consists of finding the projections along the eigenvectors for data consisting of pair-wise (perhaps generalized) differences between corresponding spectral bands covering the same geographical region acquired at two different time points. In this paper kernel versions of the principal component and maximum autocorrelation factor (MAF) transformations are used to carry out the analysis. An example is based on bi-temporal Landsat-5 TM imagery over irrigation fields in Nevada acquired on successive passes of the Landsat-5 satellite in August-September 1991. The six-band images (the thermal band is omitted) with 1,000 by 1,000 28.5 m pixels were first processed with the iteratively re-weighted MAD (IR-MAD) algorithm in order to discriminate change. Then the MAD image......
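To make the idea of kernel PCA on band differences concrete, here is a minimal sketch using a Gaussian (RBF) kernel. The kernel choice, the `gamma` value, the function name `rbf_kernel_pca`, and the synthetic six-band "pixels" are all assumptions for illustration; the abstract does not specify the kernel or its parameters, and the MAF and IR-MAD steps are not shown.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Kernel PCA with an RBF kernel; returns (scores, eigenvalues)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Centre the kernel matrix, i.e. centre the data in feature space
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)                  # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]    # keep the largest ones
    vals, vecs = vals[order], vecs[:, order]
    scores = vecs * np.sqrt(np.maximum(vals, 0.0))   # projections of the samples
    return scores, vals

# Synthetic stand-in for two acquisitions of 40 pixels in six bands
rng = np.random.default_rng(0)
date1 = rng.normal(size=(40, 6))
date2 = date1 + 0.1 * rng.normal(size=(40, 6))
date2[:5] += 3.0                                     # five "changed" pixels
diff = date2 - date1                                 # band-wise differences
scores, vals = rbf_kernel_pca(diff, n_components=3, gamma=1.0 / 6.0)
```

For a full 1,000 by 1,000 image the kernel matrix would be far too large to form directly, so in practice the eigen-analysis is usually trained on a subsample of pixels; that step is omitted here.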
Principal Component Analysis and Cluster Analysis in Profile of Electrical System
Iswan; Garniwa, I.
2017-03-01
This paper presents an approach for profiling an electrical system that combines two algorithms, namely principal component analysis (PCA) and cluster analysis, based on relevant data on gross regional domestic product and on electric power and energy use. The profile is set up to show the condition of the region's electrical system and will be used to inform policy on the future spatial development of the electrical system. The paper considers 24 regions in South Sulawesi province as profile center points and uses PCA to assess the regional profile for development. Cluster analysis is used to group these regions into a few clusters according to the new variables produced by PCA. The general planning of the electrical system of South Sulawesi province can provide support for policy making on electrical system development. Future research could add further variables to the existing ones.
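The two-stage procedure described above (PCA to form new variables, then cluster analysis on those variables) can be sketched as follows. The abstract does not name the clustering algorithm, so k-means with a deterministic farthest-first initialisation is used here purely as an assumption, and the 24-by-4 data matrix is synthetic rather than the South Sulawesi data.

```python
import numpy as np

def pca_scores(X, k):
    """Standardise the variables (they have different units) and project on k PCs."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:k].T

def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations with farthest-first initial centres."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])           # farthest point so far
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                    # nearest centre per point
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in: 24 regions, 4 economic/energy indicators, two regimes
rng = np.random.default_rng(2)
low = rng.normal(loc=0.0, scale=0.5, size=(12, 4))
high = rng.normal(loc=10.0, scale=0.5, size=(12, 4))
X = np.vstack([low, high])

labels, centers = kmeans(pca_scores(X, k=2), k=2)
```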
Nonlinear real-life signal detection with a supervised principal components analysis.
Zhou, C T; Cai, T X; Cai, T F
2007-03-01
A novel strategy named supervised principal components analysis for the detection of a target signal of interest embedded in an unknown noisy environment has been investigated. There are two channels in our detection scheme. Each channel consists of a nonlinear phase-space reconstructor (for embedding a data matrix using the received time series) and a principal components analyzer (for feature extraction), respectively. The output error time series, which results from the difference of both eigenvectors of the correlation data matrices from these two channels, is then analyzed using time-frequency tools, for example, frequency spectrum or Wigner-Ville distribution. Experimental results based on real-life electromagnetic data are presented to demonstrate the detection performance of our algorithm. It is found that weak signals hidden beneath the noise floor can be detected. Furthermore, the robustness of the detection performance clearly illustrated that signal frequencies can be extracted when the signal power is not too low.
Online signature recognition using principal component analysis and artificial neural network
Hwang, Seung-Jun; Park, Seung-Je; Baek, Joong-Hwan
2016-12-01
In this paper, we propose an algorithm for on-line signature recognition using the fingertip position in the air, obtained from the depth image acquired by Kinect. We extract 10 statistical features from each of the X, Y, and Z axes, which are invariant to shifting and scaling of the signature trajectories in three-dimensional space. An artificial neural network is adopted to solve the complex signature classification problem. The 30-dimensional features are converted into 10 principal components using principal component analysis, which account for 99.02% of the total variance. We implement the proposed algorithm and test it on actual on-line signatures. In the experiment, we verify that the proposed method successfully classifies 15 different on-line signatures. The experimental results show a recognition rate of 98.47% when using only 10 feature vectors.
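The dimensionality reduction step described above (30 features reduced to 10 principal components, with the retained share of variance reported) can be sketched with a singular value decomposition. The function name `pca_reduce` and the synthetic feature matrix are assumptions; the actual signature features are not available here, so the retained variance of this toy example will not match the 99.02% reported in the abstract.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the leading principal components.

    Returns (scores, ratio), where ratio[i] is the fraction of total
    variance explained by retained component i.
    """
    Xc = X - X.mean(axis=0)                      # centre each feature
    # Rows of Vt are the principal axes of the centred data
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s**2 / (X.shape[0] - 1)                # variance along each axis
    ratio = var / var.sum()
    scores = Xc @ Vt[:n_components].T            # coordinates in PC space
    return scores, ratio[:n_components]

# Synthetic stand-in for 200 signatures with 30 features each
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 10))
mixing = rng.normal(size=(10, 30))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 30))

scores, ratio = pca_reduce(X, n_components=10)
retained = ratio.sum()                           # share of variance kept
```

The `scores` matrix would then be the input to the classifier; the neural network itself is not sketched here.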
Bio-inspired controller for a dexterous prosthetic hand based on Principal Components Analysis.
Matrone, G; Cipriani, C; Secco, E L; Carrozza, M C; Magenes, G
2009-01-01
Controlling a dexterous myoelectric prosthetic hand with many degrees of freedom (DoFs) can be a very demanding task, which requires high concentration from the amputee and the ability to modulate many different muscular contraction signals. In this work a new approach to multi-DoF control is proposed, which makes use of Principal Component Analysis (PCA) to reduce the dimensionality of the DoF space and allows a 15-DoF hand to be driven by a 2-DoF signal. This approach has been tested and adapted to work on the underactuated robotic hand named CyberHand, using mouse cursor coordinates as input signals and a principal components (PCs) matrix taken from the literature. First trials show the feasibility of performing grasps using this method. Further tests with real EMG signals are foreseen.
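The mapping from a 2-DoF input to a 15-DoF posture can be written as the mean posture plus a linear combination of the first two principal components. The sketch below assumes that form; the PC matrix is estimated from synthetic postures rather than taken from the literature as in the abstract, and `posture_from_input` is a hypothetical name.

```python
import numpy as np

# Synthetic stand-in for recorded hand postures: 100 samples of 15 joint angles
rng = np.random.default_rng(1)
postures = (rng.normal(size=(100, 2)) @ rng.normal(size=(2, 15))
            + 0.01 * rng.normal(size=(100, 15)))

mean_posture = postures.mean(axis=0)
_, _, Vt = np.linalg.svd(postures - mean_posture, full_matrices=False)
pcs = Vt[:2].T                       # 15 x 2 matrix, orthonormal columns

def posture_from_input(u, mean_posture, pcs):
    """Expand a 2-DoF control input u into a full 15-DoF joint posture."""
    return mean_posture + pcs @ np.asarray(u, dtype=float)

u = np.array([0.3, -0.1])            # e.g. scaled mouse cursor coordinates
q = posture_from_input(u, mean_posture, pcs)
```

Because the columns of `pcs` are orthonormal, projecting `q - mean_posture` back onto them recovers `u`, which is a convenient consistency check for such a controller.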
Principal component analysis-based inversion of effective temperatures for late-type stars
Paletou, F; Houdebine, E R; Watson, V
2015-01-01
We show how the range of application of the principal component analysis-based inversion method of Paletou et al. (2015) can be extended to data from late-type stars. Besides extending the method beyond its original application domain, FGK stars, we also use synthetic spectra for our learning database. We discuss our results on effective temperatures against previous evaluations made available from Vizier and Simbad services at CDS.
Extracting quantum dynamics from genetic learning algorithms through principal component analysis
White, J L; Bucksbaum, P H
2004-01-01
Genetic learning algorithms are widely used to control ultrafast optical pulse shapes for photo-induced quantum control of atoms and molecules. An outstanding issue is how to use the solutions found by these algorithms to learn about the system's quantum dynamics. We propose a simple method based on principal component analysis of the control space, which can reveal the degrees of freedom responsible for control, and aid in the construction of an effective Hamiltonian for the dynamics.
An application of principal component analysis to the clavicle and clavicle fixation devices.
LENUS (Irish Health Repository)
Daruwalla, Zubin J
2010-01-01
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories.
Friesen, Christine Elizabeth; Seliske, Patrick; Papadopoulos, Andrew
2016-01-01
Objectives. Socioeconomic status (SES) is a comprehensive indicator of health status and is useful in area-level health research and informing public health resource allocation. Principal component analysis (PCA) is a useful tool for developing SES indices to identify area-level disparities in SES within communities. While SES research in Canada has relied on census data, the voluntary nature of the 2011 National Household Survey challenges the validity of its data, especially income variables. This study sought to determine the appropriateness of replacing census income information with tax filer data in neighbourhood SES index development. Methods. Census and taxfiler data for Guelph, Ontario were retrieved for the years 2005, 2006, and 2011. Data were extracted for eleven income and non-income SES variables. PCA was employed to identify significant principal components from each dataset and weights of each contributing variable. Variable-specific factor scores were applied to standardized census and taxfiler data values to produce SES scores. Results. The substitution of taxfiler income variables for census income variables yielded SES score distributions and neighbourhood SES classifications that were similar to SES scores calculated using entirely census variables. Combining taxfiler income variables with census non-income variables also produced clearer SES level distinctions. Internal validation procedures indicated that utilizing multiple principal components produced clearer SES level distinctions than using only the first principal component. Conclusion. Identifying socioeconomic disparities between neighbourhoods is an important step in assessing the level of disadvantage of communities. The ability to replace census income information with taxfiler data to develop SES indices expands the versatility of public health research and planning in Canada, as more data sources can be explored. The apparent usefulness of PCA also contributes to the improvement
Yousefi, Fakhri; Karimi, Hajir; Mohammadiyan, Somayeh
2016-11-01
This paper applies a model combining a back-propagation network (BPN) and principal component analysis (PCA) to estimate the effective viscosity of carbon nanotube suspensions. The effective viscosities of multiwall carbon nanotube suspensions are examined as a function of the temperature, nanoparticle volume fraction, effective length of the nanoparticles, and viscosity of the base fluids using an artificial neural network. The results obtained by the BPN-PCA model are in good agreement with the experimental data.
Yan, Daikang; Gades, Lisa; Jacobsen, Chris; Madden, Timothy; Miceli, Antonino
2016-01-01
We present a method using principal component analysis (PCA) to process x-ray pulses with severe shape variation where traditional optimal filter methods fail. We demonstrate that PCA is able to noise-filter and extract energy information from x-ray pulses despite their different shapes. We apply this method to a dataset from an x-ray thermal kinetic inductance detector which has severe pulse shape variation arising from position-dependent absorption.
DEFF Research Database (Denmark)
Rasmussen, Peter Mondrup; Abrahamsen, Trine Julie; Madsen, Kristoffer Hougaard
2012-01-01
We investigate the use of kernel principal component analysis (PCA) and the inverse problem known as pre-image estimation in neuroimaging: i) We explore kernel PCA and pre-image estimation as a means for image denoising as part of the image preprocessing pipeline. Evaluation of the denoising...... base these illustrations on two fMRI BOLD data sets — one from a simple finger tapping experiment and the other from an experiment on object recognition in the ventral temporal lobe....
Directory of Open Access Journals (Sweden)
Nop Sopipan
2013-01-01
The aim of this study was to forecast the returns for the Stock Exchange of Thailand (SET) Index by adding some explanatory variables and a stationary autoregressive term of order p (AR(p)) to the mean equation of returns. In addition, we used Principal Component Analysis (PCA) to remove possible complications caused by multicollinearity. Results showed that the multiple regression based on PCA had the best performance.
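Using PCA to sidestep multicollinearity in a regression is commonly done by regressing on the leading principal component scores, which are uncorrelated by construction. The sketch below shows that generic principal component regression on synthetic, nearly collinear predictors; it is an assumption about the general technique, not the authors' model, and the AR(p) term and the SET data are not included.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression keeping the k leading components.

    Returns (beta, intercept) expressed in the original predictor space.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                                  # leading principal axes
    Z = Xc @ V                                    # uncorrelated PC scores
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = V @ gamma                              # back to original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Synthetic predictors: x2 is almost an exact copy of x1 (multicollinearity)
rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + x3

beta, intercept = pcr_fit(X, y, k=2)              # drop the near-null direction
pred = X @ beta + intercept
```

Dropping the smallest component removes the direction along which `x1` and `x2` cannot be told apart, so the fitted coefficients stay stable instead of blowing up as they can in ordinary least squares.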
Generalized multilevel function-on-scalar regression and principal component analysis.
Goldsmith, Jeff; Zipunnikov, Vadim; Schrack, Jennifer
2015-06-01
This manuscript considers regression models for generalized, multilevel functional responses: functions are generalized in that they follow an exponential family distribution and multilevel in that they are clustered within groups or subjects. This data structure is increasingly common across scientific domains and is exemplified by our motivating example, in which binary curves indicating physical activity or inactivity are observed for nearly 600 subjects over 5 days. We use a generalized linear model to incorporate scalar covariates into the mean structure, and decompose subject-specific and subject-day-specific deviations using multilevel functional principal components analysis. Thus, functional fixed effects are estimated while accounting for within-function and within-subject correlations, and major directions of variability within and between subjects are identified. Fixed effect coefficient functions and principal component basis functions are estimated using penalized splines; model parameters are estimated in a Bayesian framework using Stan, a programming language that implements a Hamiltonian Monte Carlo sampler. Simulations designed to mimic the application have good estimation and inferential properties with reasonable computation times for moderate datasets, in both cross-sectional and multilevel scenarios; code is publicly available. In the application we identify effects of age and BMI on the time-specific change in probability of being active over a 24-hour period; in addition, the principal components analysis identifies the patterns of activity that distinguish subjects and days within subjects.
Principal component analysis of Raman spectra for TiO2 nanoparticle characterization
Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion
2017-09-01
The Raman spectra of anatase/rutile mixed phases of Sn-doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, were simultaneously processed with self-written software that applies Principal Component Analysis (PCA) to the measured spectra to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensitive to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of the covariance of the descriptive data, to automatically analyse the sample's measured Raman spectrum, and to infer the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of the loadings of the principal components provides information on the way the vibrational modes are affected by the nanoparticle features and on the spectral area relevant for the classification.
Institute of Scientific and Technical Information of China (English)
Xinguang Wang; Nicholas O'Dwyer; Mark Halaki
2013-01-01
Walking is a complex task in which hundreds of muscles, bones and joints work together to deliver smooth movements. Because of this complexity, walking has been widely investigated in order to identify the pattern of multi-segment movement and reveal the control mechanism. The degrees of freedom and dimensional properties provide a view of the coordinative structure during walking, which has been extensively studied using dimension reduction techniques. In this paper, the studies related to the coordinative structure, dimension detection and pattern reorganization during walking are reviewed. Principal component analysis, as a popular technique, is widely used in the processing of human movement data. Both the principle and the outcomes of principal component analysis are introduced in this paper. This technique has been reported to successfully reduce the redundancy within the original data, identify the physical meaning represented by the extracted principal components and discriminate between different patterns. The coordinative structure during walking assessed by this technique could provide further information on the body control mechanism and correlate walking pattern with injury.
[Content of mineral elements of Gastrodia elata by principal components analysis].
Li, Jin-ling; Zhao, Zhi; Liu, Hong-chang; Luo, Chun-li; Huang, Ming-jin; Luo, Fu-lai; Wang, Hua-lei
2015-03-01
To study the content of mineral elements and the principal components in Gastrodia elata, mineral elements were determined by ICP and the data were analyzed with SPSS. K had the highest content, with an average of 15.31 g·kg⁻¹, followed by N, with an average of 8.99 g·kg⁻¹. The coefficients of variation of K and N were small, whereas that of Mn was the largest, at 51.39%. Highly significant positive correlations were found among N, P and K. Three principal components were selected by principal components analysis to evaluate the quality of G. elata. P, B, N, K, Cu, Mn, Fe and Mg were the characteristic elements of G. elata. The contents of K and N were higher and relatively stable, and the variation of the Mn content was the largest. From the perspective of mineral elements, the quality of G. elata in Guizhou and Yunnan was better.
Zou, Ling; Zhang, Yingchun; Yang, Laurence T; Zhou, Renlai
2010-02-01
The authors have developed a new approach by combining the wavelet denoising and principal component analysis methods to reduce the number of required trials for efficient extraction of brain evoked-related potentials (ERPs). Evoked-related potentials were initially extracted using wavelet denoising to enhance the signal-to-noise ratio of raw EEG measurements. Principal components of ERPs accounting for 80% of the total variance were extracted as part of the subspace of the ERPs. Finally, the ERPs were reconstructed from the selected principal components. Computer simulation results showed that the combined approach provided estimations with higher signal-to-noise ratio and lower root mean squared error than each of them alone. The authors further tested this proposed approach in single-trial ERPs extraction during an emotional process and brain responses analysis to emotional stimuli. The experimental results also demonstrated the effectiveness of this combined approach in ERPs extraction and further supported the view that emotional stimuli are processed more intensely.
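The PCA stage described above (keep the components that account for 80% of the total variance, then reconstruct the signals from them) can be sketched as follows. Only that stage is shown: the wavelet denoising that precedes it in the abstract is omitted, the trials are synthetic, and `reconstruct_with_variance` is a hypothetical name.

```python
import numpy as np

def reconstruct_with_variance(X, threshold=0.80):
    """Rebuild X from the fewest PCs whose cumulative variance reaches threshold.

    Rows of X are trials, columns are time samples. Returns (X_hat, k).
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)          # cumulative variance share
    k = int(np.searchsorted(cum, threshold)) + 1  # smallest k with cum[k-1] >= threshold
    k = min(k, len(s))
    X_hat = (U[:, :k] * s[:k]) @ Vt[:k] + mean    # rank-k reconstruction
    return X_hat, k

# Synthetic trials: one waveform with trial-to-trial amplitude changes plus noise
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 100)
template = np.sin(2.0 * np.pi * t)
amp = rng.normal(1.0, 1.0, size=50)
clean = amp[:, None] * template
noisy = clean + 0.1 * rng.normal(size=(50, 100))

X_hat, k = reconstruct_with_variance(noisy, threshold=0.80)
err_raw = np.mean((noisy - clean) ** 2)           # error before PCA filtering
err_hat = np.mean((X_hat - clean) ** 2)           # error after PCA filtering
```

In this toy setting the discarded components contain mostly noise, which is why the reconstruction is closer to the clean signal than the raw trials are.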
Strale, Mathieu; Krysinska, Karolina; Overmeiren, Gaëtan Van; Andriessen, Karl
2017-06-01
This study investigated the geographic distribution of suicide and railway suicide in Belgium over 2008–2013 at the local (i.e., district or arrondissement) level. There were differences in the regional distribution of suicide and railway suicides in Belgium over the study period. Principal component analysis identified three groups of correlations among population variables and socio-economic indicators, such as population density, unemployment, and age group distribution, on two components that helped explain the variance of railway suicide at the local (arrondissement) level. This information is of particular importance for preventing suicides in high-risk areas on the Belgian railway network.
Batch process monitoring based on multiple-phase online sorting principal component analysis.
Lv, Zhaomin; Yan, Xuefeng; Jiang, Qingchao
2016-09-01
Existing phase-based batch or fed-batch process monitoring strategies generally have two problems: (1) the phase number, which is difficult to determine, and (2) the uneven length of the data. In this study, a multiple-phase online sorting principal component analysis modeling strategy (MPOSPCA) is proposed to monitor multiple-phase batch processes online. Based on all batches of off-line normal data, a new multiple-phase partition algorithm is proposed, where k-means and a defined average Euclidean radius are employed to determine the multiple-phase data set and phase number. Principal component analysis is then applied to build the model in each phase, and all the components are retained. In online monitoring, the Euclidean distance is used to select the monitoring model. All the components undergo online sorting through a parameter defined by Bayesian inference (BI). The first several components are retained to calculate the T² statistic. Finally, the respective probability indices of [Formula: see text] are obtained using BI as the moving average strategy. The feasibility and effectiveness of MPOSPCA are demonstrated through a simple numerical example and the fed-batch penicillin fermentation process.
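For readers unfamiliar with the monitoring statistic mentioned above, the sketch below computes a standard Hotelling T² from the leading principal components of normal operating data. It is a generic illustration under assumed names (`fit_pca`, `hotelling_t2`) and synthetic data; the phase partition, the online sorting of components, and the Bayesian inference steps of MPOSPCA are not reproduced.

```python
import numpy as np

def fit_pca(X, k):
    """Fit a k-component PCA model on normal data; returns (mean, P, eigvals)."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = s[:k] ** 2 / (X.shape[0] - 1)       # variance of each retained PC
    return mean, Vt[:k].T, eigvals

def hotelling_t2(x, mean, P, eigvals):
    """T^2 = sum_j t_j^2 / lambda_j for each sample, with scores t = (x - mean) P."""
    t = (np.atleast_2d(x) - mean) @ P
    return np.sum(t**2 / eigvals, axis=1)

# Synthetic "normal operation" data: 200 samples of 5 correlated variables
rng = np.random.default_rng(5)
n, k = 200, 3
X = rng.normal(size=(n, 5)) @ rng.normal(size=(5, 5))

mean, P, eigvals = fit_pca(X, k)
t2_train = hotelling_t2(X, mean, P, eigvals)

# A sample displaced by 10 standard deviations along the first component
outlier = mean + 10.0 * np.sqrt(eigvals[0]) * P[:, 0]
t2_outlier = hotelling_t2(outlier, mean, P, eigvals)[0]
```

A new sample is flagged when its T² exceeds a control limit; how that limit is chosen (for example from an F distribution or empirically) is left out of this sketch.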
Principal component analysis of PiB distribution in Parkinson and Alzheimer diseases.
Campbell, Meghan C; Markham, Joanne; Flores, Hubert; Hartlein, Johanna M; Goate, Alison M; Cairns, Nigel J; Videen, Tom O; Perlmutter, Joel S
2013-08-06
To use principal component analyses (PCA) of Pittsburgh compound B (PiB) PET imaging to determine whether the pattern of in vivo β-amyloid (Aβ) in Parkinson disease (PD) with cognitive impairment is similar to the pattern found in symptomatic Alzheimer disease (AD). PiB PET scans were obtained from participants with PD with cognitive impairment (n = 53), participants with symptomatic AD (n = 35), and age-matched controls (n = 67). All were assessed using the Clinical Dementia Rating and APOE genotype was determined in 137 participants. PCA was used to (1) determine the PiB binding pattern in AD, (2) determine a possible unique PD pattern, and (3) directly compare the PiB binding patterns in PD and AD groups. The first 2 principal components (PC1 and PC2) significantly separated the AD and control participants (p < 0.001). Participants with PD with cognitive impairment also were significantly different from participants with symptomatic AD on both components (p < 0.001). However, there was no difference between PD and controls on either component. Even those participants with PD with elevated mean cortical binding potentials were significantly different from participants with AD on both components. Using PCA, we demonstrated that participants with PD with cognitive impairment do not exhibit the same PiB binding pattern as participants with AD. These data suggest that Aβ deposition may play a different pathophysiologic role in the cognitive impairment of PD compared to that in AD.
Institute of Scientific and Technical Information of China (English)
GUO Qintao; ZHANG Lingmi; TAO Zheng
2008-01-01
Thin-walled components are used to absorb the impact energy of a structure. However, the dynamic behavior of such thin-walled structures is highly non-linear, with material, geometric and boundary non-linearities. A model updating and validation procedure is proposed to build an accurate finite element (FE) model of a frame structure with a non-linear thin-walled component for dynamic analysis. Design of experiments (DOE) and a principal component decomposition (PCD) approach are applied to extract dynamic features from the non-linear impact response in order to correlate the impact test results with the FE model of the non-linear structure. A strain-rate-dependent non-linear model updating method is then developed to build an accurate FE model of the structure. Computer simulation and a real frame structure with a highly non-linear thin-walled component are employed to demonstrate the feasibility and effectiveness of the proposed approach.
Algorithms for Sparse Non-negative Tucker Decompositions
DEFF Research Database (Denmark)
Mørup, Morten; Hansen, Lars Kai
2008-01-01
for Tucker decompositions when indeed the data and interactions can be considered non-negative. We further illustrate how sparse coding can help identify what model (PARAFAC or Tucker) is the most appropriate for the data as well as to select the number of components by turning off excess components...
The difference between 5 x 5 doubly nonnegative and completely positive matrices
Burer, Samuel; Anstreicher, Kurt M.; Duer, Mirjam
2009-01-01
The convex cone of n x n completely positive (CP) matrices and its dual cone of copositive matrices arise in several areas of applied mathematics, including optimization. Every CP matrix is doubly nonnegative (DNN), i.e., positive semidefinite and component-wise nonnegative, and it is known that, fo
Hurley, Peter D; Farrah, Duncan; Wang, Lingyu; Efstathiou, Andreas
2012-01-01
The mid-infrared spectra of ultraluminous infrared galaxies (ULIRGs) contain a variety of spectral features that can be used as diagnostics to characterise the spectra. However, such diagnostics are biased by our prior prejudices on the origin of the features. Moreover, by using only part of the spectrum they do not utilise the full information content of the spectra. Blind statistical techniques such as principal component analysis (PCA) consider the whole spectrum, find correlated features and separate them out into distinct components. We further investigate the principal components (PCs) of ULIRGs derived in Wang et al. (2011). We quantitatively show that five PCs are optimal for describing the IRS spectra. These five components (PC1-PC5) and the mean spectrum provide a template basis set that reproduces spectra of all z<0.35 ULIRGs within the noise. For comparison, the spectra are also modelled with a combination of radiative transfer models of both starbursts and the dusty torus surrounding active gala...
PCA of PCA: Principal Component Analysis of Partial Covering Absorption in NGC 1365
Parker, M L; Fabian, A C; Risaliti, G
2014-01-01
We analyse 400 ks of XMM-Newton data on the active galactic nucleus NGC 1365 using principal component analysis (PCA) to identify model-independent spectral components. We find two significant components and demonstrate that they are qualitatively different from those found in MCG-6-30-15 using the same method. As the variability in NGC 1365 is known to be due to changes in the parameters of a partial covering neutral absorber, this shows that the same mechanism cannot be the driver of variability in MCG-6-30-15. By examining intervals where the spectrum shows relatively low absorption we separate the effects of intrinsic source variability, including signatures of relativistic reflection, from variations in the intervening absorption. We simulate the principal components produced by different physical variations, and show that PCA provides a clear distinction between absorption and reflection as the drivers of variability in AGN spectra. The simulations are shown to reproduce the PCA spectra of both NGC 1365...
Principal component analysis of dynamic fluorescence images for diagnosis of diabetic vasculopathy
Seo, Jihye; An, Yuri; Lee, Jungsul; Ku, Taeyun; Kang, Yujung; Ahn, Chulwoo; Choi, Chulhee
2016-04-01
Indocyanine green (ICG) fluorescence imaging has been clinically used for noninvasive visualizations of vascular structures. We have previously developed a diagnostic system based on dynamic ICG fluorescence imaging for sensitive detection of vascular disorders. However, because high-dimensional raw data were used, the analysis of the ICG dynamics proved difficult. We used principal component analysis (PCA) in this study to extract important elements without significant loss of information. We examined ICG spatiotemporal profiles and identified critical features related to vascular disorders. PCA time courses of the first three components showed a distinct pattern in diabetic patients. Among the major components, the second principal component (PC2) represented arterial-like features. The explained variance of PC2 in diabetic patients was significantly lower than in normal controls. To visualize the spatial pattern of PCs, pixels were mapped with red, green, and blue channels. The PC2 score showed an inverse pattern between normal controls and diabetic patients. We propose that PC2 can be used as a representative bioimaging marker for the screening of vascular diseases. It may also be useful in simple extractions of arterial-like features.
Directory of Open Access Journals (Sweden)
Peterson Mark D
2012-11-01
Background: The purpose of this study was to determine the sex-specific pattern of pediatric cardiometabolic risk with principal component analysis, using several biological, behavioral and parental variables in a large cohort (n = 2866) of 6th grade students. Methods: Cardiometabolic risk components included waist circumference, fasting glucose, blood pressure, plasma triglyceride levels and HDL-cholesterol. Principal components analysis was used to determine the pattern of risk clustering and to derive a continuous aggregate score (MetScore). Stratified risk components and MetScore were analyzed for association with age, body mass index (BMI), cardiorespiratory fitness (CRF), physical activity (PA), and parental factors. Results: In both boys and girls, BMI and CRF were associated with multiple risk components, and overall MetScore. Maternal smoking was associated with multiple risk components in girls and boys, as well as MetScore in boys, even after controlling for children’s BMI. Paternal family history of early cardiovascular disease (CVD) and parental age were associated with increased blood pressure and MetScore for girls. Children’s PA levels, maternal history of early CVD, and paternal BMI were also indicative for various risk components, but not MetScore. Conclusions: Several biological and behavioral factors were independently associated with children’s cardiometabolic disease risk, and thus represent a unique gender-specific risk profile. These data serve to bolster the independent contribution of CRF, PA, and family-oriented healthy lifestyles for improving children’s health.
Change of climate pattern in the Baltic States using principal component analysis
Bethere, Liga; Sennikovs, Juris; Bethers, Uldis
2017-04-01
The aim of this work is to compare the climate of past and future in the Baltic States. The regional climate model (RCM) data from project ENSEMBLES was used and bias correction procedure was carried out. Monthly average temperature and monthly total precipitation values were chosen as the variables that best capture the climate features important for the society. In the first part of our work we used principal component analysis (PCA) on data for years 1961-1990 to reduce the number of initial climate variables and create indices that represent the main features of the climate in the Baltic States. Standardization of variables was done using a modified approach. The first three principal components explained most of the variation in the initial variables and were analyzed further. We calculated the correlation coefficients between the retained principal components and initial variables, plotted them for the study region and compared the spatial patterns with the climate features reported in literature. It could be observed that the first component (PC1) is highly positively correlated with the temperature and precipitation in winter, which means that high values of PC1 correspond to warm winters with a lot of snow. Also PC1 values have east-west gradient with warmer winters at the shores of the Baltic Sea. PC1 values are also similar to the start date of the winter reported in literature. The second principal component (PC2) has a strong negative correlation with the autumn precipitation and shows a significant positive correlation with all temperature variables. This means that high values of PC2 correspond to a year that is warmer than average and to years with dry autumns. The PC2 pattern is similar to the spatial distribution of the start of the spring and summer phenological events and growing degree day values. The changes to PC2 therefore imply possible changes in the plant suitability for a specific region. The third principal component (PC3) is mainly
Hu, Chen; Ho, Luis C; Ferland, Gary J; Baldwin, Jack A; Wang, Ye
2012-01-01
We report on a spectral principal component analysis (SPCA) of a sample of 816 quasars, selected to have small Fe II velocity shifts with spectral coverage in the rest wavelength range 3500-5500 Å. The sample is explicitly designed to mitigate spurious effects on SPCA induced by Fe II velocity shifts. We improve the algorithm of SPCA in the literature and introduce a new quantity, the fractional-contribution spectrum, that effectively identifies the emission features encoded in each eigenspectrum. The first eigenspectrum clearly records the power-law continuum and very broad Balmer emission lines. Narrow emission lines dominate the second eigenspectrum. The third eigenspectrum represents the Fe II emission and a component of the Balmer lines with kinematically similar intermediate velocity widths. Correlations between the weights of the eigenspectra and parametric measurements of line strength and continuum slope confirm the above interpretation for the eigenspectra. Monte Carlo simulations demonstr...
Bai, Libing; Gao, Bin; Tian, Shulin; Cheng, Yuhua; Chen, Yifan; Tian, Gui Yun; Woo, W. L.
2013-10-01
Eddy Current Pulsed Thermography (ECPT), an emerging Non-Destructive Testing and Evaluation technique, has been applied to a wide range of materials. Lateral heat diffusion reduces the temperature contrast between defective and defect-free areas. To enhance the flaw contrast, different statistical methods, such as Principal Component Analysis and Independent Component Analysis, have been proposed in recent years for processing thermography image sequences. However, there is a lack of direct and detailed independent comparisons of the two algorithm implementations. The aim of this article is to compare the two methods and to determine the better technique for flaw contrast enhancement in ECPT data. Verification experiments are conducted on the detection of artificial cracks and natural thermal fatigue cracks.
Zupancic, Gregor
2003-10-01
A method was developed for dynamic spectrophotometric measurements in vivo in the presence of non-specific spectral changes due to external disturbances. This method was used to measure changes in mitochondrial respiratory pigment redox states in photoreceptor cells of live, white-eyed mutants of the blowfly Calliphora vicina. The changes were brought about by exchanging the atmosphere around an immobilised animal from air to N2 and back again by a rapid gas exchange system. During an experiment reflectance spectra were measured by a linear CCD array spectrophotometer. This method involves the pre-processing steps of difference spectra calculation and digital filtering in one and two dimensions. These were followed by time-domain principal component analysis (PCA). PCA yielded seven significant time domain principal component vectors and seven corresponding spectral score vectors. In addition, through PCA we also obtained a time course of changes common to all wavelengths-the residual vector, corresponding to non-specific spectral changes due to preparation movement or mitochondrial swelling. In the final step the redox state time courses were obtained by fitting linear combinations of respiratory pigment difference spectra to each of the seven score vectors. The resulting matrix of factors was then multiplied by the matrix of seven principal component vectors to yield the time courses of respiratory pigment redox states. The method can be used, with minor modifications, in many cases of time-resolved optical measurements of multiple overlapping spectral components, especially in situations where non-specific external influences cannot be disregarded.
Principal components analysis of reward prediction errors in a reinforcement learning task.
Sambrook, Thomas D; Goslin, Jeremy
2016-01-01
Models of reinforcement learning represent reward and punishment in terms of reward prediction errors (RPEs), quantitative signed terms describing the degree to which outcomes are better than expected (positive RPEs) or worse (negative RPEs). An electrophysiological component known as feedback related negativity (FRN) occurs at frontocentral sites 240-340ms after feedback on whether a reward or punishment is obtained, and has been claimed to neurally encode an RPE. An outstanding question however, is whether the FRN is sensitive to the size of both positive RPEs and negative RPEs. Previous attempts to answer this question have examined the simple effects of RPE size for positive RPEs and negative RPEs separately. However, this methodology can be compromised by overlap from components coding for unsigned prediction error size, or "salience", which are sensitive to the absolute size of a prediction error but not its valence. In our study, positive and negative RPEs were parametrically modulated using both reward likelihood and magnitude, with principal components analysis used to separate out overlying components. This revealed a single RPE encoding component responsive to the size of positive RPEs, peaking at ~330ms, and occupying the delta frequency band. Other components responsive to unsigned prediction error size were shown, but no component sensitive to negative RPE size was found.
Anomaly Detection System Based on Principal Component Analysis and Support Vector Machine
Institute of Scientific and Technical Information of China (English)
LI Zhanchun; LI Zhitang; LIU Bin
2006-01-01
This article presents an anomaly detection system based on principal component analysis (PCA) and support vector machine (SVM). The system first creates a profile defining normal behavior using a frequency-based scheme, and then compares the similarity of the current behavior with the created profile to decide whether the input instance is normal or anomalous. To avoid overfitting and reduce the computational burden, the principal features of normal behavior are extracted by PCA. After training, the SVM is used to classify user behavior as normal or anomalous. In the performance evaluation experiments, the system achieved a correct detection rate of 92.2% and a false detection rate of 2.8%.
Principal component analysis of global maps of the total electron content
Maslennikova, Yu. S.; Bochkarev, V. V.
2014-03-01
In this paper we present results of the spatial distribution analysis of the total electron content (TEC) performed by the Principal Component Analysis (PCA) with the use of global maps of TEC provided by the JPL laboratory (Jet Propulsion Laboratory, NASA, USA) for the period from 2004 to 2010. We show that the obtained components of the decomposition of TEC essentially depend on the representation of the initial data and the method of their preliminary processing. We propose a technique for data centering that allows us to take into account the influence of diurnal and seasonal factors. We establish a correlation between amplitudes of the first components of the decomposition of TEC (connected with the equatorial anomaly) and the solar activity index F10.7, as well as with the flow of high energy particles of the solar wind.
Application of the Model of Principal Components Analysis on Romanian Insurance Market
Directory of Open Access Journals (Sweden)
Dan Armeanu
2008-06-01
Principal components analysis (PCA) is a multivariate data analysis technique whose main purpose is to reduce the dimension of the observations and thus simplify the analysis and interpretation of data, as well as facilitate the construction of predictive models. A rigorous definition of PCA has been given by Bishop (1995): PCA is a linear dimensionality reduction technique that identifies orthogonal directions of maximum variance in the original data and projects the data into a lower-dimensional space formed by a subset of the highest-variance components. PCA is commonly used in economic research, as well as in other fields of activity. When faced with the complexity of economic and financial processes, researchers have to analyze a large number of variables (or indicators), which often proves troublesome because it is difficult to collect such a large amount of data and perform calculations on it. In addition, there is a good chance that the initial data are strongly correlated; the significance of individual variables is therefore seriously diminished, and it is virtually impossible to establish causal relationships between variables. Researchers thus require a simple yet powerful analytical tool to solve these problems and perform a coherent and conclusive analysis. This tool is PCA. The essence of PCA consists of transforming the space of the initial data into another space of lower dimension while maximising the quantity of information recovered from the initial space. Mathematically speaking, PCA is a method of determining a new space (called principal component space or factor space) onto which the original space of variables can be projected. The axes of the new space (called factor axes) are defined by the principal components determined as a result of PCA. Principal components (PC) are standardized linear combinations (SLC) of the original variables and are uncorrelated. Theoretically, the number of PCs equals
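The definition quoted in the abstract above (orthogonal directions of maximum variance, followed by projection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are made up, the function name `pca_first_component` is hypothetical, and power iteration on the sample covariance matrix is used only because it needs no external library, not because the paper uses it.

```python
import math

def pca_first_component(rows, iters=200):
    """Return (unit direction, variance) of the first principal component."""
    n, d = len(rows), len(rows[0])
    # Center every variable on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the dominant
    # eigenvector, i.e. the direction of maximum variance.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v is the quadratic form v' * cov * v.
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance, means

# Made-up, strongly correlated two-variable data set.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
direction, variance, means = pca_first_component(data)

# Projecting each centered observation onto the direction gives the
# one-dimensional scores that replace the two original variables.
scores = [sum((row[j] - means[j]) * direction[j] for j in range(2)) for row in data]
```

Because the two made-up variables are almost perfectly collinear, a single component captures nearly all of the variance, which is the situation in which the dimension reduction described in the abstract pays off.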
The use of principal components and univariate charts to control multivariate processes
Directory of Open Access Journals (Sweden)
Marcela A. G. Machado
2008-04-01
In this article, we evaluate the performance of the T² chart based on the principal components (PC chart) and of the simultaneous univariate control charts based on the original variables (SU X charts) or on the principal components (SUPC charts). The main reason to consider the PC chart lies in the dimensionality reduction. However, depending on the disturbance and on the way the original variables are related, the chart is very slow in signaling, except when all variables are negatively correlated and the principal component is wisely selected. Comparing the SU X, the SUPC and the T² charts, we conclude that the SU X charts (SUPC charts) have a better overall performance when the variables are positively (negatively) correlated. We also develop the expression to obtain the power of two S² charts designed for monitoring the covariance matrix. These joint S² charts are, in the majority of cases, more efficient than the generalized variance chart.
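For readers unfamiliar with the T² chart named in this abstract, the sketch below computes the standard Hotelling T² statistic for a subgroup mean of two variables with a known covariance matrix. It is not the authors' chart design: the in-control mean, covariance matrix, subgroup size, false-alarm rate and function names are all assumptions chosen for illustration. The control limit uses the fact that a chi-square variable with two degrees of freedom has survival function exp(-x/2).

```python
import math

def t2_statistic(xbar, mu0, sigma, n):
    """Hotelling T^2 = n * (xbar - mu0)' * inverse(sigma) * (xbar - mu0), two variables."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))  # closed-form 2x2 inverse
    dx = (xbar[0] - mu0[0], xbar[1] - mu0[1])
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return n * quad

def chi2_limit_2dof(alpha):
    """Upper control limit for p = 2 and known covariance: P(chi2_2 > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)

# Hypothetical in-control parameters and one observed subgroup mean.
mu0 = (0.0, 0.0)
sigma = ((1.0, 0.5), (0.5, 1.0))   # positively correlated variables
t2 = t2_statistic(xbar=(1.0, -0.5), mu0=mu0, sigma=sigma, n=5)
limit = chi2_limit_2dof(alpha=0.005)
signal = t2 > limit
```

In this made-up case the shift runs against the positive correlation, so the statistic exceeds the limit even though neither variable has moved far on its own.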
Directory of Open Access Journals (Sweden)
Yuliana Yuliana
2010-06-01
Quantitative Electronic Structure Activity Relationship (QSAR) analysis of a series of benzalacetones has been carried out based on semi-empirical PM3 calculation data using Principal Components Regression (PCR). The antimutagenic activity of the benzalacetone compounds (expressed as log 1/IC50) was studied as a linear function of latent variables (Tx) obtained by transforming the atomic net charges with Principal Component Analysis (PCA). The QSAR equation was determined from the distribution of the selected components and then analysed with PCR. The result is described by the following QSAR equation: log 1/IC50 = 6.555 + 2.177 T1 + 2.284 T2 + 1.933 T3. The equation was significant at the 95% level with statistical parameters n = 28, r = 0.766, SE = 0.245 and Fcalculation/Ftable = 3.780, and gave a PRESS value of 0.002. This indicates relatively small deviations between the experimental and theoretical antimutagenic activity data. New benzalacetone derivatives were designed and their theoretical activities were predicted from the best QSAR equation. Compounds number 29, 30, 31, 32, 33, 35, 36, 37, 38, 40, 41, 42, 44, 47, 48, 49 and 50 were found to have relatively high antimutagenic activity. Keywords: QSAR; antimutagenic activity; benzalacetone; atomic net charge
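As a worked example of how the reported QSAR equation would be applied, the sketch below evaluates it for one compound. The coefficients are taken from the abstract; the latent-variable values T1 to T3 are hypothetical placeholders, since the abstract does not list them, and the function name is illustrative only.

```python
def predicted_log_inv_ic50(t1, t2, t3):
    """Evaluate log 1/IC50 = 6.555 + 2.177*T1 + 2.284*T2 + 1.933*T3."""
    return 6.555 + 2.177 * t1 + 2.284 * t2 + 1.933 * t3

# Hypothetical latent-variable scores for a single compound.
prediction = predicted_log_inv_ic50(0.1, -0.2, 0.05)

# With all latent variables at zero, the prediction is just the intercept.
baseline = predicted_log_inv_ic50(0.0, 0.0, 0.0)
```

A larger predicted log 1/IC50 corresponds to a smaller IC50, which is how the abstract ranks the newly designed compounds.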
Dissecting the molecular structure of the Orion B cloud: insight from principal component analysis
Gratier, Pierre; Bron, Emeric; Gerin, Maryvonne; Pety, Jérôme; Guzman, Viviana V.; Orkisz, Jan; Bardeau, Sébastien; Goicoechea, Javier R.; Le Petit, Franck; Liszt, Harvey; Öberg, Karin; Peretto, Nicolas; Roueff, Evelyne; Sievers, Albrech; Tremblin, Pascal
2017-03-01
Context. The combination of wideband receivers and spectrometers currently available in (sub-)millimeter observatories deliver wide-field hyperspectral imaging of the interstellar medium. Tens of spectral lines can be observed over degree wide fields in about 50 h. This wealth of data calls for restating the physical questions about the interstellar medium in statistical terms. Aims: We aim to gain information on the physical structure of the interstellar medium from a statistical analysis of many lines from different species over a large field of view, without requiring detailed radiative transfer or astrochemical modeling. Methods: We coupled a non-linear rescaling of the data with one of the simplest multivariate analysis methods, namely the principal component analysis, to decompose the observed signal into components that we interpret first qualitatively and then quantitatively based on our deep knowledge of the observed region and of the astrochemistry at play. Results: We identify three principal components, linear compositions of line brightness temperatures, that are correlated at various levels with the column density, the volume density and the UV radiation field. Conclusions: When sampling a sufficiently diverse mixture of physical parameters, it is possible to decompose the molecular emission in order to gain physical insight on the observed interstellar medium. This opens a new avenue for future studies of the interstellar medium. Based on observations carried out at the IRAM-30 m single-dish telescope. IRAM is supported by INSU/CNRS (France), MPG (Germany) and IGN (Spain).
Biometric variability of goat populations revealed by means of principal component analysis.
Pires, Luanna Chácara; Machado, Théa M Medeiros; Araújo, Adriana Mello; Olson, Timothy A; da Silva, João Batista Lopes; Torres, Robledo Almeida; Costa, Márcio da Silva
2012-12-01
The aim was to analyze variation in 12 Brazilian and Moroccan goat populations and, through principal component analysis (PCA), to check the importance of body measures and their indices as a means of distinguishing among individuals and populations. The biometric measurements were wither height (WH), brisket height (BH) and ear length (EL). Thorax depth (TD = WH-BH) and the three indices TD/WH, EL/TD and EL/WH were also calculated. Of the seven components extracted, the first three principal components were sufficient to explain 99.5% of the total variance of the data. Graphical dispersion by genetic groups revealed that European dairy breeds clustered together. The Moroccan breeds were separated into two groups, one comprising the Drâa and the other the Zagora and Rhâali breeds. The Anglo-Nubian and undefined breeds were the closest to one another, whereas the Azul goats showed the highest variation of all the breeds. The Anglo-Nubian and Boer breeds were similar to each other. The Nambi-type goats remained distinct from all the other populations. In general, the graphical representation of PCA values made it possible to distinguish genetic groups.
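The derived variables in this abstract follow directly from the three raw measurements, as the short sketch below shows. The measurement values are invented for illustration and the function name `goat_indices` is hypothetical. Reading the three measurements, thorax depth and three indices as the seven PCA inputs is an assumption consistent with the "seven components extracted", not something the abstract states.

```python
def goat_indices(wh, bh, el):
    """Derive thorax depth and the three body indices from raw measurements."""
    td = wh - bh                      # thorax depth = wither height - brisket height
    return {
        "TD": td,
        "TD/WH": td / wh,
        "EL/TD": el / td,
        "EL/WH": el / wh,
    }

# Hypothetical animal: measurements in centimetres.
indices = goat_indices(wh=70.0, bh=40.0, el=15.0)
```

Because the indices are ratios, they describe body proportions independently of overall size, which is useful when breeds of different stature are compared.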
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and inflates their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are used. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the use of the RPCR and RSIMPLS methods on an econometric data set by comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
Energy Technology Data Exchange (ETDEWEB)
Zimroz, Radoslaw [Wroclaw University of Technology, Diagnostics and Vibro-Acoustics Science Laboratory (Poland); Bartkowiak, Anna, E-mail: radoslaw.zimroz@pwr.wroc.pl, E-mail: aba@ii.uni.wroc.pl [University of Wroclaw, Institute of Computer Science, Wroclaw (Poland)
2011-07-19
Spectral analysis is a well-established method of vibration analysis used in diagnostics in both academia and industry. In general, one may identify components related to particular stages in the gearbox and analyze the amplitudes of these components with a simple decision rule: if the amplitudes are increasing, the condition is getting worse. Usually, however, one should analyze not a single amplitude but at least several components, which raises the question of how to analyze them simultaneously. We provide an example (case study) for planetary gearboxes in good and bad condition (case B and case A). As diagnostic features we used 15 amplitudes of spectral components related to the fundamental planetary mesh frequency and its harmonics. Using Principal Component Analysis (PCA), it is shown that the amplitudes do not vary in the same way; a change of condition affects not only the amplitudes of all components but also the relations between them. We investigated the geometry of the data and show that the proportions of the explained total inertia of the three data sets ('good', 'bad' and mixed good/bad) are different. We claim that employing multidimensional analysis to account not only for directly observed values but also for interrelations both within and between the two groups of data may be a novel diagnostic approach. A different structure of the data is associated with a different condition of the machines, and this assumption is stated here for the first time in the literature. It clearly requires further study.
Energy Technology Data Exchange (ETDEWEB)
Holden, H.; LeDrew, E. [Univ. of Waterloo, Ontario (Canada)
1997-06-01
Remote discrimination of substrate types in relatively shallow coastal waters has been limited by the spatial and spectral resolution of available sensors. An additional limiting factor is the strong attenuating influence of the water column over the substrate. As a result, there have been limited attempts to map submerged ecosystems such as coral reefs based on spectral characteristics. Both healthy and bleached corals were measured at depth with a hand-held spectroradiometer, and their spectra compared. Two separate principal components analyses (PCA) were performed on two sets of spectral data. The PCA revealed that there is indeed a spectral difference based on health. In the first data set, the first component (healthy coral) explains 46.82%, while the second component (bleached coral) explains 46.35% of the variance. In the second data set, the first component (bleached coral) explained 46.99%; the second component (healthy coral) explained 36.55%; and the third component (healthy coral) explained 15.44 % of the total variance in the original data. These results are encouraging with respect to using an airborne spectroradiometer to identify areas of bleached corals thus enabling accurate monitoring over time.
Zhang, Yiwei; Pan, Wei
2015-03-01
Genome-wide association studies (GWAS) have been established as a major tool to identify genetic variants associated with complex traits, such as common diseases. However, GWAS may suffer from false positives and false negatives due to confounding population structures, including known or unknown relatedness. Another important issue is unmeasured environmental risk factors. Among many methods for adjusting for population structures, two approaches stand out: one is principal component regression (PCR) based on principal component analysis, which is perhaps the most popular due to its early appearance, simplicity, and general effectiveness; the other is based on a linear mixed model (LMM) that has emerged recently as perhaps the most flexible and effective, especially for samples with complex structures as in model organisms. As shown previously, the PCR approach can be regarded as an approximation to an LMM; such an approximation depends on the number of the top principal components (PCs) used, the choice of which is often difficult in practice. Hence, in the presence of population structure, the LMM appears to outperform the PCR method. However, due to the different treatments of fixed vs. random effects in the two approaches, we show an advantage of PCR over LMM: in the presence of an unknown but spatially confined environmental confounder (e.g., environmental pollution or lifestyle), the PCs may be able to implicitly and effectively adjust for the confounder whereas the LMM cannot. Accordingly, to adjust for both population structures and nongenetic confounders, we propose a hybrid method combining the use and, thus, strengths of PCR and LMM. We use real genotype data and simulated phenotypes to confirm the above points, and establish the superior performance of the hybrid method across all scenarios.
Estimation of surface curvature from full-field shape data using principal component analysis
Sharma, Sameer; Vinuchakravarthy, S.; Subramanian, S. J.
2017-01-01
Three-dimensional digital image correlation (3D-DIC) is a popular image-based experimental technique for estimating surface shape, displacements and strains of deforming objects. In this technique, a calibrated stereo rig is used to obtain and stereo-match pairs of images of the object of interest from which the shapes of the imaged surface are then computed using the calibration parameters of the rig. Displacements are obtained by performing an additional temporal correlation of the shapes obtained at various stages of deformation and strains by smoothing and numerically differentiating the displacement data. Since strains are of primary importance in solid mechanics, significant efforts have been put into computation of strains from the measured displacement fields; however, much less attention has been paid to date to computation of curvature from the measured 3D surfaces. In this work, we address this gap by proposing a new method of computing curvature from full-field shape measurements using principal component analysis (PCA) along the lines of a similar work recently proposed to measure strains (Grama and Subramanian 2014 Exp. Mech. 54 913-33). PCA is a multivariate analysis tool that is widely used to reveal relationships between a large number of variables, reduce dimensionality and achieve significant denoising. This technique is applied here to identify dominant principal components in the shape fields measured by 3D-DIC and these principal components are then differentiated systematically to obtain the first and second fundamental forms used in the curvature calculation. The proposed method is first verified using synthetically generated noisy surfaces and then validated experimentally on some real world objects with known ground-truth curvatures.
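The last step described in this abstract, going from the first and second fundamental forms to curvature, uses standard differential-geometry formulas. The sketch below applies them at a single surface point; it does not reproduce the PCA-based differentiation of the paper. The coefficient names (E, F, G, L, M, N) follow the usual textbook convention, and the sphere values used as input are an assumed test case.

```python
import math

def curvatures(E, F, G, L, M, N):
    """Gaussian, mean and principal curvatures from fundamental-form coefficients."""
    denom = E * G - F * F                       # determinant of the first fundamental form
    K = (L * N - M * M) / denom                 # Gaussian curvature
    H = (E * N - 2.0 * F * M + G * L) / (2.0 * denom)   # mean curvature
    # Principal curvatures are the roots of k^2 - 2*H*k + K = 0.
    root = math.sqrt(max(H * H - K, 0.0))       # clamp tiny negative round-off
    return K, H, (H + root, H - root)

# Assumed check: a sphere of radius 2 at its equator, with the normal
# oriented so that L and N are positive (E = G = r^2, L = N = r).
K, H, (k1, k2) = curvatures(E=4.0, F=0.0, G=4.0, L=2.0, M=0.0, N=2.0)
```

For a sphere of radius r the expected values are K = 1/r² and H = 1/r with equal principal curvatures, which makes it a convenient sanity check before applying the formulas to measured shape fields.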
Kopparla, P.; Natraj, V.; Shia, R. L.; Spurr, R. J. D.; Crisp, D.; Yung, Y. L.
2015-12-01
Radiative transfer (RT) computations form the engine of atmospheric retrieval codes. However, full treatment of RT processes is computationally expensive, prompting usage of two-stream approximations in current exoplanetary atmospheric retrieval codes [Line et al., 2013]. Natraj et al. [2005, 2010] and Spurr and Natraj [2013] demonstrated the ability of a technique using principal component analysis (PCA) to speed up RT computations. In the PCA method for RT performance enhancement, empirical orthogonal functions are developed for binned sets of inherent optical properties that possess some redundancy; costly multiple-scattering RT calculations are only done for those few optical states corresponding to the most important principal components, and correction factors are applied to approximate radiation fields. Kopparla et al. [2015, in preparation] extended the PCA method to a broadband spectral region from the ultraviolet to the shortwave infrared (0.3-3 micron), accounting for major gas absorptions in this region. Here, we apply the PCA method to some typical (exo-)planetary retrieval problems. Comparisons between the new model, called the Universal Principal Component Analysis Radiative Transfer (UPCART) model, two-stream models and line-by-line RT models are performed for spectral radiances, spectral fluxes and broadband fluxes. Each of these is calculated at the top of the atmosphere for several scenarios with varying aerosol types, extinction and scattering optical depth profiles, and stellar and viewing geometries. We demonstrate that very accurate radiance and flux estimates can be obtained, with better than 1% accuracy in all spectral regions and better than 0.1% in most cases, as compared to a numerically exact line-by-line RT model. The accuracy is enhanced when the results are convolved to typical instrument resolutions. The operational speed and accuracy of UPCART can be further improved by optimizing binning schemes and parallelizing the codes, work
Whitworth, M.; Giles, D.; Murphy, W.
The Jurassic strata of the Cotswolds escarpment of southern central United Kingdom are associated with extensive mass movement activity, including mudslide systems, rotational and translational landslides. These mass movements can pose a significant engineering risk and have been the focus of research into the use of remote sensing techniques as a tool for landslide identification and delineation on clay slopes. The study has utilised a field site on the Cotswold escarpment above the village of Broadway, Worcestershire, UK. Geomorphological investigation was initially undertaken at the site in order to establish ground control on landslides and other landforms present at the site. Subsequent to this, Airborne Thematic Mapper (ATM) imagery and colour stereo photography were acquired by the UK Natural Environment Research Council (NERC) for further analysis and interpretation. This paper describes the textural enhancement of the airborne imagery undertaken using both mean Euclidean distance (MEUC) and grey level co-occurrence matrix entropy (GLCM) together with a combined texture-principal component based supervised image classification that was adopted as the method for landslide identification. The study highlights the importance of image texture for discriminating mass movements within multispectral imagery and demonstrates that by adopting a combined texture-principal component image classification we have been able to achieve classification accuracy of 84 % with a Kappa statistic of 0.838 for landslide classes. This paper also highlights the potential problems that can be encountered when using high-resolution multispectral imagery, such as the presence of dense variable woodland present within the image, and presents a solution using principal component analysis.
Tipton, John; Hooten, Mevin; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio
An application of principal component analysis to the clavicle and clavicle fixation devices
Directory of Open Access Journals (Sweden)
Fitzpatrick David
2010-03-01
Background: Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Materials and methods: Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. Results: The first principal component, representing size, accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. Discussion and conclusions: This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both the biomedical engineer and the clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
McIlroy, John W; Smith, Ruth Waddell; McGuffin, Victoria L
2015-12-01
Following publication of the National Academy of Sciences report "Strengthening Forensic Science in the United States: A Path Forward", there has been increasing interest in the application of multivariate statistical procedures for the evaluation of forensic evidence. However, prior to statistical analysis, variance from sources other than the sample must be minimized through application of data pretreatment procedures. This is necessary to ensure that subsequent statistical analysis of the data provides meaningful results. The purpose of this work was to evaluate the effect of pretreatment procedures on multivariate statistical analysis of chromatographic data obtained for a reference set of diesel fuels. Diesel was selected due to its chemical complexity and forensic relevance, both for fire debris and environmental forensic applications. Principal components analysis (PCA) was applied to the untreated chromatograms to assess association of replicates and discrimination among the different diesel samples. The chromatograms were then pretreated by sequentially applying the following procedures: background correction, smoothing, retention-time alignment, and normalization. The effect of each procedure on association and discrimination was evaluated based on the association of replicates in the PCA scores plot. For these data, background correction and smoothing offered minimal improvement, whereas alignment and normalization offered the greatest improvement in the association of replicates and discrimination among highly similar samples. Further, prior to pretreatment, the first principal component accounted for only non-sample sources of variance. Following pretreatment, these sources were minimized and the first principal component accounted for significant chemical differences among the diesel samples. These results highlight the need for pretreatment procedures and provide a metric to assess the effect of pretreatment on subsequent multivariate statistical
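The effect of normalization on replicate association can be sketched on synthetic data. The example below is an illustrative assumption, not the authors' workflow: the two Gaussian-peak "fuels", the intensity scaling between injections, and the helper names `pc1_scores` and `replicate_ratio` are all invented, and only total-area normalization is shown (no background correction, smoothing, or alignment).

```python
import numpy as np

def pc1_scores(data):
    """Scores on the first principal component of row-wise observations."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def replicate_ratio(scores, labels):
    """Largest within-sample spread over between-sample separation on PC1.

    Smaller values mean replicates associate more tightly.
    """
    groups = np.unique(labels)
    means = np.array([scores[labels == g].mean() for g in groups])
    within = max(scores[labels == g].std() for g in groups)
    return within / (means.max() - means.min())

def peak(x, center, width=0.02):
    """Gaussian-shaped chromatographic peak."""
    return np.exp(-((x - center) / width) ** 2)

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 300)
fuel_a = peak(x, 0.3) + 0.5 * peak(x, 0.6)
fuel_b = peak(x, 0.3) + 0.7 * peak(x, 0.6)  # highly similar to fuel_a
labels = np.repeat([0, 1], 5)               # five replicates per fuel
scale = rng.uniform(0.5, 1.5, size=10)      # non-sample intensity variation
raw = scale[:, None] * np.where(labels[:, None] == 0, fuel_a, fuel_b)
raw += 0.001 * rng.normal(size=raw.shape)

# Total-area normalization: each chromatogram is divided by its summed signal.
normalized = raw / raw.sum(axis=1, keepdims=True)

before = replicate_ratio(pc1_scores(raw), labels)
after = replicate_ratio(pc1_scores(normalized), labels)
```

Before normalization, PC1 mostly tracks the injection-to-injection scaling; afterwards it separates the two fuels, which mirrors the abstract's observation that the first principal component initially captured only non-sample variance.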
Institute of Scientific and Technical Information of China (English)
范文茹; 王化祥; 杨程屹; 马世文
2010-01-01
The aim of this paper is to propose a useful method for exploring regional ventilation and perfusion in the chest and also separation of pulmonary and cardiac changes. The approach is based on estimating both electrical impedance tomography (EIT) measurements and reconstructed images by means of principal component analysis (PCA). In the experiments in vivo, 43 cycles of heart-beat rhythm could be detected by PCA when the volunteer held breath; 9 breathing cycles and 50 heart-beat cycles could be detected by PCA ...
Directory of Open Access Journals (Sweden)
Oliveira-Esquerre K.P.
2002-01-01
This work presents a way to predict the biochemical oxygen demand (BOD) of the output stream of the biological wastewater treatment plant at RIPASA S/A Celulose e Papel, one of the major pulp and paper plants in Brazil. The best prediction performance is achieved when the data are preprocessed using principal components analysis (PCA) before they are fed to a backpropagated neural network. The influence of input variables is analyzed and satisfactory prediction results are obtained for an optimized situation.
Institute of Scientific and Technical Information of China (English)
LI; Xia(黎夏); YEH; Gar-On(叶嘉安)
2002-01-01
This paper discusses issues arising from the correlation of spatial variables during spatial decision-making using multicriteria evaluation (MCE) and cellular automata (CA). The correlation of spatial variables can cause MCE to malfunction. In urban simulation, spatial factors often exhibit a high degree of correlation, which is an undesirable property for MCE. This study uses principal components analysis (PCA) to remove data redundancy among a large set of spatial variables and determine 'ideal points' for land development. PCA is integrated with cellular automata and geographical information systems (GIS) for the simulation of idealized urban forms for planning purposes.
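How PCA removes redundancy among correlated spatial variables can be shown with a minimal sketch. The three layers below (`dist_road`, `dist_centre`, `slope`) and the function name `pca_scores` are hypothetical stand-ins for GIS layers; the abstract does not say which variables or software were used.

```python
import numpy as np

def pca_scores(variables):
    """Standardize the columns, then project onto correlation-matrix eigenvectors.

    Returns the component scores and the fraction of variance per component.
    """
    z = (variables - variables.mean(axis=0)) / variables.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return z @ eigvecs, eigvals / eigvals.sum()

# Hypothetical spatial factors, one value per grid cell.
rng = np.random.default_rng(2)
dist_road = rng.uniform(0, 10, size=500)
dist_centre = dist_road + rng.normal(scale=0.5, size=500)  # nearly redundant
slope = rng.uniform(0, 30, size=500)
layers = np.column_stack([dist_road, dist_centre, slope])

scores, explained = pca_scores(layers)
score_corr = np.corrcoef(scores, rowvar=False)  # off-diagonals vanish
```

The resulting component scores are mutually uncorrelated, so they can be weighted in an MCE without double counting the shared information in the two distance layers.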
Computing steerable principal components of a large set of images and their rotations.
Ponce, Colin; Singer, Amit
2011-11-01
We present here an efficient algorithm to compute the Principal Component Analysis (PCA) of a large image set consisting of images and, for each image, the set of its uniform rotations in the plane. We do this by pointing out the block circulant structure of the covariance matrix and utilizing that structure to compute its eigenvectors. We also demonstrate the advantages of this algorithm over similar ones with numerical experiments. Although it is useful in many settings, we illustrate the specific application of the algorithm to the problem of cryo-electron microscopy.
Scalable multi-correlative statistics and principal component analysis with Titan.
Energy Technology Data Exchange (ETDEWEB)
Thompson, David C.; Bennett, Janine C.; Roe, Diana C.; Pebay, Philippe Pierre
2009-02-01
This report summarizes existing statistical engines in VTK/Titan and presents the recently parallelized multi-correlative and principal component analysis engines. It is a sequel to [PT08], which studied the parallel descriptive and correlative engines. The ease of use of these parallel engines is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; this theoretical property is then verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.
Pinto da Costa, Joaquim
2015-01-01
This book examines in detail the correlation, more precisely the weighted correlation, and applications involving rankings. A general application is the evaluation of methods to predict rankings. Others involve rankings representing human preferences to infer user preferences; the use of weighted correlation with microarray data and those in the domain of time series. In this book we present new weighted correlation coefficients and new methods of weighted principal component analysis. We also introduce new methods of dimension reduction and clustering for time series data, and describe some theoretical results on the weighted correlation coefficients in separate sections.
Variation of fundamental parameters and dark energy. A principal component approach
Amendola, L; Martins, C J A P; Nunes, N J; Pedrosa, P O J; Seganti, A
2011-01-01
We discuss methods based on Principal Component Analysis for reconstructing the dark energy equation of state and constraining its evolution, using a combination of Type Ia supernovae at low redshift and spectroscopic measurements of varying fundamental couplings at higher redshifts. We discuss the performance of this method when future better-quality datasets are available, focusing on two forthcoming ESO spectrographs -- ESPRESSO for the VLT and CODEX for the E-ELT -- which include these measurements as a key part of their science cases. These can realize the prospect of a detailed characterization of dark energy properties all the way up to redshift 4.
Energy Technology Data Exchange (ETDEWEB)
Nasimi, Elnara; Gabbar, Hossam A., E-mail: Hossam.gabbar@uoit.ca
2014-04-01
Highlights: • Diagnosis of neutron overpower protection (NOP) in CANDU reactors. • Accurate reactor detector modeling. • NOP detectors response analysis. • Statistical methods for quantitative analysis of NOP detector behavior. - Abstract: An accurate fault modeling and troubleshooting methodology is required to aid in making risk-informed decisions related to design and operational activities of current and future generations of CANDU® designs. This paper attempts to develop an explanation for the unanticipated detector response and overall behavior phenomena using statistical methods to complement traditional engineering analysis techniques. Principal component analysis (PCA) methodology is used for pattern recognition using a case study of Bruce B zone-control level oscillations.
A Study of the Comprehensive Urban Competitiveness Based On Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
Rifeng HE; Chunxiang ZHAO
2015-01-01
This paper studies comprehensive urban competitiveness using principal component analysis. The results show that the comprehensive evaluation of urban competitiveness does not depend entirely on a city's economic strength or GDP; resource allocation capacity, openness and public service capacity must also be considered. Using data on 11 prefecture-level cities in Jiangxi Province for 2006, 2009 and 2012, the paper ranks the cities and analyzes trends to provide a basis for future economic policy.
Smilek, Jan; Hadas, Zdenek
2017-02-01
In this paper we propose the use of principal component analysis to process measured acceleration data in order to determine the direction of acceleration with the highest variance at a given frequency of interest. This method can be used to improve the power generated by inertial energy harvesters. Their power output is highly dependent on the excitation acceleration magnitude and frequency, but the axes of acceleration measurement might not always be perfectly aligned with the directions of movement. The generated power output might therefore be severely underestimated in simulations, possibly leading to false conclusions about the feasibility of using the inertial energy harvester for the examined application.
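The idea of finding the direction of highest acceleration variance near a frequency of interest can be sketched as follows. This is only an assumed reading of the method: the crude FFT-mask band-pass, the function name `dominant_direction`, the `half_band` parameter, and the synthetic signal are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def dominant_direction(acc, fs, f0, half_band=1.0):
    """Unit vector of maximum acceleration variance near frequency f0 (Hz).

    acc: (n_samples, 3) accelerations; fs: sampling rate in Hz.
    """
    n = acc.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(acc, axis=0)
    # Crude band-pass: zero every frequency bin farther than half_band from f0.
    spectrum[np.abs(freqs - f0) > half_band] = 0.0
    band = np.fft.irfft(spectrum, n=n, axis=0)
    cov = np.cov(band, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back ascending
    return eigvecs[:, -1]             # eigenvector of the largest eigenvalue

# Synthetic test: a 5 Hz vibration along a diagonal of the sensor's x-y plane.
fs, f0 = 200.0, 5.0
t = np.arange(0, 10, 1 / fs)
true_dir = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
rng = np.random.default_rng(1)
acc = np.outer(np.sin(2 * np.pi * f0 * t), true_dir)
acc += 0.1 * rng.normal(size=acc.shape)

direction = dominant_direction(acc, fs, f0)
alignment = abs(direction @ true_dir)  # sign of an eigenvector is arbitrary
```

Projecting the measured acceleration onto `direction` gives the excitation a single-axis harvester would see if mounted along that axis, rather than along one of the sensor's own axes.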
Research on application of principal component statistical analysis in the financial early-warning
Directory of Open Access Journals (Sweden)
Lan Yang
2017-06-01
In a market economy, the business environment changes rapidly, so management urgently needs advance knowledge of a company's financial situation in order to take measures against risk. Based on 25 domestic listed companies, this paper uses SPSS software and principal component analysis to establish a financial early-warning model suited to listed companies in China. Taking Maotai Company as an example, the paper conducts a prediction analysis, concludes that the model offers some practical guidance, and suggests that combining qualitative and quantitative analysis can predict risks more comprehensively and accurately.
Lee, Seunggeun; Zou, Fei; Wright, Fred A
2014-06-01
The development of high-throughput biomedical technologies has led to increased interest in the analysis of high-dimensional data where the number of features is much larger than the sample size. In this paper, we investigate principal component analysis under the ultra-high dimensional regime, where both the number of features and the sample size increase as the ratio of the two quantities also increases. We bridge the existing results from the finite and the high-dimension low sample size regimes, embedding the two regimes in a more general framework. We also numerically demonstrate the universal application of the results from the finite regime.
Principal component analysis of bacteria using surface-enhanced Raman spectroscopy
Guicheteau, Jason; Christesen, Steven D.
2006-05-01
Surface-enhanced Raman scattering (SERS) provides rapid fingerprinting of biomaterial in a non-destructive manner. The problem of tissue fluorescence, which can overwhelm a normal Raman signal from biological samples, is largely overcome by treatment of biomaterials with colloidal silver. This work presents a study into the applicability of qualitative SER spectroscopy with principal component analysis (PCA) for the discrimination of four biological threat simulants: Bacillus globigii, Pantoea agglomerans, Brucella neotomae, and Yersinia rohdei. We also demonstrate differentiation of gram-negative and gram-positive species, as well as of spores and vegetative cells of Bacillus globigii.
DEFF Research Database (Denmark)
Kotwa, Ewelina Katarzyna; Jørgensen, Bo Munk; Brockhoff, Per B.;
2013-01-01
In this paper, we introduce a new method, based on spherical principal component analysis (S‐PCA), for the identification of Rayleigh and Raman scatters in fluorescence excitation–emission data. These scatters should be found and eliminated as a prestep before fitting parallel factor analysis...... this drawback, we implement the fast S‐PCA in the scatter identification routine. Moreover, an additional pattern interpolation step that complements the method, based on robust regression, will be applied. In this way, substantial time savings are gained, and the user's engagement is restricted to a minimum...
Principal component cluster analysis of ECG time series based on Lyapunov exponent spectrum
Institute of Scientific and Technical Information of China (English)
WANG Nai; RUAN Jiong
2004-01-01
In this paper we propose an approach of principal component cluster analysis based on Lyapunov exponent spectrum (LES) to analyze the ECG time series. Analysis results of 22 sample-files of ECG from the MIT-BIH database confirmed the validity of our approach. Another technique named improved teacher selecting student (TSS) algorithm is presented to analyze unknown samples by means of some known ones, which is of better accuracy. This technique combines the advantages of both statistical and nonlinear dynamical methods and is shown to be significant to the analysis of nonlinear ECG time series.
Competition analysis on the operating system market using principal component analysis
Directory of Open Access Journals (Sweden)
Brătucu, G.
2011-01-01
The operating system market has evolved greatly. The largest software producer in the world, Microsoft, dominates the operating systems segment. With three operating systems (Windows XP, Windows Vista and Windows 7), the company held a market share of 87.54% in January 2011. Over time, open source operating systems have begun to penetrate the market, strongly affecting other manufacturers. Companies such as Apple Inc. and Google Inc. have also entered the operating system market. This paper aims to compare the best-selling operating systems on the market in terms of their defining characteristics. For this purpose, the principal components analysis method was used.
Guicheteau, J; Argue, L; Emge, D; Hyre, A; Jacobson, M; Christesen, S
2008-03-01
Surface-enhanced Raman spectroscopy (SERS) can provide rapid fingerprinting of biomaterial in a nondestructive manner. The adsorption of colloidal silver to biological material suppresses native biofluorescence while providing electromagnetic surface enhancement of the normal Raman signal. This work validates the applicability of qualitative SER spectroscopy for analysis of bacterial species by utilizing principal component analysis (PCA) to show discrimination of biological threat simulants, based upon multivariate statistical confidence limits bounding known data clusters. Gram-positive Bacillus spores (Bacillus atrophaeus, Bacillus anthracis, and Bacillus thuringiensis) are investigated along with the Gram-negative bacterium Pantoea agglomerans.
Principal components analysis corrects for stratification in genome-wide association studies.
Price, Alkes L; Patterson, Nick J; Plenge, Robert M; Weinblatt, Michael E; Shadick, Nancy A; Reich, David
2006-08-01
Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
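A marker-specific correction of this kind can be sketched by removing the ancestry axes from both the genotype and the phenotype and then correlating the residuals. The abstract does not give the test statistic; the form used below, (N - K - 1) times the squared correlation of the adjusted vectors, is an assumption, as are the simulated data and the helper names `ancestry_axes`, `residualize` and `adjusted_statistic`.

```python
import numpy as np

def ancestry_axes(genotypes, k):
    """Top-k left singular vectors of the centered genotype matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k]  # orthonormal columns, one per axis of variation

def residualize(x, axes):
    """Center x and remove its projection onto the ancestry axes."""
    x = x - x.mean()
    return x - axes @ (axes.T @ x)

def adjusted_statistic(snp, phenotype, axes):
    """(N - K - 1) * r^2 between ancestry-adjusted genotype and phenotype."""
    g = residualize(snp, axes)
    y = residualize(phenotype, axes)
    r = (g @ y) / np.sqrt((g @ g) * (y @ y))
    n, k = axes.shape
    return (n - k - 1) * r**2

# Hypothetical case: two ancestral populations, phenotype driven by ancestry.
rng = np.random.default_rng(0)
n, m = 400, 200
pop = np.repeat([0, 1], n // 2)
freqs = np.where(pop[:, None] == 0, 0.2, 0.8) * np.ones((n, m))
genotypes = rng.binomial(2, freqs).astype(float)
phenotype = 2.0 * pop + rng.normal(size=n)

unadjusted = adjusted_statistic(genotypes[:, 0], phenotype, np.empty((n, 0)))
corrected = adjusted_statistic(genotypes[:, 0], phenotype,
                               ancestry_axes(genotypes, 2))
```

Because the genotype is residualized marker by marker, a SNP whose frequency barely differs across populations loses little signal, which is one way to read the abstract's claim that the correction is specific to each candidate marker.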
A principal components approach to parent-to-newborn body composition associations in South India
Directory of Open Access Journals (Sweden)
Hill Jacqueline C
2009-02-01
Background: Size at birth is influenced by environmental factors, like maternal nutrition and parity, and by genes. Birth weight is a composite measure, encompassing bone, fat and lean mass. These may have different determinants. The main purpose of this paper was to use anthropometry and principal components analysis (PCA) to describe maternal and newborn body composition, and associations between them, in an Indian population. We also compared maternal and paternal measurements (body mass index (BMI) and height) as predictors of newborn body composition. Methods: Weight, height, head and mid-arm circumferences, skinfold thicknesses and external pelvic diameters were measured at 30 ± 2 weeks gestation in 571 pregnant women attending the antenatal clinic of the Holdsworth Memorial Hospital, Mysore, India. Paternal height and weight were also measured. At birth, detailed neonatal anthropometry was performed. Unrotated and varimax rotated PCA was applied to the maternal and neonatal measurements. Results: Rotated PCA reduced maternal measurements to 4 independent components (fat, pelvis, height and muscle) and neonatal measurements to 3 components (trunk+head, fat, and leg length). An SD increase in maternal fat was associated with a 0.16 SD increase (β) in neonatal fat (p ... Conclusion: Principal components analysis is a useful method to describe neonatal body composition and its determinants. Newborn adiposity is related to maternal nutritional status and parity, while newborn length is genetically determined. Further research is needed to understand mechanisms linking maternal pelvic size to fetal growth and the determinants and implications of the components (trunk v leg length) of fetal skeletal growth.
Schelkanova, Irina; Toronov, Vladislav
2011-07-01
Although near infrared spectroscopy (NIRS) is now widely used both in emerging clinical techniques and in cognitive neuroscience, the development of the apparatuses and signal processing methods for these applications is still a hot research topic. The main unresolved problem in functional NIRS is the separation of functional signals from the contaminations by systemic and local physiological fluctuations. This problem was approached by using various signal processing methods, including blind signal separation techniques. In particular, principal component analysis (PCA) and independent component analysis (ICA) were applied to the data acquired at the same wavelength and at multiple sites on the human or animal heads during functional activation. These signal processing procedures resulted in a number of principal or independent components that could be attributed to functional activity but their physiological meaning remained unknown. On the other hand, the best physiological specificity is provided by broadband NIRS. Also, a comparison with functional magnetic resonance imaging (fMRI) allows determining the spatial origin of fNIRS signals. In this study we applied PCA and ICA to broadband NIRS data to distill the components correlating with the breath hold activation paradigm and compared them with the simultaneously acquired fMRI signals. Breath holding was used because it generates blood carbon dioxide (CO2) which increases the blood-oxygen-level-dependent (BOLD) signal as CO2 acts as a cerebral vasodilator. Vasodilation causes increased cerebral blood flow which washes deoxyhaemoglobin out of the cerebral capillary bed thus increasing both the cerebral blood volume and oxygenation. Although the original signals were quite diverse, we found very few different components which corresponded to fMRI signals at different locations in the brain and to different physiological chromophores.
Regional assessment of trends in vegetation change dynamics using principal component analysis
Osunmadewa, B. A.; Csaplovics, E.; Majdaldin, R. A.; Adeofun, C. O.; Aralova, D.
2016-10-01
Vegetation forms the basis for the existence of animals and humans. Due to changes in climate and human perturbation, most of the natural vegetation of the world has undergone some form of transformation in both composition and structure. Increased anthropogenic activities over the last decades have posed a serious threat to the natural vegetation in Nigeria; many vegetated areas have either been converted to other land uses, such as deforestation for agricultural purposes, or been completely lost due to indiscriminate removal of trees for charcoal, fuelwood and timber production. This study therefore aims at examining the rate of change in vegetation cover, the degree of change and the application of Principal Component Analysis (PCA) in the dry sub-humid region of Nigeria using Normalized Difference Vegetation Index (NDVI) data spanning 1983-2011. The method used for the analysis is the T-mode orientation approach, also known as standardized PCA, while trends are examined using ordinary least squares, median trend (Theil-Sen) and monotonic trend. The result of the trend analysis shows both positive and negative trends in vegetation change dynamics over the 29-year period examined. Five components were used for the Principal Component Analysis. The first component explains about 98 % of the total variance of the vegetation (NDVI), while components 2-5 have lower variance percentages (... Vegetation Index. The land use data show changes in land use pattern that can be attributed to anthropogenic activities such as cutting of trees for charcoal production, fuelwood and agricultural practices. The result of this study shows the ability of remote sensing data to monitor vegetation change in the dry sub-humid region of Nigeria.
Nonnegative and Compartmental Dynamical Systems
Haddad, Wassim M; Hui, Qing
2010-01-01
This comprehensive book provides the first unified framework for stability and dissipativity analysis and control design for nonnegative and compartmental dynamical systems, which play a key role in a wide range of fields, including engineering, thermal sciences, biology, ecology, economics, genetics, chemistry, medicine, and sociology. Using the highest standards of exposition and rigor, the authors explain these systems and advance the state of the art in their analysis and active control design. Nonnegative and Compartmental Dynamical Systems presents the most complete treatment available o
Form Sums of Nonnegative Selfadjoint Operators
Hassi, S.; Sandovici, A.; Snoo, H.S.V. de; Winkler, Henrik
2006-01-01
The sum of two unbounded nonnegative selfadjoint operators is a nonnegative operator which is not necessarily densely defined. In general its selfadjoint extensions exist in the sense of linear relations (multivalued operators). One of its nonnegative selfadjoint extensions is constructed via the fo
Durigon, Angelica; Lier, Quirijn de Jong van; Metselaar, Klaas
2016-10-01
To date, measuring plant transpiration at canopy scale is laborious and its estimation by numerical modelling can be used to assess high time frequency data. When using the model by Jacobs (1994) to simulate transpiration of water stressed plants it needs to be reparametrized. We compare the importance of model variables affecting simulated transpiration of water stressed plants. A systematic literature review was performed to recover existing parameterizations to be tested in the model. Data from a field experiment with common bean under full and deficit irrigation were used to correlate estimations to forcing variables applying principal component analysis. New parameterizations resulted in a moderate reduction of prediction errors and in an increase in model performance. Ags model was sensitive to changes in the mesophyll conductance and leaf angle distribution parameterizations, allowing model improvement. Simulated transpiration could be separated in temporal components. Daily, afternoon depression and long-term components for the fully irrigated treatment were more related to atmospheric forcing variables (specific humidity deficit between stomata and air, relative air humidity and canopy temperature). Daily and afternoon depression components for the deficit-irrigated treatment were related to both atmospheric and soil dryness, and long-term component was related to soil dryness.
Energy Technology Data Exchange (ETDEWEB)
Biesinger, Mark C. [Surface Science Western, University of Western Ontario, London, Ont., N6A 5B7 (Canada)]. E-mail: biesingr@uwo.ca; Miller, David J. [Surface Science Western, University of Western Ontario, London, Ont., N6A 5B7 (Canada); Department of Chemistry, University of Western Ontario, London, Ont., N6A 5B7 (Canada); Harbottle, Robert R. [Department of Chemistry, University of Western Ontario, London, Ont., N6A 5B7 (Canada); Possmayer, Fred [Department of Obstetrics and Gynecology, University of Western Ontario, London, Ont., N6A 5B7 (Canada); McIntyre, N. Stewart [Surface Science Western, University of Western Ontario, London, Ont., N6A 5B7 (Canada); Department of Chemistry, University of Western Ontario, London, Ont., N6A 5B7 (Canada); Petersen, Nils O. [National Institute for Nanotechnology and Department of Chemistry, University of Alberta W6-017 ECERF Bldg, 9107-116th Street, Edmonton, Alta., T6G 2V4 (Canada)
2006-07-30
Time of flight secondary ion mass spectrometry (ToF-SIMS) provides the capability to image the distribution of molecular ions and their associated fragments that are emitted from monolayer films. ToF-SIMS can be applied to the analysis of monolayers of complex lipid mixtures that act as a model to understand the organization of cell membranes into solid-like domains called lipid rafts. The ability to determine the molecular distribution of lipids using ToF-SIMS in monolayer films is also important in studies of the function of pulmonary surfactant. One of the limitations of the use of ToF-SIMS to studies of complex lipid mixtures found in biological systems, arises from the similarity of the mass fragments that are emitted from the components of the lipid mixture. The use of selectively deuterated components in a mixture overcomes this limitation and results in an unambiguous assignment of specific lipids to particular surface domains. The use of deuterium labeling to identify specific lipids in a multi-component mixture can be done by the deuteration of a single lipid or by the addition of more than one lipid with selectively deuterated components. The incorporation of deuterium into the lipid chains does not alter the miscibility or phase behavior of these systems. The use of deuterium labeling to identify lipids and determine their distribution in monolayer films will be demonstrated using two biological systems. Principal components analysis (PCA) is used to further analyze these deuterated systems checking for the origin of the various mass fragments present.
Magnetic unmixing of first-order reversal curve diagrams using principal component analysis
Lascu, Ioan; Harrison, Richard; Li, Yuting; Piotrowski, Alexander; Channell, James; Muraszko, Joy; Hodell, David
2015-04-01
We have developed a magnetic unmixing method based on principal component analysis (PCA) of entire first-order reversal curve (FORC) diagrams. FORC diagrams are an advanced hysteresis technique that allows the quantitative characterisation of magnetic grain size, domain state, coercivity and spatial distribution of ensembles of particles within a sample. PCA has been previously applied on extracted central ridges from FORC diagrams of sediment samples containing single domain (SD) magnetite produced by magnetotactic bacteria (Heslop et al., 2014). We extend this methodology to the entire FORC space, which incorporates additional SD signatures, pseudo-single domain (PSD) and multi domain (MD) magnetite signatures, as well as fingerprints of other minerals, such as hematite (HEM). We apply the PCA by resampling the FORC distribution on a regular grid designed to encompass all significant features. Typically 80-90% of the variability within the FORC dataset is described by one or two principal components. Individual FORCs are recast as linear combinations of physically distinct end-member FORCs defined using the principal components and constraints derived from physical modelling. In a first case study we quantify the spatial variation of end-member components in surficial sediments along the North Atlantic Deep Water (NADW) from Iceland to Newfoundland. The samples have been physically separated into granulometric fractions, which added a further constraint in determining three end members used to model the magnetic ensemble, namely a coarse silt-sized MD component, a fine silt-sized PSD component, and a mixed clay-sized component containing both SD magnetite and hematite (SD+HEM). Sediments from core tops proximal to Iceland are dominated by the SD+HEM component, whereas those closer to Greenland and Canada are increasingly dominated by MD grains. Iceland sediments follow a PSD to SD+HEM trend with increasing grain-size fraction, whereas the Greenland and North
Improved gene prediction by principal component analysis based autoregressive Yule-Walker method.
Roy, Manidipa; Barman, Soma
2016-01-10
Spectral analysis using Fourier techniques is popular with gene prediction because of its simplicity. Model-based autoregressive (AR) spectral estimation gives better resolution even for small DNA segments but selection of appropriate model order is a critical issue. In this article a technique has been proposed where Yule-Walker autoregressive (YW-AR) process is combined with principal component analysis (PCA) for reduction in dimensionality. The spectral peaks of DNA signal are used to detect protein-coding regions based on the 1/3 frequency component. Here optimal model order selection is no more critical as noise is removed by PCA prior to power spectral density (PSD) estimation. Eigenvalue-ratio is used to find the threshold between signal and noise subspaces for data reduction. Superiority of proposed method over fast Fourier Transform (FFT) method and autoregressive method combined with wavelet packet transform (WPT) is established with the help of receiver operating characteristics (ROC) and discrimination measure (DM) respectively.
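The 1/3 frequency component mentioned above can be illustrated with the plain Fourier baseline that the abstract compares against. The following is a minimal sketch, not the authors' YW-AR plus PCA method: it maps a DNA string to four binary indicator sequences (one per base) and sums their DFT power at frequency 1/3. The function name `period3_power` and the toy sequences are assumptions made for illustration.

```python
import cmath

def period3_power(sequence):
    """Normalized DFT power at frequency 1/3, summed over the A, C, G, T
    binary indicator sequences of `sequence` (hypothetical helper)."""
    n = len(sequence)
    total = 0.0
    for base in "ACGT":
        # DFT coefficient of the indicator sequence for this base at f = 1/3
        coeff = sum(cmath.exp(-2j * cmath.pi * pos / 3)
                    for pos, symbol in enumerate(sequence) if symbol == base)
        total += abs(coeff) ** 2
    return total / n

# Toy inputs: a strictly period-3 string and a period-2 string
codon_like = "ATG" * 20
no_period3 = "AT" * 30
print(period3_power(codon_like), period3_power(no_period3))
```

In a real gene-prediction setting this statistic would typically be evaluated in a sliding window along the sequence; the window length is a tuning choice that the abstract does not specify.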
Serpen, G; Iyer, R; Elsamaloty, H M; Parsai, E I
2003-03-01
The present work addresses the development of an automated software-based system utilized in order to create an outline reconstruction of lung images from ventilation-perfusion scans for the purpose of diagnosing pulmonary embolism. The proposed diagnostic software procedure would require a standard set of digitized ventilation-perfusion scans in addition to correlated chest X-rays as key components in the identification of an ideal template match used to approximate and reconstruct the outline of the lungs. These reconstructed lung images would then be used to extract the necessary PIOPED-compliant features which would warrant a pulmonary embolism diagnosis. In order to evaluate this issue, two separate principal component analysis (PCA) algorithms were employed independently, including Eigenlungs, which was adapted from the Eigenfaces method, and an artificial neural network. The results obtained through MATLAB(TM) simulation indicated that lung outline reconstruction through the PCA approach carries significant viability.
Institute of Scientific and Technical Information of China (English)
2011-01-01
In order to analyse the consumption structure of rural residents, the paper performs a principal component analysis on the per capita consumption expenditure of rural residents in different areas in 2009, based on statistics from the China Statistical Yearbook 2010. Selecting a principal component, the paper ranks the 31 provinces of China. Shanghai ranks first with the highest score; the southeastern coastal provinces, the Northeast, Beijing and Tianjin are near the top; the northern and central regions, represented by Hebei, Shanxi and Hubei, have negative scores, slightly below the average; and western regions such as Guizhou, Xizang and Gansu lag far behind. The paper also analyses the consumption structure of rural residents and proposes suggestions on how to accelerate the consumption of rural residents.
Corriveau, H; Arsenault, A B; Dutil, E; Lepage, Y
1992-01-01
An evaluation based on the Bobath approach to treatment has previously been developed and partially validated. The purpose of the present study was to verify the content validity of this evaluation with the use of a statistical approach known as principal components analysis. Thirty-eight hemiplegic subjects participated in the study. Analysis of the scores on each of six parameters (sensorium, active movements, muscle tone, reflex activity, postural reactions, and pain) was evaluated on three occasions across a 2-month period. Each time this produced three factors that contained 70% of the variation in the data set. The first component mainly reflected variations in mobility, the second mainly variations in muscle tone, and the third mainly variations in sensorium and pain. The results of such exploratory analysis highlight the fact that some of the parameters are not only important but also interrelated. These results seem to partially support the conceptual framework substantiating the Bobath approach to treatment.
Directory of Open Access Journals (Sweden)
M. A. Islam
2014-06-01
The aim of the present study was to get a total physical and chemical characterization and comparison of the principal components in Bangladeshi buffalo (B), Holstein cross (HX), Indigenous cattle (IC) and Red Chittagong Cattle (RCC) milk. Protein and casein (CN) composition and type, casein micellar size (CMS), naturally occurring peptides, free amino acids, fat, milk fat globule size (MFGS), fatty acid composition, carbohydrates, total and individual minerals were analyzed. These components are related to technological and nutritional properties of milk. Consequently, they are important for the dairy industry and in animal feeding and breeding strategies. Considerable variation in most of the principal components of milk was observed among the animals. The milk of RCC and IC contained higher protein, CN, β-CN, whey protein, lactose, total mineral and P. They were more or less similar in most of the other components. The B milk was found higher in CN number, in the content of αs2-, κ-CN and α-lactalbumin, free amino acids, unsaturated fatty acids, Ca and Ca:P. The B milk was also lower in β-lactoglobulin content and had the largest CMS and MFGS. The proportion of CN to whey protein was lower in HX milk, and this milk was found higher in β-lactoglobulin and naturally occurring peptides. Considering the results obtained, including the ratio of αs1-, αs2-, β- and κ-CN, B and RCC milk showed the best data from both nutritional and technological aspects.
Directory of Open Access Journals (Sweden)
Hemant Pathak
2011-01-01
Groundwater is one of the major sources of drinking water in Sagar city (India). In this study, 15 sampling stations were selected for the investigation of 14 chemical parameters. The work was carried out during different months of the pre-monsoon, monsoon and post-monsoon seasons from June 2009 to June 2010. Multivariate statistics such as principal component and cluster analysis were applied to the datasets to investigate seasonal variations in groundwater quality. Principal axis factoring was used to observe the mode of association of parameters and their interrelationships, for evaluating water quality. Average values of BOD, COD, ammonia and iron were high during the entire study period. Elevated values of BOD and ammonia in the monsoon, a slightly higher value of BOD in the post-monsoon, and elevated BOD, ammonia and iron in the pre-monsoon period reflected a temporal effect on groundwater. Results of principal component analysis showed that all the parameters contribute equally and significantly to groundwater quality variations. Factor 1 and factor 2 analysis revealed that the DO value deteriorates due to organic load (BOD/ammonia) in different seasons. Hierarchical cluster analysis grouped the 15 stations into four clusters in the monsoon, five clusters in the post-monsoon and five clusters in the pre-monsoon with similar water quality features. The clustered groups in the monsoon, post-monsoon and pre-monsoon periods included one station exhibiting significant spatial variation in physicochemical composition. The anthropogenic nitrogenous species, as fallout from modernization activities. The study indicated that the groundwater is sufficiently well oxygenated and nutrient-rich in the study places.
On the neural networks of empathy: A principal component analysis of an fMRI study
Directory of Open Access Journals (Sweden)
Wittsack Hans-Jörg
2008-09-01
Background: Human emotional expressions serve an important communicatory role allowing the rapid transmission of valence information among individuals. We aimed at exploring the neural networks mediating the recognition of and empathy with human facial expressions of emotion. Methods: A principal component analysis was applied to event-related functional magnetic imaging (fMRI) data of 14 right-handed healthy volunteers (29 +/- 6 years). During scanning, subjects viewed happy, sad and neutral face expressions in the following conditions: emotion recognition, empathizing with emotion, and a control condition of simple object detection. Functionally relevant principal components (PCs) were identified by planned comparisons at an alpha level of p Results: Four PCs revealed significant differences in variance patterns of the conditions, thereby revealing distinct neural networks: mediating facial identification (PC 1), identification of an expressed emotion (PC 2), attention to an expressed emotion (PC 12), and sense of an emotional state (PC 27). Conclusion: Our findings further the notion that the appraisal of human facial expressions involves multiple neural circuits that process highly differentiated cognitive aspects of emotion.
Chen, Tung-Chien; Liu, Wentai; Chen, Liang-Gee
2008-01-01
On-chip spike detection and principal component analysis (PCA) sorting hardware in an integrated multi-channel neural recording system is highly desired to ease the bandwidth bottleneck from high-density microelectrode arrays implanted in the cortex. In this paper, we propose the first leading eigenvector generator, the key hardware module of PCA, to enable the whole framework. Based on the iterative eigenvector distilling algorithm, the proposed flipped structure enables a low-cost and low-power implementation by discarding the division and square root hardware units. Further, the proposed adaptive level shifting scheme optimizes the accuracy and area trade-off by dynamically increasing the quantization parameter according to the signal level. With the specification of four principal components/channel, 32 samples/spike, and nine bits/sample, the proposed hardware can train 312 channels per minute with a 1 MHz operation frequency. A silicon area of 0.13 mm² and a power consumption of 282 µW are required in a 90 nm 1P9M CMOS process.
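For readers unfamiliar with iterative leading-eigenvector computation, the following is a minimal sketch of textbook power iteration on a sample covariance matrix. It is not the paper's eigenvector distilling algorithm or its flipped hardware structure (which avoids division and square root, whereas this sketch uses both for normalization); the function names and the toy two-sample "spikes" are assumptions made for illustration.

```python
import math

def covariance_matrix(samples):
    """Sample covariance of the rows in `samples` (list of equal-length lists)."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in samples]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def leading_eigenvector(matrix, iterations=200):
    """Generic power iteration: multiply by the matrix, then renormalize."""
    dim = len(matrix)
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # zero matrix: no dominant direction to find
        vec = [x / norm for x in nxt]
    return vec

# Toy data (hypothetical): four "spikes" of two samples each, strongly correlated
spikes = [[2.0, 1.9], [1.0, 1.1], [-1.0, -0.9], [-2.0, -2.1]]
pc1 = leading_eigenvector(covariance_matrix(spikes))
print(pc1)
```

Further components can be obtained by deflation, that is, subtracting the found component from the covariance matrix and iterating again; whether the paper's hardware does this in the same way is not stated in the abstract.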
Denoising of MR spectroscopic imaging data using statistical selection of principal components.
Abdoli, Abas; Stoyanova, Radka; Maudsley, Andrew A
2016-12-01
To evaluate a new denoising method for MR spectroscopic imaging (MRSI) data based on selection of signal-related principal components (SSPCs) from principal components analysis (PCA). A PCA-based method was implemented for selection of signal-related PCs and denoising achieved by reconstructing the original data set utilizing only these PCs. Performance was evaluated using simulated MRSI data and two volumetric in vivo MRSIs of human brain, from a normal subject and a patient with a brain tumor, using variable signal-to-noise ratios (SNRs), metabolite peak areas, Cramer-Rao bounds (CRBs) of fitted metabolite peak areas and metabolite linewidth. In simulated data, SSPC determined the correct number of signal-related PCs. For in vivo studies, the SSPC denoising resulted in improved SNRs and reduced metabolite quantification uncertainty compared to the original data and two other methods for denoising. The method also performed very well in preserving the spectral linewidth and peak areas. However, this method performs better for regions that have larger numbers of similar spectra. The proposed SSPC denoising improved the SNR and metabolite quantification uncertainty in MRSI, with minimal compromise of the spectral information, and can result in increased accuracy.
Dietary patterns in Irish adolescents: a comparison of cluster and principal component analyses.
Hearty, Áine P; Gibney, Michael J
2013-05-01
Pattern analysis of adolescent diets may provide an important basis for nutritional health promotion. The aims of the present study were to examine and compare dietary patterns in adolescents using cluster analysis and principal component analysis (PCA) and to examine the impact of the format of the dietary variables on the solutions. Analysis was based on the Irish National Teens Food Survey, in which food intake data were collected using a semi-quantitative 7 d food diary. Thirty-two food groups were created and were expressed as either g/d or percentage contribution to total energy. Dietary patterns were identified using cluster analysis (k-means) and PCA. Republic of Ireland, 2005-2006. A representative sample of 441 adolescents aged 13-17 years. Five clusters based on percentage contribution to total energy were identified, 'Healthy', 'Unhealthy', 'Rice/Pasta dishes', 'Sandwich' and 'Breakfast cereal & Main meal-type foods'. Four principal components based on g/d were identified which explained 28 % of total variance: 'Healthy foods', 'Traditional foods', 'Sandwich foods' and 'Unhealthy foods'. A 'Sandwich' and an 'Unhealthy' pattern are the main dietary patterns in this sample. Patterns derived from either cluster analysis or PCA were comparable, although it appears that cluster analysis also identifies dietary patterns not identified through PCA, such as a 'Breakfast cereal & Main meal-type foods' pattern. Consideration of the format of the dietary variable is important as it can directly impact on the patterns obtained for both cluster analysis and PCA.
Comber, Alexis J.; Harris, Paul; Tsutsumida, Narumasa
2016-09-01
This study demonstrates the use of a geographically weighted principal components analysis (GWPCA) of remote sensing imagery to improve land cover classification accuracy. A principal components analysis (PCA) is commonly applied in remote sensing but generates global, spatially-invariant results. GWPCA is a local adaptation of PCA that locally transforms the image data, and in doing so, can describe spatial change in the structure of the multi-band imagery, thus directly reflecting that many landscape processes are spatially heterogenic. In this research the GWPCA localised loadings of MODIS data are used as textural inputs, along with GWPCA localised ranked scores and the image bands themselves to three supervised classification algorithms. Using a reference data set for land cover to the west of Jakarta, Indonesia the classification procedure was assessed via training and validation data splits of 80/20, repeated 100 times. For each classification algorithm, the inclusion of the GWPCA loadings data was found to significantly improve classification accuracy. Further, but more moderate improvements in accuracy were found by additionally including GWPCA ranked scores as textural inputs, data that provide information on spatial anomalies in the imagery. The critical importance of considering both spatial structure and spatial anomalies of the imagery in the classification is discussed, together with the transferability of the new method to other studies. Research topics for method refinement are also suggested.
Spectral principal component analysis of mid-infrared spectra of a sample of PG QSOs
Bian, Wei-Hao; Green, Richard; Shi, Yong; Ge, Xue; Liu, Wen-Shuai
2015-01-01
A spectral principal component analysis (SPCA) of a sample of 87 PG QSOs at $z < 0.5$ is presented for their mid-infrared spectra from Spitzer Space Telescope. We have derived the first five eigenspectra, which account for 85.2\% of the mid-infrared spectral variation. It is found that the first eigenspectrum represents the mid-infrared slope, forbidden emission line strength and $9.7~\mu m$ silicate feature, the 3rd and 4th eigenspectra represent the silicate features at $18~\mu m$ and $9.7~\mu m$, respectively. With the principal components (PC) from optical PCA, we find that there is a medium strong correlation between spectral SPC1 and PC2 (accretion rate). It suggests that more nuclear contribution to the near-IR spectrum leads to the change of mid-IR slope. We find mid-IR forbidden lines are suppressed with higher accretion rate. A medium strong correlation between SPC3 and PC1 (Eddington ratio) suggests a connection between the silicate feature at $18~\mu m$ and the Eddington ratio. For the ratio o...
Pratiwi, Destari; Fawcett, J Paul; Gordon, Keith C; Rades, Thomas
2002-11-01
Ranitidine hydrochloride exists as two polymorphs, forms I and II, both of which are used to manufacture commercial tablets. Raman spectroscopy can be used to differentiate the two forms but univariate methods of quantitative analysis of one polymorph as an impurity in the other lack sensitivity. We have applied principal components analysis (PCA) of Raman spectra to binary mixtures of the two polymorphs and to binary mixtures prepared by adding one polymorph to powdered tablets of the other. Based on absorption measurements of seven spectral regions, it was found that >97% of the spectral variation was accounted for by three principal components. Quantitative calibration models generated by multiple linear regression predicted a detection limit and quantitation limit for either forms I or II in mixtures of the two of 0.6 and 1.8%, respectively. This study demonstrates that PCA of Raman spectroscopic data provides a sensitive method for the quantitative analysis of polymorphic impurities of drugs in commercial tablets with a quantitation limit of less than 2%.
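Statements such as ">97% of the spectral variation was accounted for by three principal components" rest on the cumulative share of the PCA eigenvalues. The following is a minimal sketch of that bookkeeping; the function name `components_for_variance` and the seven eigenvalues are hypothetical and are not taken from the study.

```python
def components_for_variance(eigenvalues, target=0.97):
    """Smallest number of leading components whose eigenvalues sum to at
    least `target` of the total variance (hypothetical helper)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return count
    return len(ordered)

# Made-up eigenvalues for seven variables; cumulative shares are 0.60, 0.85, 0.98, ...
example_eigenvalues = [6.0, 2.5, 1.3, 0.1, 0.06, 0.03, 0.01]
n_components = components_for_variance(example_eigenvalues)
print(n_components)
```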
Finger crease pattern recognition using Legendre moments and principal component analysis
Institute of Scientific and Technical Information of China (English)
Rongfang Luo; Tusheng Lin
2007-01-01
The finger joint lines, defined as finger creases, and their distribution can identify a person. In this paper, we propose a new finger crease pattern recognition method based on Legendre moments and principal component analysis (PCA). After obtaining the region of interest (ROI) for each finger image in the preprocessing stage, Legendre moments under Radon transform are applied to construct a moment feature matrix from the ROI, which greatly decreases the dimensionality of the ROI and can represent the principal components of the finger creases quite well. Then, an approach to finger crease pattern recognition is designed based on the Karhunen-Loeve (K-L) transform. The method applies PCA to a moment feature matrix rather than the original image matrix to obtain the feature vector. The proposed method has been tested on a database of 824 images from 103 individuals using the nearest neighbor classifier. An accuracy of up to 98.584% was obtained when using 4 samples per class for training. The experimental results demonstrate that our proposed approach is feasible and effective in biometrics.
Sousa, C C; Damasceno-Silva, K J; Bastos, E A; Rocha, M M
2015-12-07
Vigna unguiculata (L.) Walp (cowpea) is a food crop with high nutritional value that is cultivated throughout tropical and subtropical regions of the world. The main constraint on high productivity of cowpea is water deficit, caused by the long periods of drought that occur in these regions. The aim of the present study was to select elite cowpea genotypes with enhanced drought tolerance, by applying principal component analysis to 219 first-cycle progenies obtained in a recurrent selection program. The experimental design comprised a simple 15 x 15 lattice with 450 plots, each of two rows of 10 plants. Plants were grown under water-deficit conditions by applying a water depth of 205 mm representing one-half of that required by cowpea. Variables assessed were flowering, maturation, pod length, number and mass of beans/pod, mass of 100 beans, and productivity/plot. Ten elite cowpea genotypes were selected, in which principal components 1 and 2 encompassed variables related to yield (pod length, beans/pod, and productivity/plot) and life precocity (flowering and maturation), respectively.
Trevizani, Gabriela A; Nasario-Junior, Olivassé; Benchimol-Barbosa, Paulo R; Silva, Lilian P; Nadal, Jurandir
2016-07-01
The purpose of this study was to investigate the application of the principal component analysis (PCA) technique to the power spectral density function (PSD) of consecutive normal RR intervals (iRR), aiming at assessing its ability to discriminate healthy women according to age group: young group (20-25 years old) and middle-aged group (40-60 years old). Thirty healthy and non-smoking female volunteers were investigated (13 young [mean ± SD (median): 22·8 ± 0·9 years (23·0)] and 17 middle-aged [51·7 ± 5·3 years (50·0)]). The iRR sequence was collected over ten minutes, breathing spontaneously, in the supine position and in the morning, using a heart rate monitor. After selecting an iRR segment (5 min) with the smallest variance, an autoregressive model was used to estimate the PSD. Five principal component coefficients, extracted from PSD signals, were retained for analysis according to the Mahalanobis distance classifier. A threshold established by logistic regression allowed the separation of the groups with 100% specificity, 83·2% sensitivity and 93·3% total accuracy. The PCA appropriately classified two groups of women in relation to age (young and middle-aged) based on PSD analysis of consecutive normal RR intervals.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Directory of Open Access Journals (Sweden)
Nan Lin
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis.
The personal lift-assist device and lifting technique: a principal component analysis.
Sadler, Erin M; Graham, Ryan B; Stevenson, Joan M
2011-04-01
The personal lift-assist device (PLAD) is a non-motorised, on-body device that acts as an external force generator using the concept of stored elastic energy. In this study, the effect of the PLAD on the lifting kinematics of male and female lifters was investigated using principal component analysis. Joint kinematic data of 15 males and 15 females were collected using an opto-electronic system during a freestyle, symmetrical-lifting protocol with and without wearing the PLAD. Of the 31 Principal Components (PCs) retained in the models, eight scores were significantly different between the PLAD and no-PLAD conditions. There were no main effects for gender and no significant interactions. Results indicated that the PLAD similarly affected the lifting kinematics of males and females; demonstrating significantly less lumbar and thoracic flexion and significantly greater hip and ankle flexion when wearing the PLAD. These findings add to the body of work that suggest the PLAD may be a safe and effective ergonomic aid. STATEMENT OF RELEVANCE: The PLAD is an ergonomic aid that has been shown to be effective at reducing low back demands during manual materials handling tasks. This body of work establishes that the PLAD encourages safe lifting practices without adversely affecting lifting technique.
Directory of Open Access Journals (Sweden)
Rockson Dobgegah
2011-03-01
The study adopts a data reduction technique to examine the presence of any complex structure among a set of project management competency variables. A structured survey questionnaire was administered to 100 project managers to elicit relevant data, and this achieved a relatively high response rate of 54%. After satisfying all the necessary tests of reliability of the survey instrument, sample size adequacy and population matrix, the data were subjected to principal component analysis, resulting in the identification of six new thematic project management competency areas, which were explained in terms of human resource management and project control; construction innovation and communication; project financial resources management; project risk and quality management; business ethics; and physical resources and procurement management. These knowledge areas now form the basis for lateral project management training requirements in the context of the Ghanaian construction industry. The key contribution of the paper is manifested in the use of principal component analysis, which has rigorously provided understanding of the complex structure and the relationship between the various knowledge areas. The originality and value of the paper is embedded in the use of contextual-task conceptual knowledge to expound the six uncorrelated empirical utility of the project management competencies.
Fault detection of flywheel system based on clustering and principal component analysis
Directory of Open Access Journals (Sweden)
Wang Rixin
2015-12-01
Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of the proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.
Fault detection of flywheel system based on clustering and principal component analysis
Institute of Scientific and Technical Information of China (English)
Wang Rixin; Gong Xuebing; Xu Minqiang; Li Yuqing
2015-01-01
Considering the nonlinear, multifunctional properties of double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands will ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of the proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying the new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.
Shah, Syed Muhammad Saqlain; Batool, Safeera; Khan, Imran; Ashraf, Muhammad Usman; Abbas, Syed Hussnain; Hussain, Syed Adnan
2017-09-01
Automatic diagnosis of human diseases is mostly achieved through decision support systems. The performance of these systems depends mainly on the selection of the most relevant features. This becomes harder when the dataset contains missing values for the different features. Probabilistic Principal Component Analysis (PPCA) has a reputation for dealing with the problem of missing attribute values. This research presents a methodology which uses the results of medical tests as input, extracts a reduced dimensional feature subset and provides diagnosis of heart disease. The proposed methodology extracts high impact features in a new projection by using PPCA. PPCA extracts the projection vectors which contribute the highest covariance, and these projection vectors are used to reduce the feature dimension. The selection of projection vectors is done through Parallel Analysis (PA). The feature subset with the reduced dimension is provided to radial basis function (RBF) kernel based Support Vector Machines (SVM). The RBF based SVM serves the purpose of classification into two categories, i.e., Heart Patient (HP) and Normal Subject (NS). The proposed methodology is evaluated through accuracy, specificity and sensitivity over the three datasets of UCI, i.e., Cleveland, Switzerland and Hungarian. The statistical results achieved through the proposed technique are presented in comparison to the existing research showing its impact. The proposed technique achieved an accuracy of 82.18%, 85.82% and 91.30% for the Cleveland, Hungarian and Switzerland datasets respectively.
Fault detection of excavator's hydraulic system based on dynamic principal component analysis
Institute of Scientific and Technical Information of China (English)
HE Qing-hua; HE Xiang-yu; ZHU Jian-xin
2008-01-01
In order to improve the reliability of the excavator's hydraulic system, a fault detection approach based on dynamic principal component analysis (PCA) was proposed. Dynamic PCA is an extension of PCA, which can effectively extract the dynamic relations among process variables. With this approach, normal samples were used as training data to develop a dynamic PCA model in the first step. Secondly, the dynamic PCA model decomposed the testing data into projections to the principal component subspace (PCS) and residual subspace (RS). Thirdly, the T² statistic and Q statistic served as indexes of fault detection in the PCS and RS, respectively. Several simulated faults were introduced to validate the approach. The results show that the dynamic PCA model developed is able to detect overall faults by using the T² statistic and Q statistic. By simulation analysis, the proposed approach achieves an accuracy of 95% for 20 test sample sets, which shows that the fault detection approach can be effectively applied to the excavator's hydraulic system.
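The three steps described in the abstract above (train on normal samples, project test data onto the principal component subspace and the residual subspace, then monitor T² and Q) can be sketched as follows. This is a minimal sketch under assumptions, not the authors' implementation: it uses NumPy, builds the dynamic model by appending time-lagged copies of the variables, and the function names (`lag_matrix`, `fit_dpca`, `t2_q`), the lag order and the number of retained components are hypothetical choices made for illustration.

```python
import numpy as np

def lag_matrix(X, lags):
    """Append `lags` time-lagged copies of every variable to each row."""
    n = X.shape[0]
    # Block k holds the samples delayed by k steps; all blocks have n - lags rows.
    return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])

def fit_dpca(X_train, lags=1, n_components=2):
    """Fit a PCA model on the lagged, standardised normal-operation data."""
    Z = lag_matrix(X_train, lags)
    mean, std = Z.mean(axis=0), Z.std(axis=0, ddof=1)
    Zs = (Z - mean) / std
    _, s, Vt = np.linalg.svd(Zs, full_matrices=False)
    eigvals = s**2 / (Zs.shape[0] - 1)          # variances along each PC
    return {
        "lags": lags, "mean": mean, "std": std,
        "P": Vt[:n_components].T,               # loadings spanning the PCS
        "eigvals": eigvals[:n_components],
    }

def t2_q(model, X):
    """Return Hotelling's T2 (in the PCS) and Q / SPE (in the RS) per sample."""
    Z = (lag_matrix(X, model["lags"]) - model["mean"]) / model["std"]
    scores = Z @ model["P"]
    t2 = np.sum(scores**2 / model["eigvals"], axis=1)
    residual = Z - scores @ model["P"].T
    q = np.sum(residual**2, axis=1)
    return t2, q
```

In use, a sample would be flagged when its T² or Q value exceeds a control limit estimated from the training data; how that limit is set (for example an empirical percentile) is a further assumption not specified in the abstract.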
Carta, M G; Coppo, P; Reda, M A; Mounkuoro, P P; Carpiniello, B
1999-05-01
The present paper reports the findings of principal components analysis performed on the basis of answers to the Questionnaire pour le Depistage en Santé Mentale (QDSM) administered to subjects from the Bandiagara plateau (Mali), who had been evaluated in a previously published report. The study sample was made up of 466 subjects (253 males, 213 females), 273 of whom belonged to the Dogon ethnic group, 163 were Peul and the remaining 30 belonged to other groups (Sonrai, Bozo, Tuareg, Bambara). All subjects were administered the QDSM, a structured interview derived from the Self Reporting Questionnaire. Data obtained were processed by means of principal components analysis, in order to obtain syndromic aggregations. Eight factors with an eigenvalue greater than 1 were extracted, which provided sufficient explanation for the overall variance observed among the 23 items. These factors may be termed as follows: Sadness (factor 1); Dysphoria (factor 2); Nightmares (factor 3); Persecution (factor 4); Somatic symptoms (factor 5); Special powers (factor 6); Hopelessness (factor 7); Loss of Interest (factor 8). The findings from this study support the hypothesis of an independence of "psychosomatic" from depressive symptoms. In particular, contrary to some evidence derived from other African studies, the present research appears to suggest a possible counterposition of these two ways of expressing depression, commonly considered as autonomous.
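The retention rule mentioned in the abstract above (keep factors whose eigenvalue exceeds 1) can be illustrated with a short sketch. It assumes NumPy and that the eigenvalues come from the item correlation matrix; the function name `kaiser_retain` is hypothetical, and no rotation step is included, so this is not a reproduction of the study's analysis.

```python
import numpy as np

def kaiser_retain(X):
    """Count components with eigenvalue > 1 for data X (subjects x items).

    Returns the number retained, all eigenvalues in descending order, and the
    share of total variance explained by the retained components.
    """
    R = np.corrcoef(X, rowvar=False)            # items are in columns
    eigvals = np.linalg.eigvalsh(R)[::-1]       # eigvalsh returns ascending order
    keep = eigvals > 1.0
    share = float(eigvals[keep].sum() / eigvals.sum())
    return int(keep.sum()), eigvals, share
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue above 1 means the component carries more variance than a single standardised item, which is the rationale for the threshold.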
Roopwani, Rahul; Buckner, Ira S
2011-10-14
Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures.
Sigirli, Deniz; Ercan, Ilker
2015-09-01
Most of the studies in medical and biological sciences are related to the examination of geometrical properties of an organ or organism. Growth and allometry studies are important for investigating the effects of diseases and environmental factors on the structure of the organ or organism. Thus, statistical shape analysis has recently become more important in the medical and biological sciences. Shape is all geometrical information that remains when location, scale and rotational effects are removed from an object. Allometry, which is a relationship between size and shape, plays an important role in the development of statistical shape analysis. The aim of the present study was to compare two different models for allometry, which include tangent coordinates and principal component scores of tangent coordinates as dependent variables in multivariate regression analysis. The results of the simulation study showed that the model constructed by taking tangent coordinates as dependent variables is more appropriate than the model constructed by taking principal component scores of tangent coordinates as dependent variables, for all sample sizes.
A Principal Component Analysis of global images of Jupiter obtained by Cassini ISS
Ordóñez Etxeberria, I.; Hueso, R.; Sánchez-Lavega, A.
2014-04-01
The Cassini spacecraft flew by Jupiter in December 2000. The Imaging Science Subsystem (ISS) cameras acquired a large number of images at different spatial resolutions in several filters sensitive to different altitudes and to cloud color. We have used these images to build high-resolution, multi-wavelength, nearly full maps of the planet in cylindrical and polar projections. The images have been analyzed by means of a principal component analysis (PCA) technique, which looks for spatial covariances in different filtered images and proposes a new set of images (Principal Components, PC) which contains most of the spatial variability. The goal of this research is threefold: (1) to explore correlations between the ammonia cloud layer observed in most filters and the upper hazes observed in methane band images and UV; (2) to explore the spatial distribution of chromophores, similarly to previous studies using HST images [1, 2]; and (3) to look for image combinations that could be useful for sharpening cloud features. Furthermore, we study a global characterization of the relative altimetry of clouds and hazes from synthetic indexes between images with different contributions from the methane absorption bands (CB1, CB2, CB3, MT1, MT2, MT3).
Super-sparse principal component analyses for high-throughput genomic data
Directory of Open Access Journals (Sweden)
Lee Youngjo
2010-06-01
Background: Principal component analysis (PCA) has gained popularity as a method for the analysis of high-dimensional genomic data. However, it is often difficult to interpret the results because the principal components are linear combinations of all variables, and the coefficients (loadings) are typically nonzero. These nonzero values also reflect poor estimation of the true vector loadings; for example, for gene expression data, biologically we expect only a portion of the genes to be expressed in any tissue, and an even smaller fraction to be involved in a particular process. Sparse PCA methods have recently been introduced for reducing the number of nonzero coefficients, but these existing methods are not satisfactory for high-dimensional data applications because they still give too many nonzero coefficients. Results: Here we propose a new PCA method that uses two innovations to produce an extremely sparse loading vector: (i) a random-effect model on the loadings that leads to an unbounded penalty at the origin and (ii) shrinkage of the singular values obtained from the singular value decomposition of the data matrix. We develop a stable computing algorithm by modifying the nonlinear iterative partial least squares (NIPALS) algorithm, and illustrate the method with an analysis of the NCI cancer dataset that contains 21,225 genes. Conclusions: The new method has better performance than several existing methods, particularly in the estimation of the loading vectors.
Directory of Open Access Journals (Sweden)
Darabi , M. (MSC
2014-05-01
Background and Objective: Quality control of drinking water is important for maintaining the health and safety of consumers, and the first step is to study the water quality variables. This study aimed to evaluate the chemical and physical indicators, water quality variables and qualitative classification of drinking water stations and water sources in Boroujerd. Material and Methods: This descriptive cross-sectional study was conducted on 70 samples of drinking water and 10 samples from sources in 2011-2012. Nine water quality variables were measured and coded using STATISTICA10 software. Principal component analysis (PCA) was performed for qualitative classification of water samples and determination of water quality variables. Results: Based on PCA, chemical variables such as fluoride, nitrate, total hardness and iron, and physical variables such as pH and TDS, were of paramount importance to water quality. According to the t-test, the average concentrations of fluoride and iron, and the turbidity, in all samples were significantly less than the standard, while the other variables were up to standard. Conclusion: For large water quality data sets, the use of PCA to identify the main qualitative variables and to classify physical and chemical variables can be an effective approach in water quality management. Keywords: Physical and Chemical Indicators, Drinking Water and Sources, Boroujerd, Principal Component Analysis
Principal Component Analysis in the Spectral Analysis of the Dynamic Laser Speckle Patterns
Ribeiro, K. M.; Braga, R. A., Jr.; Horgan, G. W.; Ferreira, D. D.; Safadi, T.
2014-02-01
Dynamic laser speckle is a phenomenon in which optical patterns are formed by illuminating a changing surface with coherent light. The dynamic change of the speckle patterns caused by biological material is known as biospeckle. Usually, these patterns of optical interference evolving in time are analyzed by graphical or numerical methods. Analysis in the frequency domain has also been an option, but it involves large computational requirements, which demands new approaches to filter the images in time. Principal component analysis (PCA) works with the statistical decorrelation of data and can be used as a data filter. In this context, the present work evaluated the PCA technique to filter in time the data from the biospeckle images, aiming to reduce computing time and improve the robustness of the filtering. A total of 64 biospeckle images observed over time in a maize seed were used. The images were arranged in a data matrix and statistically decorrelated by the PCA technique, and the reconstructed signals were analyzed using the routine graphical and numerical methods for biospeckle analysis. Results showed the potential of the PCA tool in filtering the dynamic laser speckle data, with the definition of markers of principal components related to the biological phenomena and with the advantage of fast computational processing.
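The filtering procedure outlined above (arrange the image sequence in a data matrix, decorrelate it with PCA, reconstruct from selected components) can be sketched as below. This is an illustrative sketch only: it assumes NumPy, places one frame per row of the data matrix, and the function name `pca_time_filter` and the `keep` argument are hypothetical. The abstract does not state which arrangement or which components the authors used.

```python
import numpy as np

def pca_time_filter(stack, keep):
    """Reconstruct an image stack from a subset of principal components.

    stack : array of shape (frames, height, width)
    keep  : iterable of component indices to retain (0 = largest variance)
    """
    t, h, w = stack.shape
    D = stack.reshape(t, h * w).astype(float)   # one row per frame
    mean = D.mean(axis=0)
    U, s, Vt = np.linalg.svd(D - mean, full_matrices=False)
    mask = np.zeros_like(s)
    mask[list(keep)] = 1.0                      # zero out discarded components
    filtered = (U * (s * mask)) @ Vt + mean
    return filtered.reshape(t, h, w)
```

Retaining every component returns the original stack, so the choice of `keep` is what defines the filter; for example, `keep=range(1, 10)` would drop the first component and everything beyond the tenth.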
Directory of Open Access Journals (Sweden)
Manoj Tripathy
2012-01-01
This paper describes a new approach for power transformer differential protection which is based on the wave-shape recognition technique. An algorithm based on neural network principal component analysis (NNPCA) with back-propagation learning is proposed for digital differential protection of power transformers. The principal component analysis is used to preprocess the data from the power system in order to eliminate redundant information and enhance the hidden pattern of the differential current, so as to discriminate internal faults from inrush and overexcitation conditions. This algorithm has been developed by considering the optimal number of neurons in the hidden layer and the optimal number of neurons at the output layer. The proposed algorithm makes use of the ratio of voltage to frequency and the amplitude of the differential current for transformer operating condition detection. This paper presents a comparative study of power transformer differential protection algorithms based on the harmonic restraint method, NNPCA, feed forward back propagation neural network (FFBPNN), space vector analysis of the differential signal, and their time characteristic shapes in Park’s plane. The algorithms are compared as to their speed of response, computational burden, and the capability to distinguish between a magnetizing inrush and a power transformer internal fault. The mathematical basis for each algorithm is briefly described. All the algorithms are evaluated using simulation performed with PSCAD/EMTDC and MATLAB.
Higher-order principal component pursuit via tensor approximation and convex optimization
Institute of Scientific and Technical Information of China (English)
Sijia Cai; Ping Wang; Linhao Li; Chuhan Zhang
2014-01-01
Recovering the low-rank structure of a data matrix from sparse errors arises in principal component pursuit (PCP). This paper exploits the higher-order generalization of matrix recovery, named higher-order principal component pursuit (HOPCP), since it is critical in multi-way data analysis. Unlike the convexification (nuclear norm) of the matrix rank function, the tensorial nuclear norm is still an open problem. Since existing preliminary works in the tensor completion field provide a viable way to indicate the low-complexity estimate of a tensor, the paper focuses on the low multi-linear rank tensor and adopts its convex relaxation to formulate the convex optimization model of HOPCP. The paper further proposes two algorithms for HOPCP based on an alternating minimization scheme: the augmented Lagrangian alternating direction method (ALADM) and its truncated higher-order singular value decomposition (ALADM-THOSVD) version. The former can obtain a high accuracy solution while the latter is more efficient in handling computationally intractable problems. Experimental results on both synthetic data and real magnetic resonance imaging data show the applicability of our algorithms in high-dimensional tensor data processing.
State and group dynamics of world stock market by principal component analysis
Nobi, Ashadun
2015-01-01
We study the dynamic interactions and structural changes in global financial indices in the years 1998-2012. We apply a principal component analysis (PCA) to cross-correlation coefficients of the stock indices. We calculate the correlations between principal components (PCs) and each asset, known as PC coefficients. A change in market state is identified as a change in the first PC coefficients. Some indices do not show significant change of PCs in market state during crises. The indices exposed to the invested capitals in the stock markets are at the minimum level of risk. Using the first two PC coefficients, we identify indices that are similar and more strongly correlated than the others. We observe that the European indices form a robust group over the observation period. The dynamics of the individual indices within the group increase in similarity with time, and the dynamics of indices are more similar during the crises. Furthermore, the group formation of indices changes position in two-dimensional spa...
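The quantity the abstract above calls "PC coefficients" (the correlation between each principal component and each index) can be computed as in the following sketch. It is written under assumptions rather than taken from the paper: it uses NumPy, takes a matrix of index returns with one column per index, and the function name `pc_coefficients` is hypothetical.

```python
import numpy as np

def pc_coefficients(returns):
    """PCA of the cross-correlation matrix of index returns.

    returns : array (time, indices)
    Gives eigenvalues (descending), eigenvectors (columns), and a matrix whose
    entry [i, k] is the correlation between index i and principal component k.
    """
    Z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
    C = (Z.T @ Z) / (Z.shape[0] - 1)            # cross-correlation matrix
    eigvals, eigvecs = np.linalg.eigh(C)        # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = Z @ eigvecs                           # PC time series
    n = Z.shape[1]
    full = np.corrcoef(np.hstack([Z, pcs]), rowvar=False)
    return eigvals, eigvecs, full[:n, n:]       # index-by-PC correlation block
```

For standardised data these correlations equal the eigenvector entries scaled by the square root of the corresponding eigenvalue, so plotting the first two columns against each other gives one way to view how indices group in a two-dimensional space.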
Butler, Rebecca A; Lambon Ralph, Matthew A; Woollams, Anna M
2014-12-01
Stroke aphasia is a multidimensional disorder in which patient profiles reflect variation along multiple behavioural continua. We present a novel approach to separating the principal aspects of chronic aphasic performance and isolating their neural bases. Principal components analysis was used to extract core factors underlying performance of 31 participants with chronic stroke aphasia on a large, detailed battery of behavioural assessments. The rotated principal components analysis revealed three key factors, which we labelled as phonology, semantic and executive/cognition on the basis of the common elements in the tests that loaded most strongly on each component. The phonology factor explained the most variance, followed by the semantic factor and then the executive-cognition factor. The use of principal components analysis rendered participants' scores on these three factors orthogonal and therefore ideal for use as simultaneous continuous predictors in a voxel-based correlational methodology analysis of high resolution structural scans. Phonological processing ability was uniquely related to left posterior perisylvian regions including Heschl's gyrus, posterior middle and superior temporal gyri and superior temporal sulcus, as well as the white matter underlying the posterior superior temporal gyrus. The semantic factor was uniquely related to left anterior middle temporal gyrus and the underlying temporal stem. The executive-cognition factor was not correlated selectively with the structural integrity of any particular region, as might be expected in light of the widely-distributed and multi-functional nature of the regions that support executive functions. The identified phonological and semantic areas align well with those highlighted by other methodologies such as functional neuroimaging and neurostimulation. The use of principal components analysis allowed us to characterize the neural bases of participants' behavioural performance more robustly and
Zha, N.; Capaldi, D. P. I.; Pike, D.; McCormack, D. G.; Cunningham, I. A.; Parraga, G.
2015-03-01
Pulmonary x-ray computed tomography (CT) may be used to characterize emphysema and airways disease in patients with chronic obstructive pulmonary disease (COPD). One analysis approach, parametric response mapping (PRM), utilizes registered inspiratory and expiratory CT image volumes and CT-density-histogram thresholds, but there is no consensus regarding the threshold values used, or their clinical meaning. Principal-component-analysis (PCA) of the CT density histogram can be exploited to quantify emphysema using data-driven CT-density-histogram thresholds. Thus, the objective of this proof-of-concept demonstration was to develop a PRM approach using PCA-derived thresholds in COPD patients and ex-smokers without airflow limitation. Methods: Fifteen COPD ex-smokers and 5 normal ex-smokers were evaluated. Thoracic CT images were also acquired at full inspiration and full expiration and these images were non-rigidly co-registered. PCA was performed for the CT density histograms, from which the components with the highest eigenvalues greater than one were summed. Since the values of the principal component curve correlate directly with the variability in the sample, the maximum and minimum points on the curve were used as threshold values for the PCA-adjusted PRM technique. Results: A significant correlation was determined between conventional and PCA-adjusted PRM with 3He MRI apparent diffusion coefficient (p<0.001), with CT RA950 (p<0.0001), as well as with 3He MRI ventilation defect percent, a measurement of both small airways disease (p=0.049 and p=0.06, respectively) and emphysema (p=0.02). Conclusions: PRM generated using PCA thresholds of the CT density histogram showed significant correlations with CT and 3He MRI measurements of emphysema, but not airways disease.
Magnetic unmixing of first-order reversal curve diagrams using principal component analysis
Lascu, Ioan; Harrison, Richard J.; Li, Yuting; Muraszko, Joy R.; Channell, James E. T.; Piotrowski, Alexander M.; Hodell, David A.
2015-09-01
We describe a quantitative magnetic unmixing method based on principal component analysis (PCA) of first-order reversal curve (FORC) diagrams. For PCA, we resample FORC distributions on grids that capture diagnostic signatures of single-domain (SD), pseudosingle-domain (PSD), and multidomain (MD) magnetite, as well as of minerals such as hematite. Individual FORC diagrams are recast as linear combinations of end-member (EM) FORC diagrams, located at user-defined positions in PCA space. The EM selection is guided by constraints derived from physical modeling and imposed by data scatter. We investigate temporal variations of two EMs in bulk North Atlantic sediment cores collected from the Rockall Trough and the Iberian Continental Margin. Sediments from each site contain a mixture of magnetosomes and granulometrically distinct detrital magnetite. We also quantify the spatial variation of three EM components (a coarse silt-sized MD component, a fine silt-sized PSD component, and a mixed clay-sized component containing both SD magnetite and hematite) in surficial sediments along the flow path of the North Atlantic Deep Water (NADW). These samples were separated into granulometric fractions, which helped constrain EM definition. PCA-based unmixing reveals systematic variations in EM relative abundance as a function of distance along NADW flow. Finally, we apply PCA to the combined data set of Rockall Trough and NADW sediments, which can be recast as a four-EM mixture, providing enhanced discrimination between components. Our method forms the foundation of a general solution to the problem of unmixing multicomponent magnetic mixtures, a fundamental task of rock magnetic studies.
National Research Council Canada - National Science Library
S. Shahid Shaukat; Toqeer Ahmed Rao; Moazzam A. Khan
2016-01-01
...) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from environmental data matrix pertaining to water quality variables (p = 22...
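The bootstrap procedure referred to in the excerpt above (drawing 100 bootstrap samples from the data matrix and examining the resulting PCA eigenvalues) might look like the sketch below. Since the abstract is truncated, several details are assumptions: NumPy is used, rows are resampled with replacement, PCA is taken on the correlation matrix, and the function name `bootstrap_eigenvalues` and its parameters are hypothetical.

```python
import numpy as np

def bootstrap_eigenvalues(X, n_boot=100, seed=0):
    """Eigenvalues of the correlation matrix for bootstrap resamples of X.

    X : array (samples, variables). Returns an array (n_boot, variables) with
    each row holding one resample's eigenvalues in descending order.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    out = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample rows with replacement
        R = np.corrcoef(X[idx], rowvar=False)
        out[b] = np.linalg.eigvalsh(R)[::-1]
    return out
```

The spread of each column (for example `out.std(axis=0)`) then gives a rough measure of how stable each eigenvalue is at a given sample size.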
Ghosh, Debarchana; Manson, Steven M.
2008-01-01
In this paper, we present a hybrid approach, robust principal component geographically weighted regression (RPCGWR), in examining urbanization as a function of both extant urban land use and the effect of social and environmental factors in the Twin Cities Metropolitan Area (TCMA) of Minnesota. We used remotely sensed data to treat urbanization via the proxy of impervious surface. We then integrated two different methods, robust principal component analysis (RPCA) and geographically weighted ...
Harris, Paul; Clarke, Annemarie; Juggins, Steve; Brunsdon, Chris; Charlton, Martin
2015-01-01
In many physical geography settings, principal component analysis (PCA) is applied without consideration for important spatial effects, and in doing so, tends to provide an incomplete understanding of a given process. In such circumstances, a spatial adaptation of PCA can be adopted, and to this end, this study focuses on the use of geographically weighted principal component analysis (GWPCA). GWPCA is a localized version of PCA that is an appropriate exploratory tool when a ne...
Institute of Scientific and Technical Information of China (English)
Hashmi Imran; Khan M Altaf; Kim Jong-Guk
2006-01-01
A popular descriptive multivariate statistical method currently employed is principal component analysis (PCA). PCA is used to develop linear combinations that successively maximize the total variance of a sample where there is no known group structure. This study aimed at demonstrating the performance evaluation of a pilot activated sludge treatment system by inoculating a strain of Pseudomonas capable of degrading malathion, which was isolated by an enrichment technique. An intensive analytical program was followed for evaluating the efficiency of the biosimulator by maintaining the dissolved oxygen (DO) concentration at 4.0 mg/L. Analyses by high performance liquid chromatography revealed that 90% malathion removal was achieved within 29 h of treatment, whereas COD was reduced considerably during the treatment process and the mean removal efficiency was found to be 78%. The mean pH values increased gradually during the treatment process, ranging from 7.36 to 8.54. Similarly, the mean ammonia-nitrogen (NH3-N) values were found to fluctuate between 19.425 and 28.488 mg/L, mean nitrite-nitrogen (NO2-N) ranged between 1.301 and 2.940 mg/L and mean nitrate-nitrogen (NO3-N) ranged between 0.0071 and 0.0711 mg/L. The study revealed that inoculation of bacterial culture under laboratory conditions could be used in bioremediation of environmental pollution caused by xenobiotics. The PCA analyses showed that pH, COD, organic load and total malathion concentration were highly correlated and emerged as the variables controlling the first component, whereas dissolved oxygen, NO3-N and NH3-N governed the second component. The third component repeated the trend exhibited by the first two components.
Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang
2016-11-22
In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7-15 years) were available in the analysis. Cluster 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: "Average refraction", "Acceleration" and the combination of "Myopia stabilization" and "Late onset of refraction progress". In regression models, younger children with more severe myopia were associated with larger "Acceleration". The risk factors of "Acceleration" included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with "Stabilization", and increased outdoor time was related to "Late onset of refraction progress". We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression.
Oil classification using X-ray scattering and principal component analysis
Energy Technology Data Exchange (ETDEWEB)
Almeida, Danielle S.; Souza, Amanda S.; Lopes, Ricardo T., E-mail: dani.almeida84@gmail.com, E-mail: ricardo@lin.ufrj.br, E-mail: amandass@bioqmed.ufrj.br [Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, RJ (Brazil); Oliveira, Davi F.; Anjos, Marcelino J., E-mail: davi.oliveira@uerj.br, E-mail: marcelin@uerj.br [Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro, RJ (Brazil). Inst. de Fisica Armando Dias Tavares
2015-07-01
X-ray scattering techniques have been considered promising for the classification and characterization of many types of samples. This study employed this technique combined with chemical analysis and multivariate analysis to characterize 54 vegetable oil samples (25 of them olive oils) with different properties obtained in commercial establishments in Rio de Janeiro city. The samples were chemically analyzed using the following indexes: iodine, acidity, saponification and peroxide. In order to obtain the X-ray scattering spectrum, an X-ray tube with a silver anode operating at 40 kV and 50 μA was used. The results showed that the oils can be divided into two large groups: olive oils and non-olive oils. Additionally, in a multivariate analysis (Principal Component Analysis, PCA), two components were obtained and accounted for more than 80% of the variance. One component was associated with chemical parameters and the other with the scattering profiles of each sample. Results showed that the use of X-ray scattering spectra combined with chemical analysis and PCA can be a fast, cheap and efficient method for vegetable oil characterization. (author)
Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis
Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.; Moseley, S. H.; Porst, J.-P.; Porter, F. S.; Sadleir, J. E.; Smith, S. J.
2016-07-01
The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an ^{55}Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.
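The decomposition of pulse records into orthogonal components of largest variance can be sketched with a singular value decomposition. This is only an illustration on synthetic data, not the authors' pipeline: the double-exponential pulse template, the noise level, and the function name `pca_svd` are all assumptions made for the example.

```python
import numpy as np

def pca_svd(records, n_components):
    """PCA of row-wise records via SVD of the mean-centered data matrix."""
    X = np.asarray(records, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are orthonormal directions ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s**2 / (X.shape[0] - 1)
    ratio = variance / variance.sum()
    components = Vt[:n_components]
    scores = Xc @ components.T  # coordinates of each record on the components
    return mean, components, scores, ratio[:n_components]

# Synthetic "pulses" (assumed shape): one template scaled by a random height.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
template = np.exp(-t / 0.2) - np.exp(-t / 0.02)
heights = rng.uniform(0.8, 1.2, size=300)
pulses = heights[:, None] * template[None, :]
pulses += 0.01 * rng.standard_normal((300, 200))

mean, components, scores, ratio = pca_svd(pulses, n_components=3)
height_corr = abs(np.corrcoef(scores[:, 0], heights)[0, 1])
print("variance fraction of first component:", ratio[0])
print("|correlation| of first score with pulse height:", height_corr)
```

In this toy setting the first score tracks pulse height almost perfectly; with real detector data further components would be expected to pick up effects such as arrival time and pulse shape, as the abstract describes.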
Ghosh, Antara; Barman, Soma
2016-06-01
Gene systems are extremely complex, heterogeneous, and noisy in nature. Many statistical tools which are used to extract relevant feature from genes provide fuzzy and ambiguous information. High-dimensional gene expression database available in public domain usually contains thousands of genes. Efficient prediction method is demanding nowadays for accurate identification of such database. Euclidean distance measurement and principal component analysis methods are applied on such databases to identify the genes. In both methods, prediction algorithm is based on homology search approach. Digital Signal Processing technique along with statistical method is used for analysis of genes in both cases. A two-level decision logic is used for gene classification as healthy or cancerous. This binary logic minimizes the prediction error and improves prediction accuracy. Superiority of the method is judged by receiver operating characteristic curve.
Credit Risk Assessment Model Based Using Principal component Analysis And Artificial Neural Network
Directory of Open Access Journals (Sweden)
Hamdy Abeer
2016-01-01
Credit risk assessment for bank customers has gained increasing attention in recent years. Several models for credit scoring have been proposed in the literature for this purpose. The accuracy of the model is crucial for any financial institution’s profitability. This paper provided a high accuracy credit scoring model that could be utilized with small and large datasets, utilizing a principal component analysis (PCA) based breakdown of the significance of the attributes commonly used in the credit scoring models. The proposed credit scoring model applied PCA to acquire the main attributes of the credit scoring data, then an ANN classifier to determine the credit worthiness of an individual applicant. The performance of the proposed model was compared to other models in terms of accuracy and training time. Results based on the German dataset showed that the proposed model is superior to others and computationally cheaper. Thus it can be a potential candidate for future credit scoring systems.
Directory of Open Access Journals (Sweden)
Pasi A. Karjalainen
2007-01-01
Ventricular repolarization duration (VRD) is affected by heart rate and autonomic control, and thus VRD varies in time in a similar way as heart rate. VRD variability is commonly assessed by determining the time differences between successive R- and T-waves, that is, RT intervals. Traditional methods for RT interval detection necessitate the detection of either T-wave apexes or offsets. In this paper, we propose a principal-component-regression- (PCR-) based method for estimating RT variability. The main benefit of the method is that it does not necessitate T-wave detection. The proposed method is compared with traditional RT interval measures, and as a result, it is observed to estimate RT variability accurately and to be less sensitive to noise than the traditional methods. As a specific application, the method is applied to exercise electrocardiogram (ECG) recordings.
Directory of Open Access Journals (Sweden)
Shengkun Xie
2014-01-01
Classification of electroencephalography (EEG) is the most useful diagnostic and monitoring procedure for epilepsy study. A reliable algorithm that can be easily implemented is the key to this procedure. In this paper a novel signal feature extraction method based on dynamic principal component analysis and nonoverlapping moving window is proposed. Along with this new technique, two detection methods based on extracted sparse features are applied to deal with signal classification. The obtained results demonstrated that our proposed methodologies are able to differentiate EEGs from controls and interictal for epilepsy diagnosis and to separate EEGs from interictal and ictal for seizure detection. Our approach yields high classification accuracy for both single-channel short-term EEGs and multichannel long-term EEGs. The classification performance of the method is also compared with other state-of-the-art techniques on the same datasets and the effect of signal variability on the presented methods is also studied.
Analysis of breast cancer progression using principal component analysis and clustering
Indian Academy of Sciences (India)
G Alexe; G S Dalgin; S Ganesan; C DeLisi; G Bhanot
2007-08-01
We develop a new technique to analyse microarray data which uses a combination of principal components analysis and consensus ensemble -clustering to find robust clusters and gene markers in the data. We apply our method to a public microarray breast cancer dataset which has expression levels of genes in normal samples as well as in three pathological stages of disease; namely, atypical ductal hyperplasia or ADH, ductal carcinoma in situ or DCIS and invasive ductal carcinoma or IDC. Our method averages over clustering techniques and data perturbation to find stable, robust clusters and gene markers. We identify the clusters and their pathways with distinct subtypes of breast cancer (Luminal, Basal and Her2+). We confirm that the cancer phenotype develops early (in early hyperplasia or ADH stage) and find from our analysis that each subtype progresses from ADH to DCIS to IDC along its own specific pathway, as if each was a distinct disease.
Plant-wide process monitoring based on mutual information-multiblock principal component analysis.
Jiang, Qingchao; Yan, Xuefeng
2014-09-01
Multiblock principal component analysis (MBPCA) methods are gaining increasing attentions in monitoring plant-wide processes. Generally, MBPCA assumes that some process knowledge is incorporated for block division; however, process knowledge is not always available. A new totally data-driven MBPCA method, which employs mutual information (MI) to divide the blocks automatically, has been proposed. By constructing sub-blocks using MI, the division not only considers linear correlations between variables, but also takes into account non-linear relations thereby involving more statistical information. The PCA models in sub-blocks reflect more local behaviors of process, and the results in all blocks are combined together by support vector data description. The proposed method is implemented on a numerical process and the Tennessee Eastman process. Monitoring results demonstrate the feasibility and efficiency.
Natural Product Discovery Using Planes of Principal Component Analysis in R (PoPCAR
Directory of Open Access Journals (Sweden)
Shaurya Chanana
2017-07-01
Rediscovery of known natural products hinders the discovery of new, unique scaffolds. Efforts have mostly focused on streamlining the determination of what compounds are known vs. unknown (dereplication), but an alternative strategy is to focus on what is different. Utilizing statistics and assuming that common actinobacterial metabolites are likely known, focus can be shifted away from dereplication and towards discovery. LC-MS-based principal component analysis (PCA) provides a perfect tool to distinguish unique vs. common metabolites, but the variability inherent within natural products leads to datasets that do not fit ideal standards. To simplify the analysis of PCA models, we developed a script that identifies only those masses or molecules that are unique to each strain within a group, thereby greatly reducing the number of data points to be inspected manually. Since the script is written in R, it facilitates integration with other metabolomics workflows and supports automated mass matching to databases such as Antibase.
Water quality of the Chhoti Gandak River using principal component analysis, Ganga Plain, India
Indian Academy of Sciences (India)
Vikram Bhardwaj; Dhruv Sen Singh; A K Singh
2010-02-01
Chhoti Gandak is a meandering river which originates in the terai area of the Ganga Plain and serves as a lifeline for the people of Deoria district, Uttar Pradesh. It travels a distance of about 250 km and drains into Ghaghara near Gothani, Siwan district of Bihar. It has been observed that people of this region suffer from water-borne health problems; therefore water samples were collected to analyse its quality along the entire length of Chhoti Gandak River. The principal components of water quality are controlled by lithology, gentle slope gradient, poor drainage, long residence of water, ion exchange, weathering of minerals, heavy use of fertilizers, and domestic wastes. At some stations water is hard with an excess alkalinity and is not suitable for drinking and irrigation purposes. The variation in the local and regional hydrogeochemical processes distinguished the geogenic sources from the anthropogenic one.
Institute of Scientific and Technical Information of China (English)
Kechang FU; Liankui DAI; Tiejun WU; Ming ZHU
2009-01-01
A new sensor fault diagnosis method based on structured kernel principal component analysis (KPCA) is proposed for nonlinear processes. By performing KPCA on subsets of variables, a set of structured residuals, i.e., scaled powers of KPCA, can be obtained in the same way as partial PCA. The structured residuals are utilized in composing an isolation scheme for sensor fault diagnosis, according to a properly designed incidence matrix. Sensor fault sensitivity and critical sensitivity are defined, based on which an incidence matrix optimization algorithm is proposed to improve the performance of the structured KPCA. The effectiveness of the proposed method is demonstrated on the simulated continuous stirred tank reactor (CSTR) process.
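As background for the kernel step only, plain KPCA can be sketched in a few lines. The structured residuals and incidence matrix of the paper are not reproduced here, and the RBF kernel, the value of `gamma`, and the random data are assumptions chosen purely for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
    sq -= 2.0 * (A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_pca(X, n_components, gamma):
    """Return training-point scores and the leading eigenvalues of the
    centered kernel matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.full((n, n), 1.0 / n)
    # Center the data in feature space without forming the feature map.
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]
    vecs = eigvecs[:, order]
    # Projection of training point i on component k is sqrt(lam_k) * vecs[i, k].
    scores = vecs * np.sqrt(np.maximum(lam, 0.0))
    return scores, lam

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))  # hypothetical process measurements
scores, lam = kernel_pca(X, n_components=2, gamma=0.5)
print(scores.shape, lam)
```

Applying the same routine to different subsets of the columns of `X` would be the natural starting point for a structured variant, though how the paper scales and combines the resulting residuals is not stated in the abstract.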
Directory of Open Access Journals (Sweden)
J. Pavlovicova
2007-04-01
In this contribution, the human face as a biometric is considered. An original method of feature extraction from image data is introduced using MLP (multilayer perceptron) and PCA (principal component analysis). This method is used in a human face recognition system and results are compared to a face recognition system using PCA directly, to a system with direct classification of input images by MLP and RBF (radial basis function) networks, and to a system using MLP as a feature extractor and MLP and RBF networks in the role of classifier. Also a two-stage method for face recognition is presented, in which a Kohonen self-organizing map is used as a feature extractor. MLP and RBF networks are used as classifiers. In order to obtain deeper insight into the presented methods, visualizations of the internal representation of input data obtained by neural networks are also presented.
Principal Component Analysis of Gait Kinematics Data in Acute and Chronic Stroke Patients
Directory of Open Access Journals (Sweden)
Ivana Milovanović
2012-01-01
We present the joint angles analysis by means of the principal component analysis (PCA). The data from twenty-seven acute and chronic hemiplegic patients were used and compared with data from five healthy subjects. The data were collected during walking along a 10-meter long path. The PCA was applied on a data set consisting of hip, knee, and ankle joint angles of the paretic and the nonparetic leg. The results point to significant differences in joint synergies between the acute and chronic hemiplegic patients that are not revealed when applying typical methods for gait assessment (clinical scores, gait speed, and gait symmetry). The results suggest that the PCA allows classification of the origin for the deficit in the gait when compared to healthy subjects; hence, the most appropriate treatment can be applied in the rehabilitation.
PRINCIPAL COMPONENTS IN MULTIVARIATE CONTROL CHARTS APPLIED TO DATA INSTRUMENTATION OF DAMS
Directory of Open Access Journals (Sweden)
Emerson Lazzarotto
2016-03-01
Hydroelectric plants are monitored by a high number of instruments that assess various quality characteristics of interest that have an inherent variability. The readings of these instruments generate time series of data that are often correlated. Each dam project has characteristics that make it unique. Faced with the need to establish statistical control limits for the instrumentation data, this article takes a multivariate statistical analysis approach and proposes a model that uses principal components control charts and statistical and to explain variability and establish a method of monitoring to control future observations. An application for section E of the Itaipu hydroelectric plant is performed to validate the model. The results show that the method used is appropriate and can help identify the type of outliers, reduce false alarms and reveal instruments that have a higher contribution to the variability.
Indian Academy of Sciences (India)
Jyh-Woei Lin
2012-08-01
Principal Component Analysis (PCA) and image processing are used to determine Total Electron Content (TEC) anomalies in the F-layer of the ionosphere relating to Typhoon Nakri for 29 May, 2008 (UTC). PCA and image processing are applied to the global ionospheric map (GIM) with transforms conducted for the time period 12:00–14:00 UT on 29 May, 2008 when the wind was most intense. Results show that at a height of approximately 150–200 km the TEC anomaly is highly localized; however, it becomes more intense and widespread with height. Potential causes of these results are discussed with emphasis given to acoustic gravity waves caused by wind force.
Zalameda, Joseph N.; Bolduc, Sean; Harman, Rebecca
2017-01-01
A composite fuselage aircraft forward section was inspected with flash thermography. The fuselage section is 24 feet long and approximately 8 feet in diameter. The structure is primarily configured with a composite sandwich structure of carbon fiber face sheets with a Nomex™ honeycomb core. The outer surface area was inspected. The thermal data consisted of 477 data sets totaling over 227 gigabytes in size. Principal component analysis (PCA) was used to process the data sets for substructure and defect detection. A fixed eigenvector approach using a global covariance matrix was used and compared to a varying eigenvector approach. The fixed eigenvector approach was demonstrated to be a practical analysis method for the detection and interpretation of various defects such as paint thickness variation, possible water intrusion damage, and delamination damage. In addition, inspection considerations are discussed including coordinate system layout, manipulation of the fuselage section, and the manual scanning technique used for full coverage.
Learning representative features for facial images based on a modified principal component analysis
Averkin, Anton; Potapov, Alexey
2013-05-01
The paper is devoted to facial image analysis and particularly deals with the problem of automatic evaluation of the attractiveness of human faces. We propose a new approach for automatic construction of feature space based on a modified principal component analysis. Input data sets for the algorithm are the learning data sets of facial images, which are rated by one person. The proposed approach allows one to extract features of the individual subjective face beauty perception and to predict attractiveness values for new facial images, which were not included into a learning data set. The Pearson correlation coefficient between values predicted by our method for new facial images and personal attractiveness estimation values equals to 0.89. This means that the new approach proposed is promising and can be used for predicting subjective face attractiveness values in real systems of the facial images analysis.
Dense Error Correction for Low-Rank Matrices via Principal Component Analysis
Ganesh, Arvind; Li, Xiaodong; Candes, Emmanuel J; Ma, Yi
2010-01-01
We consider the problem of recovering a low-rank matrix when some of its entries, whose locations are not known a priori, are corrupted by errors of arbitrarily large magnitude. It has recently been shown that this problem can be solved efficiently and effectively by a convex program named Principal Component Pursuit (PCP), provided that the fraction of corrupted entries and the rank of the matrix are both sufficiently small. In this paper, we extend that result to show that the same convex program, with a slightly improved weighting parameter, exactly recovers the low-rank matrix even if "almost all" of its entries are arbitrarily corrupted, provided the signs of the errors are random. We corroborate our result with simulations on randomly generated matrices and errors.
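For reference, the Principal Component Pursuit program mentioned above is usually written as follows. The symbols are the conventional ones from the robust PCA literature rather than notation taken from this abstract: $M$ is the observed matrix, $L$ the low-rank part, $S$ the error part, and $\lambda > 0$ the weighting parameter. The specific improved value of $\lambda$ used by the authors is not stated here.

```latex
\begin{equation*}
\min_{L,\,S}\; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad L + S = M ,
\end{equation*}
% \|L\|_{*}: nuclear norm of L (sum of its singular values)
% \|S\|_{1}: entrywise \ell_1 norm of S (sum of absolute values of its entries)
```

The nuclear norm acts as a convex surrogate for rank and the $\ell_1$ norm as a convex surrogate for the number of corrupted entries, which is why a single weighted sum can separate the two parts.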
Institute of Scientific and Technical Information of China (English)
牛东晓; 刘达; 邢棉
2008-01-01
A combined model based on principal components analysis (PCA) and generalized regression neural network (GRNN) was adopted to forecast electricity price in day-ahead electricity market. PCA was applied to mine the main influence on day-ahead price, avoiding the strong correlation between the input factors that might influence electricity price, such as the load of the forecasting hour, other history loads and prices, weather and temperature; then GRNN was employed to forecast electricity price according to the main information extracted by PCA. To prove the efficiency of the combined model, a case from PJM (Pennsylvania-New Jersey-Maryland) day-ahead electricity market was evaluated. Compared to back-propagation (BP) neural network and standard GRNN, the combined method reduces the mean absolute percentage error about 3%.
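The comparison above is stated in terms of mean absolute percentage error (MAPE). A minimal sketch of that metric follows; the function name `mape` and the sample prices are illustrative assumptions and are not taken from the PJM case study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Both arguments are equal-length sequences of numbers; actual values
    must be non-zero because each error is divided by the actual value.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical hourly prices and two competing forecasts.
prices = [40.0, 50.0, 80.0, 100.0]
forecast_a = [44.0, 45.0, 88.0, 90.0]   # every hour off by 10%
forecast_b = [42.0, 49.0, 84.0, 98.0]   # smaller relative errors
print(mape(prices, forecast_a), mape(prices, forecast_b))
```

Because MAPE divides by the actual value, it is undefined when a price is zero and can be dominated by hours with very low prices, which is worth keeping in mind when comparing forecasting models on electricity data.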
Energy Technology Data Exchange (ETDEWEB)
Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Sklute, Elizabeth [MT HOLYOKE COLLEGE; Dyare, Melinda D [MT HOLYOKE COLLEGE
2008-01-01
Quantitative analysis with LIBS traditionally employs calibration curves that are complicated by the chemical matrix effects. These chemical matrix effects influence the LIBS plasma and the ratio of elemental composition to elemental emission line intensity. Consequently, LIBS calibration typically requires a priori knowledge of the unknown, in order for a series of calibration standards similar to the unknown to be employed. In this paper, three new Multivariate Analysis (MVA) techniques are employed to analyze the LIBS spectra of 18 disparate igneous and highly-metamorphosed rock samples. Partial Least Squares (PLS) analysis is used to generate a calibration model from which unknown samples can be analyzed. Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are employed to generate a model and predict the rock type of the samples. These MVA techniques appear to exploit the matrix effects associated with the chemistries of these 18 samples.
Classification of Wines Based on Combination of 1H NMR Spectroscopy and Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
DU, Yuan-Yuan; BAI, Guo-Yun; ZHANG, Xu; LIU, Mai-Li
2007-01-01
A combination of 1H nuclear magnetic resonance (NMR) spectroscopy and principal component analysis (PCA) has shown the potential for being a useful method for classification of type, production origin or geographic origin of wines. In this preliminary study, twenty-one bottled wines were classified/separated for their location of production in Shacheng, Changli and Yantai, and the types of the blended, medium dry, dry white and dry red wines, using the NMR-PCA method. The wines were produced by three subsidiary companies of an enterprise according to the same national standard. The separation was believed to be mainly due to the fermentation process for different wines and environmental variations, such as local climate, soil, underground water, sunlight and rainfall. The major chemicals associated with the separation were identified.
Zia, Asif Iqbal
2015-06-01
The surface roughness of thin-film gold electrodes induces instability in impedance spectroscopy measurements of capacitive interdigital printable sensors. Post-fabrication thermodynamic annealing was carried out at temperatures ranging from 30 °C to 210 °C in a vacuum oven and the variation in surface morphology of thin-film gold electrodes was observed by scanning electron microscopy. Impedance spectra obtained at different temperatures were translated into equivalent circuit models by applying a complex nonlinear least squares curve-fitting algorithm. Principal component analysis was applied to classify the parameters affected by the annealing process and to evaluate the performance stability using a mathematical model. The physics of the thermodynamic annealing was discussed based on the surface activation energies. The post-anneal testing of the sensors validated the achieved stability in impedance measurement.
Bozorgzadeh, Bardia; Covey, Daniel P; Garris, Paul A; Mohseni, Pedram
2015-01-01
This paper reports on field-programmable gate array (FPGA) implementation of a digital signal processing (DSP) unit for real-time processing of neurochemical data obtained by fast-scan cyclic voltammetry (FSCV) at a carbonfiber microelectrode (CFM). The DSP unit comprises a decimation filter and two embedded processors to process the FSCV data obtained by an oversampling recording front-end and differentiate the target analyte from interferents in real time with a chemometrics algorithm using principal component regression (PCR). Interfaced with an integrated, FSCV-sensing front-end, the DSP unit successfully resolves the dopamine response from that of pH change and background-current drift, two common dopamine interferents, in flow injection analysis involving bolus injection of mixed solutions, as well as in biological tests involving electrically evoked, transient dopamine release in the forebrain of an anesthetized rat.
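Principal component regression itself can be sketched briefly: project the centered measurements onto a few principal components, then regress the target on the resulting scores. The example below uses synthetic data, with two hypothetical Gaussian "signatures" standing in for analyte and interferent responses; it is not the FPGA implementation described above, and the function names are assumptions.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit PCR: PCA on centered X, then least squares of y on the scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                    # (n_features, k) loadings
    scores = Xc @ V                            # (n_samples, k)
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                           # coefficients in original space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def pcr_predict(X, beta, intercept):
    return np.asarray(X, dtype=float) @ beta + intercept

# Synthetic mixtures: each row is c1 * s1 + c2 * s2 + small noise.
rng = np.random.default_rng(1)
v = np.linspace(0.0, 1.0, 50)
s1 = np.exp(-((v - 0.3) / 0.1) ** 2)   # hypothetical analyte signature
s2 = np.exp(-((v - 0.7) / 0.1) ** 2)   # hypothetical interferent signature
c1 = rng.uniform(0.0, 1.0, size=100)
c2 = rng.uniform(0.0, 1.0, size=100)
X = np.outer(c1, s1) + np.outer(c2, s2) + 0.001 * rng.standard_normal((100, 50))

beta, intercept = pcr_fit(X[:80], c1[:80], n_components=2)
max_err = np.max(np.abs(pcr_predict(X[80:], beta, intercept) - c1[80:]))
print("largest absolute error on held-out mixtures:", max_err)
```

Two components suffice here because the synthetic data have exactly two sources of variation; with real voltammograms the number of retained components would have to be chosen from the training set.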
Portable XRF and principal component analysis for bill characterization in forensic science.
Appoloni, C R; Melquiades, F L
2014-02-01
Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of the Portable X-ray Fluorescence (PXRF) technique and the multivariate analysis method of Principal Component Analysis (PCA) for classification of bills in order to use it in forensic science. Bills of Dollar, Euro and Real (Brazilian currency) were measured directly at different colored regions, without any previous preparation. Spectra interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA separated the bills into three groups, with subgroups among the Brazilian currency. In conclusion, the samples were classified according to their origin, identifying the elements responsible for differentiation and basic pigment composition. PXRF allied to multivariate discriminant methods is a promising technique for rapid and non-destructive identification of false bills in forensic science.
Institute of Scientific and Technical Information of China (English)
FAN Wenru; WANG Huaxiang; YANG Chengyi; MA Shiwen
2010-01-01
The aim of this paper is to propose a useful method for exploring regional ventilation and perfusion in the chest and also separation of pulmonary and cardiac changes. The approach is based on estimating both electrical impedance tomography (EIT) measurements and reconstructed images by means of principal component analysis (PCA). In the experiments in vivo, 43 cycles of heart-beat rhythm could be detected by PCA when the volunteer held breath; 9 breathing cycles and 50 heart-beat cycles could be detected by PCA when the volunteer breathed normally. The results indicate that the rhythms of cardiac activity and respiratory process can be exploited and separated through analyzing the boundary measurements by PCA without image reconstruction.
DEFF Research Database (Denmark)
Mears, Lisa; Nørregaard, Rasmus; Sin, Gürkan;
2016-01-01
This work proposes a methodology utilizing functional unfold principal component regression (FUPCR), for application to industrial batch process data as a process modeling and optimization tool. The methodology is applied to an industrial fermentation dataset, containing 30 batches of a production process operating at Novozymes A/S. Following the FUPCR methodology, the final product concentration could be predicted with an average prediction error of 7.4%. Multiple iterations of preprocessing were applied by implementing the methodology to identify the best data handling methods for the model. It is shown that application of functional data analysis and the choice of variance scaling method have the greatest impact on the prediction accuracy. Considering the vast amount of batch process data continuously generated in industry, this methodology can potentially contribute as a tool to identify...
Xie, Shengkun; Krishnan, Sridhar
2014-01-01
Classification of electroencephalography (EEG) is the most useful diagnostic and monitoring procedure for epilepsy study. A reliable algorithm that can be easily implemented is the key to this procedure. In this paper a novel signal feature extraction method based on dynamic principal component analysis and nonoverlapping moving window is proposed. Along with this new technique, two detection methods based on extracted sparse features are applied to deal with signal classification. The obtained results demonstrated that our proposed methodologies are able to differentiate EEGs from controls and interictal for epilepsy diagnosis and to separate EEGs from interictal and ictal for seizure detection. Our approach yields high classification accuracy for both single-channel short-term EEGs and multichannel long-term EEGs. The classification performance of the method is also compared with other state-of-the-art techniques on the same datasets and the effect of signal variability on the presented methods is also studied.
Application of Principal Component Analysis in Prompt Gamma Spectra for Material Sorting
Energy Technology Data Exchange (ETDEWEB)
Im, Hee Jung; Lee, Yun Hee; Song, Byoung Chul; Park, Yong Joon; Kim, Won Ho
2006-11-15
For the detection of illicit materials in a very short time by comparing unknown samples' gamma spectra to pre-programmed material signatures, we at first, selected a method to reduce the noise of the obtained gamma spectra. After a noise reduction, a pattern recognition technique was applied to discriminate the illicit materials from the innocuous materials in the noise reduced data. Principal component analysis was applied for a noise reduction and pattern recognition in prompt gamma spectra. A computer program for the detection of illicit materials based on PCA method was developed in our lab and can be applied to the PGNAA system for the baggage checking at all ports of entry at a very short time.
DEFF Research Database (Denmark)
Malmquist, Linus M.V.; Olsen, Rasmus R.; Hansen, Asger B.
2007-01-01
Detailed characterization and understanding of oil weathering at the molecular level is an essential part of tiered approaches for forensic oil spill identification, for risk assessment of terrestrial and marine oil spills, and for evaluating effects of bioremediation initiatives. Here, a chemometric-based method is applied to data from two in vitro experiments in order to distinguish the effects of evaporation and dissolution processes on oil composition. The potential of the method for obtaining detailed chemical information of the effects from evaporation and dissolution processes, to determine weathering state and to distinguish between various weathering processes is investigated and discussed. The method is based on comprehensive and objective chromatographic data processing followed by principal component analysis (PCA) of concatenated sections of gas chromatography–mass spectrometry...
Directory of Open Access Journals (Sweden)
A. Bhushan
2015-07-01
In this paper, we address outliers in spatiotemporal data streams obtained from sensors placed across geographically distributed locations. Outliers may appear in such sensor data due to various reasons such as instrumental error and environmental change. Real-time detection of these outliers is essential to prevent propagation of errors in subsequent analyses and results. Incremental Principal Component Analysis (IPCA) is one possible approach for detecting outliers in such type of spatiotemporal data streams. IPCA has been widely used in many real-time applications such as credit card fraud detection, pattern recognition, and image analysis. However, the suitability of applying IPCA for outlier detection in spatiotemporal data streams is unknown and needs to be investigated. To fill this research gap, this paper contributes by presenting two new IPCA-based outlier detection methods and performing a comparative analysis with the existing IPCA-based outlier detection methods to assess their suitability for spatiotemporal sensor data streams.
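One common way PCA is used for outlier detection is to flag readings whose reconstruction error from the retained components is unusually large. The sketch below uses ordinary batch PCA rather than an incremental update, so it is not either of the methods proposed in the paper; the six-sensor layout, the mixing matrix, and the 99% threshold are assumptions for illustration.

```python
import numpy as np

def fit_pca(X, k):
    """Return the column means and the top-k principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def reconstruction_error(X, mean, components):
    """Squared distance of each row from the subspace spanned by components."""
    Xc = X - mean
    residual = Xc - (Xc @ components.T) @ components
    return np.sum(residual**2, axis=1)

rng = np.random.default_rng(42)
# Six hypothetical sensors driven by two shared latent signals plus noise.
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
latent = rng.standard_normal((500, 2))
normal = latent @ mixing + 0.05 * rng.standard_normal((500, 6))

mean, components = fit_pca(normal, k=2)
# Threshold: 99th percentile of the error on the reference data (an assumption).
threshold = np.quantile(reconstruction_error(normal, mean, components), 0.99)

faulty = normal[0].copy()
faulty[3] += 5.0  # simulate an instrument error on one sensor
faulty_error = reconstruction_error(faulty[None, :], mean, components)[0]
is_outlier = faulty_error > threshold
print("faulty reading flagged:", bool(is_outlier))
```

An incremental variant would update `mean` and `components` as new readings arrive instead of refitting on the whole history, which is the part a streaming setting such as the one in the abstract would need.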
PCA 4 DCA: The Application Of Principal Component Analysis To The Dendritic Cell Algorithm
Gu, Feng; Oates, Robert; Aickelin, Uwe
2010-01-01
As one of the newest members in the field of artificial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data, instead domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance to the three categories used by the DCA. This data preprocessing phase has received the criticism of having manually over-fitted the data to the algorithm, which is undesirable. Therefore, in this paper we have attempted to ascertain if it is possible to use principal component analysis (PCA) techniques to automatically categorise input data while still generating useful and accurate classification results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results have shown the application of PCA to the DCA for the purpose of automated data preprocessing is successful.
Directory of Open Access Journals (Sweden)
Muzhir Shaban Al-Ani
2011-05-01
Detecting faces across multiple views is more challenging than in a frontal view. To address this problem, an efficient approach is presented in this paper using a kernel machine based approach for learning such nonlinear mappings to provide effective view-based representation for multi-view face detection. In this paper Kernel Principal Component Analysis (KPCA) is used to project data into the view-subspaces then computed as view-based features. Multi-view face detection is performed by classifying each input image into face or non-face class, by using a two class Kernel Support Vector Classifier (KSVC). Experimental results demonstrate successful face detection over a wide range of facial variation in color, illumination conditions, position, scale, orientation, 3D pose, and expression in images from several photo collections.
Frausto-Reyes, C.; Medina-Gutiérrez, C.; Sato-Berrú, R.; Sahagún, L. R.
2005-09-01
Using Raman spectroscopy with an excitation radiation source of 514.5 nm and principal component analysis (PCA), a method was developed to study qualitatively the ethanol content in tequila samples. This method is based on the OH region profile (water) of the Raman spectra. Using the fluorescence background of the Raman spectra, the method can also distinguish silver tequila from aged tequilas. The first three PCs of the Raman spectra, which account for 99% of the total variance of the data set, were used for the sample classification. PCA1 and PCA2 are related to the water (or ethanol) content of the sample, whereas PCA3 is related to the fluorescence background of the Raman spectra.
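The statement that the first three PCs account for 99% of the total variance can be illustrated with a short sketch. The code below is not the authors' procedure; it is a minimal example, assuming a hypothetical matrix of spectra (rows are samples, columns are wavenumber channels), of how the explained-variance share per component and the number of components needed to reach a target share might be computed. The function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of the total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                   # centre every column (channel)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to PC variances
    return var / var.sum()

def n_components_for(X, target=0.99):
    """Smallest number of leading PCs whose cumulative share reaches target."""
    cum = np.cumsum(explained_variance_ratio(X))
    k = int(np.searchsorted(cum, target)) + 1
    return min(k, cum.size)                   # guard against rounding at 1.0

# Hypothetical data: 40 "spectra" built from 3 underlying patterns plus noise.
rng = np.random.default_rng(0)
scores = rng.normal(size=(40, 3)) * np.array([10.0, 3.0, 1.0])
patterns = rng.normal(size=(3, 200))
spectra = scores @ patterns + 0.01 * rng.normal(size=(40, 200))

ratio = explained_variance_ratio(spectra)
k = n_components_for(spectra, target=0.99)
print(k, ratio[:k].sum())
```

Because the synthetic spectra are generated from three patterns, at most three components are needed here; on real spectra the count depends entirely on the data and on any preprocessing applied beforehand.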
Directory of Open Access Journals (Sweden)
Diah Wulandari
2016-12-01
This study aimed to analyze the most influential variables in the implementation of ICT in schools. Principal component analysis uses linear algebra to reduce the dimension of data with interrelated variables into a new set of data with variables that are not related to each other, called the principal components. The principal components are used to retain and quantify how much correlation there is within the variance. The ICT data were collected from 50 schools and grouped into five groups based on the reference domains of the ICT for education indicators by UIS 2009. The dataset for each group was used as input to a principal component analysis algorithm in Matlab R2014a to produce the principal components. Principal component analysis produced five most influential variables according to their domain, including: mean hours of individual ICT use in the curriculum domain, school presence on the internet in the infrastructure domain, the proportion of learners using a computer laboratory for learning in the teacher development domain, and the proportion of learners taking a basic computer skills course in the participation domain.
Principal Component Analysis to Explore Climatic Variability and Dengue Outbreak in Lahore
Directory of Open Access Journals (Sweden)
Syed Afrozuddin Ahmed
2014-08-01
Various studies have reported that global warming causes an unstable climate and many serious impacts on the physical environment and public health. The increasing incidence of dengue is now a priority health issue and has become a health burden for Pakistan. This study investigates whether the spatial pattern of the environment causes the emergence or increasing rate of dengue fever incidence that affects the population and its health. Principal component analysis is performed to find whether there is any general environmental factor or structure that could affect the emergence of dengue fever cases in the Pakistani climate. Principal component analysis is applied to find structure in the data for all four periods, i.e. 1980 to 2012, 1980 to 1995 and 1996 to 2012. The first three PCs for the periods (1980-2012, 1980-1994, 1995-2012) are almost the same and represent hot and windy weather. The PC1s of all dengue periods differ from each other. PC2 is the same for all periods and represents wetness in the weather. The PC3s differ and represent a combination of wet and windy weather. The PC4s for all periods show humid weather without rain. Among the climatic variables, only minimum temperature and maximum temperature are significantly correlated with daily dengue cases. PC1, PC3 and PC4 are highly significantly correlated with daily dengue cases.
Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.
2017-01-01
It is known that microglia morphology and function are closely related, but only few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV) an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters were revealed more suitable for hierarchical cluster analysis (HCA). This method pointed out the classification of microglia population in four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed to further classifying microglia in a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed to relate specific morphotypes with microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points Microglia undergo a quantifiable
Characterization of soil chemical properties of strawberry fields using principal component analysis
Directory of Open Access Journals (Sweden)
Gláucia Oliveira Islabão
2013-02-01
One of the largest strawberry-producing municipalities of Rio Grande do Sul (RS) is Turuçu, in the South of the State. The strawberry production system adopted by farmers is similar to that used in other regions in Brazil and in the world. The main difference is related to the soil management, which can change the soil chemical properties during the strawberry cycle. This study had the objective of assessing the spatial and temporal distribution of soil fertility parameters using principal component analysis (PCA). Soil sampling was based on topography, dividing the field in three thirds: upper, middle and lower. From each of these thirds, five soil samples were randomly collected in the 0-0.20 m layer, to form a composite sample for each third. Four samples were taken during the strawberry cycle and the following properties were determined: soil organic matter (OM), soil total nitrogen (N), available phosphorus (P) and potassium (K), exchangeable calcium (Ca) and magnesium (Mg), soil pH (pH), cation exchange capacity (CEC) at pH 7.0, soil base (V%) and soil aluminum saturation (m%). No spatial variation was observed for any of the studied soil fertility parameters in the strawberry fields and temporal variation was only detected for available K. Phosphorus and K contents were always high or very high from the beginning of the strawberry cycle, while pH values ranged from very low to very high. Principal component analysis allowed the clustering of all strawberry fields based on variables related to soil acidity and organic matter content.
Institute of Scientific and Technical Information of China (English)
Nilanchal Patel; Brijesh Kumar Kaushal
2010-01-01
The classification accuracy of the various categories on classified remotely sensed images is usually evaluated by two different measures of accuracy, namely, producer's accuracy (PA) and user's accuracy (UA). The PA of a category indicates to what extent the reference pixels of the category are correctly classified, whereas the UA of a category represents to what extent the other categories are less misclassified into the category in question. Therefore, the UA of the various categories determines the reliability of their interpretation on the classified image and is more important to the analyst than the PA. The present investigation has been performed in order to determine if there occurs improvement in the UA of the various categories on the classified image of the principal components of the original bands and on the classified image of the stacked image of two different years. We performed the analyses using the IRS LISS III images of two different years, i.e., 1996 and 2009, that represent the different magnitude of urbanization and the stacked image of these two years pertaining to Ranchi area, Jharkhand, India, with a view to assessing the impacts of urbanization on the UA of the different categories. The results of the investigation demonstrated that there occurs significant improvement in the UA of the impervious categories in the classified image of the stacked image, which is attributable to the aggregation of the spectral information from twice the number of bands from two different years. On the other hand, the classified image of the principal components did not show any improvement in the UA as compared to the original images.
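The two accuracy measures described above can be made concrete with a small sketch. The example assumes a confusion matrix whose rows are the classified (map) categories and whose columns are the reference categories; this orientation, the function name and the numbers are illustrative assumptions and do not come from the study.

```python
import numpy as np

def producer_user_accuracy(confusion):
    """Return (PA, UA) per category.

    Assumed layout: confusion[i, j] = number of pixels classified as
    category i whose reference category is j.
    """
    confusion = np.asarray(confusion, dtype=float)
    correct = np.diag(confusion)
    producers = correct / confusion.sum(axis=0)  # share of reference pixels found
    users = correct / confusion.sum(axis=1)      # share of classified pixels correct
    return producers, users

# Hypothetical 3-category confusion matrix (made-up counts).
cm = [[45, 10, 5],
      [3, 28, 4],
      [2, 2, 51]]

pa, ua = producer_user_accuracy(cm)
print(np.round(pa, 3))  # [0.9  0.7  0.85]
print(np.round(ua, 3))  # [0.75  0.8  0.927]
```

In this made-up case the first category has a high PA (0.90) but a lower UA (0.75), meaning that many pixels from other categories were misclassified into it; this is the situation in which the UA is the more informative figure for an analyst reading the map.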
Principal components analysis based control of a multi-dof underactuated prosthetic hand
Directory of Open Access Journals (Sweden)
Magenes Giovanni
2010-04-01
Background: Functionality, controllability and cosmetics are the key issues to be addressed in order to accomplish a successful functional substitution of the human hand by means of a prosthesis. Not only should the prosthesis duplicate the human hand in shape, functionality, sensorization, perception and sense of body-belonging, but it should also be controlled as the natural one, in the most intuitive and undemanding way. At present, prosthetic hands are controlled by means of non-invasive interfaces based on electromyography (EMG). Driving a multi degrees of freedom (DoF) hand for achieving hand dexterity implies selectively modulating many different EMG signals in order to make each joint move independently, and this could require significant cognitive effort from the user. Methods: A Principal Components Analysis (PCA) based algorithm is used to drive a 16 DoFs underactuated prosthetic hand prototype (called CyberHand) with a two dimensional control input, in order to perform the three prehensile forms mostly used in Activities of Daily Living (ADLs). Such Principal Components set has been derived directly from the artificial hand by collecting its sensory data while performing 50 different grasps, and subsequently used for control. Results: Trials have shown that two independent input signals can be successfully used to control the posture of a real robotic hand and that correct grasps (in terms of involved fingers, stability and posture) may be achieved. Conclusions: This work demonstrates the effectiveness of a bio-inspired system successfully conjugating the advantages of an underactuated, anthropomorphic hand with a PCA-based control strategy, and opens up promising possibilities for the development of an intuitively controllable hand prosthesis.
Hwang, Joonki; Park, Aaron; Chung, Jin Hyuk; Choi, Namhyun; Park, Jun-Qyu; Cho, Soo Gyeong; Baek, Sung-June; Choo, Jaebum
2013-06-01
Recently, the development of methods for the identification of explosive materials that are faster, more sensitive, easier to use, and more cost-effective has become a very important issue for homeland security and counter-terrorism applications. However, the limited applicability of several analytical methods, such as the incapability of detecting explosives in a sealed container, the limited portability of instruments, and false alarms due to the inherent lack of selectivity, has motivated increased interest in the application of Raman spectroscopy for the rapid detection and identification of explosive materials. Raman spectroscopy has received growing interest due to its stand-off capacity, which allows samples to be analyzed at a distance from the instrument. In addition, Raman spectroscopy has the capability to detect explosives in sealed containers such as glass or plastic bottles. We report a rapid and sensitive recognition technique for explosive compounds using Raman spectroscopy and principal component analysis (PCA). Seven hundred Raman spectra (50 measurements per sample) for 14 selected explosives were collected, and were pretreated with noise suppression and baseline elimination methods. PCA, a well-known multivariate statistical method, was applied for the proper evaluation, feature extraction, and identification of measured spectra. Here, a broad wavenumber range (200-3500 cm⁻¹) on the collected spectra set was used for the classification of the explosive samples into separate classes. It was found that three principal components achieved a 99.3% classification rate in the sample set. The results show that Raman spectroscopy in combination with PCA is well suited for the identification and differentiation of explosives in the field.
A hybrid least squares and principal component analysis algorithm for Raman spectroscopy.
Van de Sompel, Dominique; Garai, Ellis; Zavaleta, Cristina; Gambhir, Sanjiv Sam
2012-01-01
Raman spectroscopy is a powerful technique for detecting and quantifying analytes in chemical mixtures. A critical part of Raman spectroscopy is the use of a computer algorithm to analyze the measured Raman spectra. The most commonly used algorithm is the classical least squares method, which is popular due to its speed and ease of implementation. However, it is sensitive to inaccuracies or variations in the reference spectra of the analytes (compounds of interest) and the background. Many algorithms, primarily multivariate calibration methods, have been proposed that increase robustness to such variations. In this study, we propose a novel method that improves robustness even further by explicitly modeling variations in both the background and analyte signals. More specifically, it extends the classical least squares model by allowing the declared reference spectra to vary in accordance with the principal components obtained from training sets of spectra measured in prior characterization experiments. The amount of variation allowed is constrained by the eigenvalues of this principal component analysis. We compare the novel algorithm to the least squares method with a low-order polynomial residual model, as well as a state-of-the-art hybrid linear analysis method. The latter is a multivariate calibration method designed specifically to improve robustness to background variability in cases where training spectra of the background, as well as the mean spectrum of the analyte, are available. We demonstrate the novel algorithm's superior performance by comparing quantitative error metrics generated by each method. The experiments consider both simulated data and experimental data acquired from in vitro solutions of Raman-enhanced gold-silica nanoparticles.
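The idea of letting reference spectra vary along principal components, with the allowed variation tied to the eigenvalues, can be sketched as a penalized least squares fit. The code below is not the authors' algorithm; it is one plausible reading, assuming the model y = S c + P b + noise, where S holds the reference spectra, P holds principal components of spectral variation, and each coefficient b_j is penalized by noise_var / eigvals[j] so that components with small eigenvalues are allowed less variation. All names, the penalty form and the synthetic data are assumptions.

```python
import numpy as np

def penalized_cls(y, S, P, eigvals, noise_var=1.0):
    """Least squares fit of y ~ S c + P b with an eigenvalue-weighted penalty on b.

    Minimises ||y - S c - P b||^2 + noise_var * sum(b_j^2 / eigvals[j])
    by solving an augmented ordinary least squares problem.
    """
    n_ref, n_pc = S.shape[1], P.shape[1]
    design = np.hstack([S, P])
    # Penalty rows act only on the PC coefficients b, not on the concentrations c.
    penalty = np.hstack([np.zeros((n_pc, n_ref)),
                         np.diag(np.sqrt(noise_var / np.asarray(eigvals)))])
    A = np.vstack([design, penalty])
    rhs = np.concatenate([y, np.zeros(n_pc)])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return coef[:n_ref], coef[n_ref:]

# Hypothetical example: 2 reference spectra, 3 variation components.
rng = np.random.default_rng(0)
S = np.abs(rng.normal(size=(100, 2)))
P, _ = np.linalg.qr(rng.normal(size=(100, 3)))     # orthonormal columns
eigvals = np.array([4.0, 1.0, 0.25])
c_true = np.array([2.0, 0.5])
b_true = rng.normal(size=3) * np.sqrt(eigvals)
y = S @ c_true + P @ b_true + 0.01 * rng.normal(size=100)

c_hat, b_hat = penalized_cls(y, S, P, eigvals, noise_var=1e-4)
print(c_hat)
```

Setting the penalty to zero reduces this to classical least squares on the augmented design matrix, while a very large penalty forces b toward zero and recovers classical least squares on S alone; the eigenvalue weighting sits between those two extremes.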
Standardized principal components for vegetation variability monitoring across space and time
Mathew, T. R.; Vohora, V. K.
2016-08-01
Vegetation at any given location changes through time and in space. Knowing how much it changes, where and when, can help us identify sources of ecosystem stress, which is very useful for understanding changes in biodiversity and its effect on climate change. Such changes known for a region are important in prioritizing management. The present study considers the dynamics of savanna vegetation in Kruger National Park (KNP) through the use of temporal satellite remote sensing images. Spatial variability of vegetation is a key characteristic of savanna landscapes and its importance to biodiversity has been demonstrated by field-based studies. The data used for the study were sourced from the U.S. Agency for International Development, where AVHRR-derived Normalized Difference Vegetation Index (NDVI) images are available at a spatial resolution of 8 km and at dekadal scales. The study area was extracted from these images for the time-period 1984-2002. Maximum value composites were derived for individual months, resulting in an image dataset of 216 NDVI images. Vegetation dynamics across spatio-temporal domains were analyzed using standardized principal components analysis (SPCA) on the NDVI time-series. The variability of each individual image in the time-series is considered. The study produced promising results: it showed the variability of vegetation change in the area across space and time, and also indicated changes in landscape on 6 individual principal components (PCs), showing differences not only in magnitude but also in pattern among the selected eco-zones, with constantly changing and evolving ecosystems.
Mollazadeh, Mohsen; Aggarwal, Vikram; Thakor, Nitish V; Schieber, Marc H
2014-10-15
A few kinematic synergies identified by principal component analysis (PCA) account for most of the variance in the coordinated joint rotations of the fingers and wrist used for a wide variety of hand movements. To examine the possibility that motor cortex might control the hand through such synergies, we collected simultaneous kinematic and neurophysiological data from monkeys performing a reach-to-grasp task. We used PCA, jPCA and isomap to extract kinematic synergies from 18 joint angles in the fingers and wrist and analyzed the relationships of both single-unit and multiunit spike recordings, as well as local field potentials (LFPs), to these synergies. For most spike recordings, the maximal absolute cross-correlations of firing rates were somewhat stronger with an individual joint angle than with any principal component (PC), any jPC or any isomap dimension. In decoding analyses, where spikes and LFP power in the 100- to 170-Hz band each provided better decoding than other LFP-based signals, the first PC was decoded as well as the best decoded joint angle. But the remaining PCs and jPCs were predicted with lower accuracy than individual joint angles. Although PCs, jPCs or isomap dimensions might provide a more parsimonious description of kinematics, our findings indicate that the kinematic synergies identified with these techniques are not represented in motor cortex more strongly than the original joint angles. We suggest that the motor cortex might act to sculpt the synergies generated by subcortical centers, superimposing an ability to individuate finger movements and adapt the hand to grasp a wide variety of objects.
Lovell, D P
1999-06-01
Principal component analyses (PCA) have been carried out on the tissue scores from Draize eye irritation tests on the 55 formulations and chemical ingredients included in the COLIPA Eye Irritation Validation Study. A PCA was carried out on the tissue scores 24, 48 and 72 hours after instillation of the substances. The first Principal Component (PC I) explained 77% of the total variation in the tissue scores and showed a high negative correlation (r=-0.971) with the scores used to derive the Modified Maximum Average Score (MMAS). The second component (PC II) explained 7% of the total variability and contrasted corneal and iris damage with conjunctival damage, as in a similar analysis carried out previously on the ECETOC databank. The third component (PC III), while only explaining about 3% of the variability, identified individuals treated with formulations that were observed to have low corneal opacity but large corneal area scores. This may represent some particular manner of scoring at the laboratory administering the Draize test or a specific effect of some formulations. A further PCA was carried out on tissue scores from observations at 1 hour to 21 days. PC I in this analysis explained 62% of the variability and there was a high negative correlation with the sum of all the tissue scores, while PC II explained 14% of the variability and contrasted damage up to 72 hours with damage after 72 hours. A number of formulations were identified with relatively low MMAS scores but tissue damage that persisted. PCA is thus shown to be a powerful method for exploring complex datasets and for identification of outliers and subgroups. It has shown that the MMAS score captures most of the information on tissue scores in the first 72 hours following exposure, and there is unlikely to be any advantage in using individual tissue scores for comparisons with alternative tests. The relationship of the classification schemes used by three alternative methods in the COLIPA
RPCA-KFE: Key Frame Extraction for Video Using Robust Principal Component Analysis.
Dang, Chinh; Radha, Hayder
2015-11-01
Key frame extraction algorithms consider the problem of selecting a subset of the most informative frames from a video to summarize its content. Several applications, such as video summarization, search, indexing, and prints from video, can benefit from extracted key frames of the video under consideration. Most approaches in this class of algorithms work directly with the input video data set, without considering the underlying low-rank structure of the data set. Other algorithms exploit the low-rank component only, ignoring the other key information in the video. In this paper, a novel key frame extraction framework based on robust principal component analysis (RPCA) is proposed. Furthermore, we target the challenging application of extracting key frames from unstructured consumer videos. The proposed framework is motivated by the observation that the RPCA decomposes an input data into: 1) a low-rank component that reveals the systematic information across the elements of the data set and 2) a set of sparse components each of which containing distinct information about each element in the same data set. The two information types are combined into a single l1-norm-based non-convex optimization problem to extract the desired number of key frames. Moreover, we develop a novel iterative algorithm to solve this optimization problem. The proposed RPCA-based framework does not require shot(s) detection, segmentation, or semantic understanding of the underlying video. Finally, experiments are performed on a variety of consumer and other types of videos. A comparison of the results obtained by our method with the ground truth and with related state-of-the-art algorithms clearly illustrates the viability of the proposed RPCA-based framework.
Directory of Open Access Journals (Sweden)
Rodrigo Reis Mota
2016-09-01
The aim of this research was to evaluate the dimensional reduction of additive direct genetic covariance matrices in genetic evaluations of growth traits (range 100-730 days) in Simmental cattle using principal components, as well as to estimate (co)variance components and genetic parameters. Principal component analyses were conducted for five different models: one full and four reduced-rank models. Models were compared using Akaike information (AIC) and Bayesian information (BIC) criteria. Variance components and genetic parameters were estimated by restricted maximum likelihood (REML). The AIC and BIC values were similar among models. This indicated that parsimonious models could be used in genetic evaluations in Simmental cattle. The first principal component explained more than 96% of total variance in both models. Heritability estimates were higher for advanced ages and varied from 0.05 (100 days) to 0.30 (730 days). Genetic correlation estimates were similar in both models regardless of magnitude and number of principal components. The first principal component was sufficient to explain almost all genetic variance. Furthermore, genetic parameter similarities and lower computational requirements allowed for parsimonious models in genetic evaluations of growth traits in Simmental cattle.
New Role of Thermal Mapping in Winter Maintenance with Principal Components Analysis
Directory of Open Access Journals (Sweden)
Mario Marchetti
2014-01-01
Thermal mapping uses IR thermometry to measure road pavement temperature at a high resolution to identify and to map sections of the road network prone to ice occurrence. However, measurements are time-consuming and ultimately only provide a snapshot of road conditions at the time of the survey. As such, there is a need for surveys to be restricted to a series of specific climatic conditions during winter. Typically, five to six surveys are used, but it is questionable whether the full range of atmospheric conditions is adequately covered. This work investigates the role of statistics in adding value to thermal mapping data. Principal components analysis is used to interpolate between individual thermal mapping surveys to build a thermal map (or even a road surface temperature forecast) for a wider range of climatic conditions than that permitted by traditional surveys. The results indicate that when this approach is used, fewer thermal mapping surveys are actually required. Furthermore, comparisons with numerical models indicate that this approach could yield a suitable verification method for the spatial component of road weather forecasts, a key issue currently in winter road maintenance.
Scullin, Michael K; Harrison, Tyler L; Factor, Stewart A; Bliwise, Donald L
2014-01-15
Sleep disturbances are common in many neurodegenerative diseases and may include altered sleep duration, fragmented sleep, nocturia, excessive daytime sleepiness, and vivid dreaming experiences, with occasional parasomnias. Although representing the "gold standard," polysomnography is not always cost-effective or available for measuring sleep disturbance, particularly for screening. Although numerous sleep-related questionnaires exist, many focus on a specific sleep disturbance (e.g., restless legs, REM Behavior Disorder) and do not capture efficiently the variety of sleep issues experienced by such patients. We administered the 12-item Neurodegenerative Disease Sleep Questionnaire (NDSQ) and the Epworth Sleepiness Scale to 145 idiopathic Parkinson's disease patients. Principal component analysis using eigenvalues greater than 1 suggested five separate components: sleep quality (e.g., sleep fragmentation), nocturia, vivid dreams/nightmares, restless legs symptoms, and sleep-disordered breathing. These results demonstrate construct validity of our sleep questionnaire and suggest that the NDSQ may be a useful screening tool for sleep disturbances in at least some types of neurodegenerative disorders.
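The retention rule mentioned above, keeping components with eigenvalues greater than 1, can be sketched as follows. This is a generic illustration of that rule applied to the correlation matrix of questionnaire items, not the authors' analysis; the function name and the synthetic responses (two latent factors driving six items) are assumptions.

```python
import numpy as np

def components_with_eigenvalue_above_one(X):
    """Count principal components of the correlation matrix with eigenvalue > 1.

    X: array of shape (n_respondents, n_items).
    Returns (count, eigenvalues sorted in descending order).
    """
    R = np.corrcoef(X, rowvar=False)        # item-by-item correlation matrix
    eigvals = np.linalg.eigvalsh(R)[::-1]   # eigvalsh returns ascending order
    return int((eigvals > 1.0).sum()), eigvals

# Hypothetical responses: items 0-2 follow one factor, items 3-5 another.
rng = np.random.default_rng(0)
f1 = rng.normal(size=(200, 1))
f2 = rng.normal(size=(200, 1))
items = np.hstack([f1 + 0.3 * rng.normal(size=(200, 3)),
                   f2 + 0.3 * rng.normal(size=(200, 3))])

k, eigvals = components_with_eigenvalue_above_one(items)
print(k, np.round(eigvals, 2))
```

The eigenvalues of a correlation matrix sum to the number of items, so an eigenvalue above 1 marks a component that carries more variance than a single standardized item would on its own.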
Analysis of heavy metal sources in soil using kriging interpolation on principal components.
Ha, Hoehun; Olson, James R; Bian, Ling; Rogerson, Peter A
2014-05-06
Anniston, Alabama has a long history of operation of foundries and other heavy industry. We assessed the extent of heavy metal contamination in soils by determining the concentrations of 11 heavy metals (Pb, As, Cd, Cr, Co, Cu, Mn, Hg, Ni, V, and Zn) based on 2046 soil samples collected from 595 industrial and residential sites. Principal Component Analysis (PCA) was adopted to characterize the distribution of heavy metals in soil in this region. In addition, a geostatistical technique (kriging) was used to create regional distribution maps for the interpolation of nonpoint sources of heavy metal contamination using geographical information system (GIS) techniques. There were significant differences found between sampling zones in the concentrations of heavy metals, with the exception of the levels of Ni. Three main components explaining the heavy metal variability in soils were identified. The results suggest that Pb, Cd, Cu, and Zn were associated with anthropogenic activities, such as the operations of some foundries and major railroads, which released these heavy metals, whereas the presence of Co, Mn, and V were controlled by natural sources, such as soil texture, pedogenesis, and soil hydrology. In general terms, the soil levels of heavy metals analyzed in this study were higher than those reported in previous studies in other industrial and residential communities.
Kupski, L; Badiale-Furlong, E
2015-06-15
This work aimed to establish an innovative approach to evaluate the effect of cereals composition on ochratoxin A extraction by multivariate analysis. Principal components analysis was applied to identify the effect of major matrix components on the recovery of ochratoxin A by QuEChERS method using HPTLC and HPLC, and to validate the method for ochratoxin A determination in wheat flour by HPLC. The matrices rice bran, wheat bran and wheat flour were characterized for their physical and chemical attributes. The ochratoxin A recovery in these matrices was highly influenced (R=0.99) by the sugar content of the matrix, while the lipids content showed a minor interference (R=0.29). From these data, the QuEChERS method was standardized for extracting ochratoxin A from flour using 1% ACN:water (2:1) as extraction solvent and dried magnesium sulfate and sodium chloride as salts. The recovery values ranged from 97.6% to 105%. The validated method was applied to evaluate natural occurrence of ochratoxin A in 20 wheat flour samples, which were contaminated with ochratoxin A levels in the range of 0.22-0.85 μg kg(-1).
Wang, Xinguang; O'Dwyer, Nicholas; Halaki, Mark; Smith, Richard
2013-01-01
Principal component analysis is a powerful and popular technique for capturing redundancy in muscle activity and kinematic patterns. A primary limitation of the correlations or covariances between signals on which this analysis is based is that they do not account for dynamic relations between signals, yet such relations-such as that between neural drive and muscle tension-are widespread in the sensorimotor system. Low correlations may thus be obtained and signals may appear independent despite a dynamic linear relation between them. To address this limitation, linear systems analysis can be used to calculate the matrix of overall coherences between signals, which measures the strength of the relation between signals taking dynamic relations into account. Using ankle, knee, and hip sagittal-plane angles from 6 healthy subjects during ~50% of total variance in the data set, while with overall coherence matrices the first component accounted for > 95% of total variance. The results demonstrate that the dimensionality of the coordinative structure can be overestimated using conventional correlation, whereas a more parsimonious structure is identified with overall coherence.
Revealing the X-ray Variability of AGN with Principal Component Analysis
Parker, M L; Matt, G; Koljonen, K I I; Kara, E; Alston, W; Walton, D J; Marinucci, A; Brenneman, L; Risaliti, G
2014-01-01
We analyse a sample of 26 active galactic nuclei with deep XMM-Newton observations, using principal component analysis (PCA) to find model independent spectra of the different variable components. In total, we identify at least 12 qualitatively different patterns of spectral variability, involving several different mechanisms, including five sources which show evidence of variable relativistic reflection (MCG-6-30-15, NGC 4051, 1H 0707-495, NGC 3516 and Mrk 766) and three which show evidence of varying partial covering neutral absorption (NGC 4395, NGC 1365, and NGC 4151). In over half of the sources studied, the variability is dominated by changes in a power law continuum, both in terms of changes in flux and power law index, which could be produced by propagating fluctuations within the corona. Simulations are used to find unique predictions for different physical models, and we then attempt to qualitatively match the results from the simulations to the behaviour observed in the real data. We are able to ex...
Directory of Open Access Journals (Sweden)
Selin Aviyente
2010-01-01
Joint time-frequency representations offer a rich representation of event related potentials (ERPs) that cannot be obtained through individual time or frequency domain analysis. This representation, however, comes at the expense of increased data volume and the difficulty of interpreting the resulting representations. Therefore, methods that can reduce the large amount of time-frequency data to experimentally relevant components are essential. In this paper, we present a method that reduces the large volume of ERP time-frequency data into a few significant time-frequency parameters. The proposed method is based on applying the widely used matching pursuit (MP) approach, with a Gabor dictionary, to principal components extracted from the time-frequency domain. The proposed PCA-Gabor decomposition is compared with other time-frequency data reduction methods such as the time-frequency PCA approach alone and standard matching pursuit methods using a Gabor dictionary for both simulated and biological data. The results show that the proposed PCA-Gabor approach performs better than either the PCA alone or the standard MP data reduction methods, by using the smallest amount of ERP data variance to produce the strongest statistical separation between experimental conditions.
Aviyente, Selin; Bernat, Edward M.; Malone, Stephen M.; Iacono, William G.
2010-12-01
Joint time-frequency representations offer a rich representation of event related potentials (ERPs) that cannot be obtained through individual time or frequency domain analysis. This representation, however, comes at the expense of increased data volume and the difficulty of interpreting the resulting representations. Therefore, methods that can reduce the large amount of time-frequency data to experimentally relevant components are essential. In this paper, we present a method that reduces the large volume of ERP time-frequency data into a few significant time-frequency parameters. The proposed method is based on applying the widely used matching pursuit (MP) approach, with a Gabor dictionary, to principal components extracted from the time-frequency domain. The proposed PCA-Gabor decomposition is compared with other time-frequency data reduction methods such as the time-frequency PCA approach alone and standard matching pursuit methods using a Gabor dictionary for both simulated and biological data. The results show that the proposed PCA-Gabor approach performs better than either the PCA alone or the standard MP data reduction methods, by using the smallest amount of ERP data variance to produce the strongest statistical separation between experimental conditions.
Directory of Open Access Journals (Sweden)
Anna Maria Stellacci
2012-07-01
Hyperspectral (HS) data represent an extremely powerful means for rapidly detecting crop stress and then aiding in the rational management of natural resources in agriculture. However, the large volume of data poses a challenge for data processing and for extracting crucial information. Multivariate statistical techniques can play a key role in the analysis of HS data, as they may make it possible both to eliminate redundant information and to identify synthetic indices which maximize differences among levels of stress. In this paper we propose an integrated approach, based on the combined use of Principal Component Analysis (PCA) and Canonical Discriminant Analysis (CDA), to investigate HS plant response and discriminate plant status. The approach was preliminarily evaluated on a data set collected on durum wheat plants grown under different nitrogen (N) stress levels. Hyperspectral measurements were performed at anthesis through a high resolution field spectroradiometer, ASD FieldSpec HandHeld, covering the 325-1075 nm region. Reflectance data were first restricted to the interval 510-1000 nm and then divided into five bands of the electromagnetic spectrum [green: 510-580 nm; yellow: 581-630 nm; red: 631-690 nm; red-edge: 705-770 nm; near-infrared (NIR): 771-1000 nm]. PCA was applied to each spectral interval. CDA was performed on the extracted components to identify the factors maximizing the differences among plants fertilised with increasing N rates. Within the intervals of green, yellow and red only the first principal component (PC) had an eigenvalue greater than 1 and explained more than 95% of total variance; within the ranges of red-edge and NIR, the first two PCs had an eigenvalue higher than 1. Two canonical variables explained cumulatively more than 81% of total variance and the first was able to discriminate wheat plants differently fertilised, as confirmed also by the significant correlation with aboveground biomass and grain yield parameters. The combined
Directory of Open Access Journals (Sweden)
Karacaören Burak
2011-05-01
Background: It has been shown that if genetic relationships among individuals are not taken into account for genome wide association studies, this may lead to false positives. To address this problem, we used Genome-wide Rapid Association using Mixed Model and Regression and principal component stratification analyses. To account for linkage disequilibrium among the significant markers, principal components loadings obtained from top markers can be included as covariates. Estimation of Bayesian networks may also be useful to investigate linkage disequilibrium among SNPs and their relation with environmental variables. For the quantitative trait we first estimated residuals while taking polygenic effects into account. We then used a single SNP approach to detect the most significant SNPs based on the residuals and applied principal component regression to take linkage disequilibrium among these SNPs into account. For the categorical trait we used principal component stratification methodology to account for background effects. For correction of linkage disequilibrium we used principal component logit regression. Bayesian networks were estimated to investigate relationship among SNPs. Results: Using the Genome-wide Rapid Association using Mixed Model and Regression and principal component stratification approach we detected around 100 significant SNPs for the quantitative trait (p Conclusions: GRAMMAR could efficiently incorporate the information regarding random genetic effects. Principal component stratification should be cautiously used with stringent multiple hypothesis testing correction to correct for ancestral stratification and association analyses for binary traits when there are systematic genetic effects such as half sib family structures. Bayesian networks are useful to investigate relationships among SNPs and environmental variables.
Institute of Scientific and Technical Information of China (English)
牛有国; 王守岩; 张玉海; 王兴邦; 张立藩
2004-01-01
Objective: To introduce a method to calculate cardiovascular age, a new, accurate and much simpler index for assessing cardiovascular autonomic regulatory function, based on statistical analysis of heart rate and blood pressure variability (HRV and BPV) and baroreflex sensitivity (BRS) data. Methods: Firstly, HRV and BPV of 89 healthy aviation personnel were analyzed by the conventional autoregressive (AR) spectral analysis and their spontaneous BRS was obtained by the sequence method. Secondly, principal component analysis was conducted over original and derived indices of HRV, BPV and BRS data and the relevant principal components, PCiorig and PCideri (i=1, 2, 3, ...) were obtained. Finally, the equation for calculating cardiovascular age was obtained by multiple regression with the chronological age being assigned as the dependent variable and the principal components significantly related to age as the regressors. Results: The first four principal components of original indices accounted for over 90% of total variance of the indices, so did the first three principal components of derived indices. So, these seven principal components could reflect the information of cardiovascular autonomic regulation which was embodied in the 17 indices of HRV, BPV and BRS exactly with a minimal loss of information. Of the seven principal components, PC2orig, PC4orig and PC2deri were negatively correlated with the chronological age (P<0.05), whereas the PC3orig was positively correlated with the chronological age (P<0.01). The cardiovascular age thus calculated from the regression equation was significantly correlated with the chronological age among the 89 aviation personnel (r=0.73, P<0.01). Conclusion: The cardiovascular age calculated based on a multivariate analysis of HRV, BPV and BRS could be regarded as a comprehensive indicator reflecting the age dependency of autonomic regulation of cardiovascular system in healthy aviation personnel.
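The regression step described in this abstract (component scores used as regressors for chronological age) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the function name `pc_regression` is hypothetical, and the first four components are used directly, whereas the abstract selects the components significantly related to age.

```python
import numpy as np

def pc_regression(X, y, n_components):
    """Regress y on the leading principal components of standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    # Ordinary least squares of y on the scores plus an intercept.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef, explained

rng = np.random.default_rng(0)
n = 89  # same sample size as the abstract; the values are synthetic
age = rng.uniform(20, 55, size=n)
# 17 hypothetical indices that each drift linearly with age, plus noise
X = np.outer(age, rng.normal(size=17)) + rng.normal(scale=5.0, size=(n, 17))

coef, predicted_age, explained = pc_regression(X, age, n_components=4)
r = np.corrcoef(predicted_age, age)[0, 1]
print(f"variance explained by 4 PCs: {explained[:4].sum():.2f}, r = {r:.2f}")
```

Because the scores are uncorrelated with one another, the regression coefficients do not suffer from the collinearity that regressing on the 17 raw indices would introduce.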
Modified Diatomaceous earth as a principal stationary phase component in TLC.
Ergül, Soner; Kadan, Imdat; Savaşci, Sahin; Ergül, Suzan
2005-09-01
Modified natural diatomaceous earth (DE) is a principal component of the stationary phase in normal thin-layer chromatography (TLC) applications and is mixed with commercial silica gel 60GF254 (Si-60GF254). Modification is carried out by flux calcination and refluxing with acid. Natural DE, modified DEs [flux calcinated (FC)DE and FCDE-I], and Si-60GF254 are characterized by scanning electron microscopy and Fourier-transform-IR spectroscopy. Particle size, specific surface area, pore distribution, pore volume, and surface hydroxyl group density parameters of materials are determined by various techniques. FCDE-I and Si-60GF254 are investigated for their usefulness in the stationary phase of TLC both individually and in composition. Commercially available red and blue ink samples are run on layers of Si-60GF254 and FCDE-I individually, and on various FCDE-I and Si-60GF254 mixtures. Butanol-ethanol-2M ammonia (3:1:1, v/v) and butanol-acetic acid-water (12:3:5, v/v) mixtures are used as mobile phases. The polarities of stationary phases decrease, and the retention factor (Rf) values of ink components increase when the FCDE-I content of the stationary phase increases. The properties of the stationary phase can be optimized by adding FCDE-I to Si-60GF254. This study may be useful in understanding both the systematic effects of stationary phase properties [e.g., specific surface area and surface hydroxyl group density, aOH(s)] and those of the mobile phase (e.g., polarity and acidity) on Rf values and the separability of components.
Boligon, A A; Vicente, I S; Vaz, R Z; Campos, G S; Souza, F R P; Carvalheiro, R; Albuquerque, L G
2016-12-01
Principal component analysis was applied to evaluate the variability and relationships among univariate breeding values predicted for 9 weaning and yearling traits, as well as suggest functions of the traits that would promote a particular breeding objective. Phenotypic and pedigree information from 600,132 Nelore animals was used. Genetic parameters and breeding values were obtained from univariate analyses of birth to weaning weight gain; weaning to yearling weight gain; conformation, finishing precocity, and muscling scores at weaning and at yearling; and yearling scrotal circumference. The principal component mainly associated with maturity (precocious vs. late animals) was used as a pseudophenotype in bivariate analyses with either adult weight or adult height of cows. Direct heritability estimates ranging from 0.19 ± 0.01 to 0.41 ± 0.01 indicate that these 9 traits are all heritable to varying degrees. Correlations between the breeding values for the various traits ranged from 0.14 to 0.88. Principal component analysis was performed on the standardized breeding values. The first 3 principal components attained the Kaiser criterion, retaining 48.06%, 18.03%, and 12.97% of the total breeding value variance, respectively. The first component was characterized by positive coefficients for all traits. The second component contrasted weaning traits with yearling traits. The third component represented a contrast between late maturity animals (better for weight gain and conformation) and early maturity animals (better for finishing precocity, muscling, and scrotal circumference). Thus, the first 3 components represent 3 different potential selection criteria. Selecting for the first principal component would identify animals with positive breeding values for all studied traits. The second principal component may be used to identify animals with higher or lower maturation rates (precocity). Animals with negative values in the third principal component are regarded
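The Kaiser criterion mentioned in this abstract retains the components of the correlation matrix whose eigenvalue exceeds 1. The sketch below is an illustration only, assuming synthetic values for 9 traits that share one common factor; the function name `kaiser_components` and all numbers are hypothetical and unrelated to the Nelore data.

```python
import numpy as np

def kaiser_components(values):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    corr = np.corrcoef(values, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                      # Kaiser criterion
    share = eigvals / eigvals.sum()           # fraction of total variance
    return eigvals, eigvals[keep], eigvecs[:, keep], share[keep]

rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))            # hypothetical shared factor
breeding_values = 0.7 * common + rng.normal(size=(500, 9))

all_eigvals, kept_eigvals, kept_vecs, kept_share = kaiser_components(breeding_values)
print("retained components:", len(kept_eigvals))
print("variance retained by each:", np.round(kept_share, 3))
```

For standardized traits the eigenvalues sum to the number of traits, so an eigenvalue above 1 marks a component that carries more variance than any single original trait.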
Directory of Open Access Journals (Sweden)
Matrone Giulia C
2012-06-01
Background: In spite of the advances made in the design of dexterous anthropomorphic hand prostheses, these sophisticated devices still lack adequate control interfaces which could allow amputees to operate them in an intuitive and close-to-natural way. In this study, an anthropomorphic five-fingered robotic hand, actuated by six motors, was used as a prosthetic hand emulator to assess the feasibility of a control approach based on Principal Components Analysis (PCA), specifically conceived to address this problem. Since it was demonstrated elsewhere that the first two principal components (PCs) can describe the whole hand configuration space sufficiently well, the controller here employed reverted the PCA algorithm and allowed a multi-DoF hand to be driven by combining a two-differential channels EMG input with these two PCs. Hence, the novelty of this approach stood in the PCA application for solving the challenging problem of best mapping the EMG inputs into the degrees of freedom (DoFs) of the prosthesis. Methods: A clinically viable two DoFs myoelectric controller, exploiting two differential channels, was developed and twelve able-bodied participants, divided into two groups, volunteered to control the hand in simple grasp trials, using forearm myoelectric signals. Task completion rates and times were measured. The first objective (assessed through one group of subjects) was to understand the effectiveness of the approach; i.e., whether it is possible to drive the hand in real-time, with reasonable performance, in different grasps, also taking advantage of the direct visual feedback of the moving hand. The second objective (assessed through a different group) was to investigate the intuitiveness, and therefore to assess statistical differences in the performance throughout three consecutive days. Results: Subjects performed several grasp, transport and release trials with differently shaped objects, by operating the hand with the myoelectric
Computing a Nonnegative Matrix Factorization -- Provably
Arora, Sanjeev; Kannan, Ravi; Moitra, Ankur
2011-01-01
In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $A W$ where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$ -- i.e. (approximately) minimize ...
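To make the approximate form of the problem concrete, the sketch below fits $M \approx AW$ with the classical multiplicative update heuristic for the Frobenius error. This is an assumption chosen for illustration and is not the provable algorithm of the paper; the function name `nmf_multiplicative`, the iteration count and the test matrix are all hypothetical.

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=500, eps=1e-12, seed=0):
    """Heuristic NMF: M (n x m) ~= A (n x r) @ W (r x m), all entries >= 0."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    A = rng.random((n, r)) + eps
    W = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

def relative_error(M, A, W):
    return np.linalg.norm(M - A @ W) / np.linalg.norm(M)

rng = np.random.default_rng(1)
M = rng.random((20, 3)) @ rng.random((3, 15))   # nonnegative, rank at most 3

A0, W0 = nmf_multiplicative(M, r=3, n_iter=0)   # random starting point only
A, W = nmf_multiplicative(M, r=3)
err_init, err_final = relative_error(M, A0, W0), relative_error(M, A, W)
print(f"relative error: {err_init:.3f} -> {err_final:.4f}")
```

The updates only find a local solution and give no guarantee of recovering an exact factorization, which is the gap that provable algorithms aim to close.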
Principal component analysis for the early detection of mastitis and lameness in dairy cows.
Miekley, Bettina; Traulsen, Imke; Krieter, Joachim
2013-08-01
This investigation analysed the applicability of principal component analysis (PCA), a latent variable method, for the early detection of mastitis and lameness. Data used were recorded on the Karkendamm dairy research farm between August 2008 and December 2010. For mastitis and lameness detection, data of 338 and 315 cows in their first 200 d in milk were analysed, respectively. Mastitis as well as lameness were specified according to veterinary treatments. Diseases were defined as disease blocks. The different definitions used (two for mastitis, three for lameness) varied solely in the sequence length of the blocks. Only the days before the treatment were included in the blocks. Milk electrical conductivity, milk yield and feeding patterns (feed intake, number of feeding visits and time at the trough) were used for recognition of mastitis. Pedometer activity and feeding patterns were utilised for lameness detection. To develop and verify the PCA model, the mastitis and the lameness datasets were divided into training and test datasets. PCA extracted uncorrelated principal components (PCs) by linear transformations of the raw data so that the first few PCs captured most of the variations in the original dataset. For process monitoring and disease detection, these resulting PCs were applied to the Hotelling's T² chart and to the residual control chart. The results show that block sensitivity of mastitis detection ranged from 77·4 to 83·3%, whilst specificity was around 76·7%. The error rates were around 98·9%. For lameness detection, the block sensitivity ranged from 73·8 to 87·8% while the obtained specificities were between 54·8 and 61·9%. The error rates varied from 87·8 to 89·2%. In conclusion, PCA seems to be not yet transferable into practical usage. Results could probably be improved if different traits and more informative sensor data are included in the analysis.
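The two monitoring statistics named in this abstract can be sketched as follows: Hotelling's T² measures how far an observation lies from the centre within the retained components, and the residual statistic (squared prediction error) measures how far it lies from the component subspace. The sketch assumes synthetic data for five hypothetical sensor traits and uses empirical 99th percentiles of the training data as control limits, which is a simplification and not necessarily the limits used in the study.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit a PCA monitoring model on standardized training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (len(Z) - 1)
    return {"mean": mean, "std": std, "P": loadings, "var": score_var}

def monitor(model, X):
    """Return Hotelling's T2 and the residual statistic (SPE) per row of X."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["P"]
    t2 = np.sum(scores ** 2 / model["var"], axis=1)
    residual = Z - scores @ model["P"].T
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
X_train = latent @ rng.normal(size=(2, 5)) + 0.3 * rng.normal(size=(300, 5))

model = fit_pca_monitor(X_train, n_components=2)
t2_train, spe_train = monitor(model, X_train)
t2_limit, spe_limit = np.percentile(t2_train, 99), np.percentile(spe_train, 99)

# Hypothetical abnormal day: one trait shifted by 20 standard deviations.
abnormal = model["mean"].copy()
abnormal[0] += 20 * model["std"][0]
t2_new, spe_new = monitor(model, abnormal[None, :])
alarm = (t2_new[0] > t2_limit) or (spe_new[0] > spe_limit)
print("alarm raised:", alarm)
```

An observation is flagged when either statistic exceeds its limit, so shifts inside the component subspace and shifts that break the usual correlation pattern are both detectable.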
Directory of Open Access Journals (Sweden)
Shaohui Foong
2016-08-01
In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.
Detecting Combustion and Flow Features In Situ Using Principal Component Analysis
Energy Technology Data Exchange (ETDEWEB)
Thompson, David [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Grout, Ray W. [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Fabian, Nathan D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States)
2009-03-01
This report presents progress on identifying and classifying features involving combustion in turbulent flow using principal component analysis (PCA) and k-means clustering using an in situ analysis framework. We describe a process for extracting temporally- and spatially-varying information from the simulation, classifying the information, and then applying the classification algorithm to either other portions of the simulation not used for training the classifier or further simulations. Because the regions classified as being of interest take up a small portion of the overall simulation domain, it will consume fewer resources to perform further analysis or save these regions at a higher fidelity than previously possible. The implementation of this process is partially complete, and results obtained from PCA of test data are presented that indicate the process may have merit: the basis vectors that PCA provides are significantly different in regions where combustion is occurring, and even when all 21 species of a lifted flame simulation are correlated the computational cost of PCA is minimal. What remains to be determined is whether k-means (or other) clustering techniques will be able to identify combined combustion and flow features with an accuracy that makes further characterization of these regions feasible and meaningful.
Spectral quantitation by principal component analysis using complex singular value decomposition.
Elliott, M A; Walter, G A; Swift, A; Vandenborne, K; Schotland, J C; Leigh, J S
1999-03-01
Principal component analysis (PCA) is a powerful method for quantitative analysis of nuclear magnetic resonance spectral data sets. It has the advantage of being model independent, making it well suited for the analysis of spectra with complicated or unknown line shapes. Previous applications of PCA have required that all spectra in a data set be in phase or have implemented iterative methods to analyze spectra that are not perfectly phased. However, improper phasing or imperfect convergence of the iterative methods has resulted in systematic errors in the estimation of peak areas with PCA. Presented here is a modified method of PCA, which utilizes complex singular value decomposition (SVD) to analyze spectral data sets with any amount of variation in spectral phase. The new method is shown to be completely insensitive to spectral phase. In the presence of noise, PCA with complex SVD yields a lower variation in the estimation of peak area than conventional PCA by a factor of approximately 2. The performance of the method is demonstrated with simulated data and in vivo 31P spectra from human skeletal muscle.
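The phase insensitivity claimed in this abstract can be illustrated with a minimal sketch. It assumes synthetic spectra that share one complex Lorentzian line shape and differ only in amplitude and in an arbitrary zero-order phase; the line width, noise level and variable names are hypothetical. With a complex SVD, the phase of each spectrum is absorbed into the complex score, so the magnitude of the first score tracks the amplitude (and hence the peak area) regardless of phasing.

```python
import numpy as np

rng = np.random.default_rng(1)
freq = np.linspace(-50, 50, 512)
lineshape = 1.0 / (1.0 + 1j * freq / 2.0)        # hypothetical complex Lorentzian

n_spectra = 30
amplitudes = rng.uniform(1.0, 5.0, size=n_spectra)
phases = rng.uniform(-np.pi, np.pi, size=n_spectra)   # arbitrary phase errors
noise = 0.05 * (rng.normal(size=(n_spectra, freq.size))
                + 1j * rng.normal(size=(n_spectra, freq.size)))

# Each row is one spectrum: amplitude * exp(i * phase) * line shape + noise.
D = (amplitudes * np.exp(1j * phases))[:, None] * lineshape[None, :] + noise

# Complex SVD: the first left singular vector carries amplitude and phase.
U, s, Vh = np.linalg.svd(D, full_matrices=False)
scores = U[:, 0] * s[0]
estimated = np.abs(scores)                        # magnitude discards the phase

ratio = estimated / amplitudes
print("ratio of estimate to true amplitude (should be nearly constant):")
print(np.round(ratio[:5], 3))
```

The constant ratio equals the norm of the line shape, so a single calibration factor converts the score magnitudes into amplitudes. Running an SVD on the real part of `D` alone would instead mix absorption and dispersion components in proportions that depend on each spectrum's phase.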
Sensory characterization of doda burfi (Indian milk cake) using Principal Component Analysis.
Chawla, Rekha; Patil, Girdhari Ramdas; Singh, Ashish Kumar
2014-03-01
Traditional sweetmeats of various countries hold great scope for improvement, and to tap this potential several companies and co-operative federations have started their organized production. Doda burfi, a heat-desiccated and popular sweetmeat of northern India, is one of the region-specific, little-known products of India. The typical sweetmeat is characterized by caramelized and nutty flavour and granular texture. The purpose of this study was to determine the close relationship among various sensory attributes of the product collected from renowned manufacturers located in four different cities and to characterize an overall acceptable product. Individuals from academia participated in a round table discussion to generate descriptive terms related to colour and appearance, flavour and texture. Prior to sensory evaluation, the sensory panel was trained and briefed about the terminology used to judge the product, involving a descriptive intensity scale of 100 points for describing major sensory attributes. Results were analyzed using ANOVA and principal component analysis. The correlation table indicated a good degree of positive association between attributes such as glossy appearance, dark colour, caramelized and nutty flavour, and cohesive and chewy texture and the overall acceptability of the product.
Hirayama, Jun-ichiro; Hyvärinen, Aapo; Kiviniemi, Vesa; Kawanabe, Motoaki; Yamashita, Okito
2016-01-01
Characterizing the variability of resting-state functional brain connectivity across subjects and/or over time has recently attracted much attention. Principal component analysis (PCA) serves as a fundamental statistical technique for such analyses. However, performing PCA on high-dimensional connectivity matrices yields complicated “eigenconnectivity” patterns, for which systematic interpretation is a challenging issue. Here, we overcome this issue with a novel constrained PCA method for connectivity matrices by extending the idea of the previously proposed orthogonal connectivity factorization method. Our new method, modular connectivity factorization (MCF), explicitly introduces the modularity of brain networks as a parametric constraint on eigenconnectivity matrices. In particular, MCF analyzes the variability in both intra- and inter-module connectivities, simultaneously finding network modules in a principled, data-driven manner. The parametric constraint provides a compact module-based visualization scheme with which the result can be intuitively interpreted. We develop an optimization algorithm to solve the constrained PCA problem and validate our method in simulation studies and with a resting-state functional connectivity MRI dataset of 986 subjects. The results show that the proposed MCF method successfully reveals the underlying modular eigenconnectivity patterns in more general situations and is a promising alternative to existing methods. PMID:28002474
Energy Technology Data Exchange (ETDEWEB)
Lu, Wei-Zhen [Department of Building and Construction, City University of Hong Kong (China); He, Hong-Di [Department of Building and Construction, City University of Hong Kong (China); Logistics Research Center, Shanghai Maritime University, Shanghai (China); Dong, Li-yun [Shanghai Institute of Applied Mathematics and Mechanics, Shanghai University, Shanghai (China)
2011-03-15
This study aims to evaluate the performance of two statistical methods, principal component analysis and cluster analysis, for the management of the air quality monitoring network of Hong Kong and the reduction of associated expenses. The specific objectives include: (i) to identify city areas with similar air pollution behavior; and (ii) to locate emission sources. The statistical methods were applied to the mass concentrations of sulphur dioxide (SO₂), respirable suspended particulates (RSP) and nitrogen dioxide (NO₂), collected in the monitoring network of Hong Kong from January 2001 to December 2007. The results demonstrate that, for each pollutant, the monitoring stations are grouped into different classes based on their air pollution behaviors. Monitoring stations located in nearby areas share the same air pollution characteristics, which suggests that the air quality monitoring system could be managed more effectively. The redundant equipment should be transferred to other monitoring stations to allow further enlargement of the monitored area. Additionally, the existence of different air pollution behaviors in the monitoring network is explained by the variability of wind directions across the region. The results imply that the air quality problem in Hong Kong is not only a local problem arising mainly from street-level pollution, but also a regional problem originating from the Pearl River Delta region.
Sustainability Assessment of the Natural Gas Industry in China Using Principal Component Analysis
Directory of Open Access Journals (Sweden)
Xiucheng Dong
2015-05-01
Under pressure toward carbon emission reduction and air protection, China has accelerated energy restructuring by greatly improving the supply and consumption of natural gas in recent years. However, several issues with the sustainable development of the natural gas industry in China still need in-depth discussion. Therefore, based on the fundamental ideas of sustainable development, industrial development theories and features of the natural gas industry, a sustainable development theory is proposed in this thesis. The theory consists of five parts: resource, market, enterprise, technology and policy. The five parts, which unite for mutual connection and promotion, push the gas industry’s development forward together. Furthermore, based on the theoretical structure, the Natural Gas Industry Sustainability Index in China is established and evaluated via the Principal Component Analysis (PCA) method. Finally, a conclusion is reached: that the sustainability of the natural gas industry in China kept rising from 2008 to 2013, mainly benefiting from increasing supply and demand, the enhancement of enterprise profits, technological innovation, policy support and the optimization and reformation of the gas market.
Dultzin-Hacyan, D.; Ruano, C.
1996-01-01
We present a statistical study including principal component analysis (PCA) of multiwavelength properties of types 1 and 2 Seyfert galaxies. We have applied PCA to an ensemble of X-ray, optical, near and far infrared, and radio data of Seyfert galaxies. We used the Lipovetsky et al. (1987) catalog, which provides the largest list of Seyfert galaxies with multiwavelength data. Our main result is that the Spectral Energy Distribution (SED) of Seyfert 1 galaxies is well accounted for by one and only one underlying variable, at least to a first approximation. On the other hand, in the case of Seyfert 2 galaxies, at least three variables are required. Several details of the analysis lead us to the following interpretation of this result: In the case of Seyfert 1 galaxies, the main process at the origin of radiation is the release of energy of gravitational origin by accretion onto a supermassive black hole. In the case of Seyfert 2 galaxies, there are other important processes apart from energy of gravitational origin, which we may identify with stellar and interstellar radiation (mainly dust absorption and re-emission) from the circumnuclear region. In the framework of this interpretation the analysis reveals that the variance in luminosity related to radiation of stellar/interstellar origin in no case exceeds ~13% for Seyfert 1 galaxies. In contrast, for Seyfert 2 galaxies the radiation of stellar/interstellar origin can account for ~46% of the variance in certain luminosities.
Directory of Open Access Journals (Sweden)
Sauvik Das Gupta
2012-12-01
This paper deals with the recognition of different hand gestures through machine learning approaches and principal component analysis. A Bio-Medical signal amplifier is built after doing a software simulation with the help of NI Multisim. At first a couple of surface electrodes are used to obtain the Electro-Myo-Gram (EMG) signals from the hands. These signals from the surface electrodes have to be amplified with the help of the Bio-Medical Signal amplifier. The Bio-Medical Signal amplifier used is basically an Instrumentation amplifier made with the help of IC AD 620. The output from the Instrumentation amplifier is then filtered with the help of a suitable Band-Pass Filter. The output from the Band-Pass Filter is then fed to an Analog to Digital Converter (ADC), which in this case is the NI USB 6008. The data from the ADC is then fed into a suitable algorithm which helps in recognition of the different hand gestures. The algorithm analysis is done in MATLAB. The results presented in this paper show a classification rate close to one hundred per cent (100%) for three given hand gestures.
Principal component analysis of the cytokine and chemokine response to human traumatic brain injury.
Directory of Open Access Journals (Sweden)
Adel Helmy
There is a growing realisation that neuro-inflammation plays a fundamental role in the pathology of Traumatic Brain Injury (TBI). This has led to the search for biomarkers that reflect these underlying inflammatory processes using techniques such as cerebral microdialysis. The interpretation of such biomarker data has been limited by the statistical methods used. When analysing data of this sort the multiple putative interactions between mediators need to be considered as well as the timing of production and high degree of statistical co-variance in levels of these mediators. Here we present a cytokine and chemokine dataset from human brain following traumatic brain injury and use principal component analysis and partial least squares discriminant analysis to demonstrate the pattern of production following TBI, distinct phases of the humoral inflammatory response and the differing patterns of response in brain and in peripheral blood. This technique has the added advantage of making no assumptions about the Relative Recovery (RR) of microdialysis derived parameters. Taken together these techniques can be used in complex microdialysis datasets to summarise the data succinctly and generate hypotheses for future study.
Lamb wave feature extraction using discrete wavelet transformation and Principal Component Analysis
Ghodsi, Mojtaba; Ziaiefar, Hamidreza; Amiryan, Milad; Honarvar, Farhang; Hojjat, Yousef; Mahmoudi, Mehdi; Al-Yahmadi, Amur; Bahadur, Issam
2016-04-01
In this research, a new method is presented for extracting suitable features for recognizing and classifying types of defects with guided ultrasonic waves. After applying suitable preprocessing, the suggested method extracts the base frequency band from the received signals by discrete wavelet transform and discrete Fourier transform. This frequency band can be used as a distinctive feature of ultrasonic signals for different defects. Principal Component Analysis, by refining this feature and discarding redundant data, improved the classification. In this study, an ultrasonic test with the A0 Lamb wave mode is used, as it is appropriate for reducing the difficulties of the problem. The defects under analysis included corrosion, crack and local thickness reduction. The last defect is caused by electro discharge machining (EDM). The classification results from an optimized neural network show that the presented method can differentiate different defects with 95% precision and is thus a strong and efficient method. Moreover, comparing the extracted features for corrosion and local thickness reduction, together with the results of their classification, shows that modeling the corrosion process by local thickness reduction, which was previously common, is not an appropriate method, and that the signals received from the two defects differ from each other.
Improved Principal Component Analysis for Anomaly Detection: Application to an Emergency Department
Harrou, Fouzi
2015-07-03
Monitoring of production systems, such as those in hospitals, is essential for ensuring the best management and maintaining the desired product quality. Detection of emergent abnormalities allows preemptive actions that can prevent more serious consequences. The principal component analysis (PCA)-based anomaly-detection approach has been used successfully for monitoring systems with highly correlated variables. However, conventional PCA-based detection indices, such as Hotelling's T² and the Q statistics, are ill suited to detecting small abnormalities because they use only information from the most recent observations. Other multivariate statistical metrics, such as the multivariate cumulative sum (MCUSUM) control scheme, are more suitable for detecting small anomalies. In this paper, a generic anomaly detection scheme based on PCA is proposed to monitor demands to an emergency department. In such a framework, the MCUSUM control chart is applied to the uncorrelated residuals obtained from the PCA model. The proposed PCA-based MCUSUM anomaly detection strategy is successfully applied to practical data collected from the database of the pediatric emergency department in the Lille Regional Hospital Centre, France. The detection results show that the proposed method is more effective than the conventional PCA-based anomaly-detection methods.
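As a rough illustration of the conventional PCA-based indices mentioned in this abstract, the sketch below computes Hotelling's T² and the Q (squared prediction error) statistic with NumPy. It is a minimal sketch under stated assumptions, not the authors' MCUSUM scheme: the function names `fit_pca_monitor` and `t2_and_q` are hypothetical, and standardizing each variable before the decomposition is a choice made here for illustration.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit a PCA model on in-control data (rows = observations)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)   # assumes no constant columns
    Z = (X_train - mean) / std
    # SVD of the standardized data yields loadings and component variances.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                      # loadings, shape (m, k)
    lam = s[:n_components] ** 2 / (len(Z) - 1)   # variances of retained PCs
    return mean, std, P, lam

def t2_and_q(X, model):
    """Return Hotelling's T^2 and the Q statistic for each row of X."""
    mean, std, P, lam = model
    Z = (X - mean) / std
    T = Z @ P                              # scores in the retained subspace
    t2 = np.sum(T ** 2 / lam, axis=1)      # variation inside the PCA model
    resid = Z - T @ P.T                    # part not explained by the model
    q = np.sum(resid ** 2, axis=1)         # squared prediction error
    return t2, q
```

Each observation is scored on its own, which is why such indices react slowly to small persistent shifts; a cumulative scheme applied to the residuals, as the abstract proposes, accumulates evidence over time instead.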
In-TFT-Array-Process Micro Defect Inspection Using Nonlinear Principal Component Analysis
Directory of Open Access Journals (Sweden)
Zhi-Hao Kang
2009-10-01
Defect inspection plays a critical role in thin film transistor liquid crystal display (TFT-LCD) manufacture, and has received much attention in the field of automatic optical inspection (AOI). Previously, most focus was put on the problems of macro-scale Mura-defect detection in cell process, but it has recently been found that the defects which substantially influence the yield rate of LCD panels are actually those in the TFT array process, which is the first process in TFT-LCD manufacturing. Defect inspection in TFT array process is therefore considered a difficult task. This paper presents a novel inspection scheme based on the kernel principal component analysis (KPCA) algorithm, which is a nonlinear version of the well-known PCA algorithm. The inspection scheme can not only detect the defects from the images captured from the surface of LCD panels, but also recognize the types of the detected defects automatically. Results, based on real images provided by an LCD manufacturer in Taiwan, indicate that the KPCA-based defect inspection scheme is able to achieve a defect detection rate of over 99% and a high defect classification rate of over 96% when the imbalanced support vector machine (ISVM) with 2-norm soft margin is employed as the classifier. More importantly, the inspection time is less than 1 s per input image.
PRINCIPAL COMPONENT ANALYSIS AND CLUSTER ANALYSIS IN MULTIVARIATE ASSESSMENT OF WATER QUALITY
Directory of Open Access Journals (Sweden)
Elzbieta Radzka
2017-03-01
This paper deals with the use of multivariate methods in drinking water analysis. During a five-year project, from 2008 to 2012, selected chemical parameters in 11 water supply networks of the Siedlce County were studied. Throughout that period drinking water was of satisfactory quality, with only iron and manganese ions exceeding the limits (21 times and 12 times, respectively). In accordance with the results of cluster analysis, all water networks were put into three groups of different water quality. A high concentration of chlorides, sulphates, and manganese and a low concentration of copper and sodium were found in the water of Group 1 supply networks. The water in Group 2 had a high concentration of copper and sodium, and a low concentration of iron and sulphates. The water from Group 3 had a low concentration of chlorides and manganese, but a high concentration of fluorides. Using principal component analysis and cluster analysis, multivariate correlation between the studied parameters was determined, helping to put water supply networks into groups according to similar water quality.
Bidimensional and Multidimensional Principal Component Analysis in Long Term Atmospheric Monitoring
Directory of Open Access Journals (Sweden)
Barbara Giussani
2016-12-01
Atmospheric monitoring produces huge amounts of data. Univariate and bivariate statistics are widely used to investigate variations in the parameters. To summarize information, graphs are usually used in the form of histograms or tendency profiles (e.g., variable concentration vs. time), as well as bidimensional plots where two-variable correlations are considered. However, when dealing with big data sets at least two problems arise: a great quantity of numbers (statistics) and graphs are produced, and only two-variable interactions are often considered. The aim of this article is to show how the use of multivariate statistics helps in handling atmospheric data sets. Multivariate modeling considers all the variables simultaneously and returns the main results as bidimensional graphs that are easy to read. Principal Component Analysis (PCA; the best-known multivariate method) and multiway-PCA (Tucker3) are compared from methodological and interpretative points of view. The article demonstrates the ability to emphasize different information depending on the data handling performed. The results and benefits achieved using a more complex model that allows for the simultaneous consideration of the entire variability of the system are compared with the results provided by the simpler but better-known model. Atmospheric monitoring (SO2, NOx, NO2, NO, and O3) data from the Lake Como Area (Italy) from 1992 to 2007 were chosen for the case study.
Dai, Yimian; Wu, Yiquan; Song, Yu
2016-07-01
When facing an extremely complex infrared background, due to the defect of the l1 norm based sparsity measure, the state-of-the-art infrared patch-image (IPI) model faces a dilemma in which either the dim targets are over-shrunk in the separation or the strong cloud edges remain in the target image. In order to suppress the strong edges while preserving the dim targets, a weighted infrared patch-image (WIPI) model is proposed, incorporating structural prior information into the process of infrared small target and background separation. Instead of adopting a global weight, we allocate an adaptive weight to each column of the target patch-image according to its patch structure. Then the proposed WIPI model is converted to a column-wise weighted robust principal component analysis (CWRPCA) problem. In addition, a target unlikelihood coefficient is designed based on the steering kernel, serving as the adaptive weight for each column. Finally, in order to solve the CWRPCA problem, a solution algorithm is developed based on the Alternating Direction Method (ADM). Detailed experimental results demonstrate that the proposed method achieves a significant improvement over the other nine classical or state-of-the-art methods in terms of subjective visual quality, quantitative evaluation indexes and convergence rate.
Hoell, S.; Omenzetter, P.
2015-07-01
The utilization of vibration signals for structural damage detection (SDD) is appealing due to the strong theoretical foundation of such approaches, ease of data acquisition and processing efficiency. Different methods are available for defining damage sensitive features (DSFs) based on vibrations, such as modal analysis or time series methods. The present paper proposes the use of partial autocorrelation coefficients of acceleration responses as DSFs. Principal component (PC) analysis is used to transform the initial DSFs to scores. The resulting scores from the healthy and damaged states are used to select the PCs which are most sensitive to damage. These are then used for making decisions about the structural state by means of statistical hypothesis testing conducted on the scores. The approach is applied to experiments with a laboratory scale wind turbine blade (WTB) made of glass-fibre reinforced epoxy composite. Damage is non-destructively simulated by attaching small masses and the WTB is excited with the help of an electrodynamic shaker using band-limited white noise. The SDD results for the selected subsets of PCs show a clear improvement of the detectability of early damages compared to other DSF selections.
Moving object detection based on on-line block-robust principal component analysis decomposition
Yang, Biao; Cao, Jinmeng; Zou, Ling
2017-07-01
Robust principal component analysis (RPCA) decomposition is widely applied in moving object detection due to its ability in suppressing environmental noises while separating sparse foreground from low rank background. However, it may suffer from constant punishing parameters (resulting in confusion between foreground and background) and holistic processing of all input frames (leading to bad real-time performance). Improvements to these issues are studied in this paper. A block-RPCA decomposition approach was proposed to handle the confusion while separating foreground from background. Input frame was initially separated into blocks using three-frame difference. Then, punishing parameter of each block was computed by its motion saliency acquired based on selective spatio-temporal interesting points. Aiming to improve the real-time performance of the proposed method, an on-line solution to block-RPCA decomposition was utilized. Both qualitative and quantitative tests were implemented and the results indicate the superiority of our method to some state-of-the-art approaches in detection accuracy or real-time performance, or both of them.
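To make the low-rank plus sparse separation concrete, here is a minimal sketch of plain RPCA solved with an inexact augmented Lagrangian iteration. It is not the block-wise or on-line variant described in the abstract above: the function name `rpca_pcp`, the default weight `lam = 1/sqrt(max(m, n))`, and the step-size schedule are assumptions taken from common practice, and the input is assumed to be a nonzero matrix whose columns are vectorized frames.

```python
import numpy as np

def soft_threshold(M, tau):
    """Element-wise shrinkage operator."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca_pcp(D, lam=None, tol=1e-7, max_iter=200):
    """Split D into low-rank L (background) and sparse S (foreground)."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # single constant penalty weight
    norm_D = np.linalg.norm(D, "fro")
    spec = np.linalg.norm(D, 2)          # largest singular value of D
    Y = D / max(spec, np.abs(D).max() / lam)   # dual variable initialization
    mu, rho = 1.25 / spec, 1.5           # penalty parameter and growth rate
    S = np.zeros_like(D)
    L = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values.
        U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Sparse update: shrink individual entries.
        S = soft_threshold(D - L + Y / mu, lam / mu)
        R = D - L - S                    # constraint residual
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R, "fro") <= tol * norm_D:
            break
    return L, S
```

The single global `lam` in this sketch corresponds to the constant penalty that the abstract identifies as a weakness, and the full SVD over all frames at every iteration is the holistic processing that an on-line solver is meant to avoid.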
PRINCIPAL COMPONENT ANALYSIS OF FACTORS DETERMINING PHOSPHATE ROCK DISSOLUTION ON ACID SOILS
Directory of Open Access Journals (Sweden)
Yusdar Hilman
2016-10-01
Many of the agricultural soils in Indonesia are acidic and low in both total and available phosphorus, which severely limits their potential for crop production. These problems can be corrected by application of chemical fertilizers. However, these fertilizers are expensive, and cheaper alternatives such as phosphate rock (PR) have been considered. Several soil factors may influence the dissolution of PR in soils, including both chemical and physical properties. The study aimed to identify PR dissolution factors and evaluate their relative magnitude. The experiment was conducted in the Soil Chemical Laboratory, Universiti Putra Malaysia and the Indonesian Center for Agricultural Land Resources Research and Development from January to April 2002. Principal component analysis (PCA) was used to characterize acid soils in an incubation system into a number of factors that may affect PR dissolution. Three major factors selected were soil texture, soil acidity, and fertilization. Using the scores of individual factors as independent variables, stepwise regression analysis was performed to derive a PR dissolution function. The factors influencing PR dissolution in order of importance were soil texture, soil acidity, then fertilization. Soil texture factors including clay content and organic C, and soil acidity factors such as P retention capacity, interacted positively with P dissolution and promoted PR dissolution effectively. Soil texture factors such as sand and silt content, and soil acidity factors such as pH and exchangeable Ca, decreased PR dissolution.
Keithley, Richard B; Wightman, R Mark
2011-06-07
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.
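For readers unfamiliar with the procedure, principal component regression can be sketched in a few lines of NumPy. This is a generic illustration, not the authors' implementation: the names `pcr_fit` and `pcr_predict` are hypothetical, rows of `A` are assumed to be training signals (for example, voltammograms), and rows of `C` the corresponding known concentrations.

```python
import numpy as np

def pcr_fit(A, C, n_components):
    """Fit PCR. A: (n_samples, n_points) signals, C: (n_samples, n_analytes)."""
    a_mean = A.mean(axis=0)
    c_mean = C.mean(axis=0)
    # Principal components of the centered training signals.
    _, _, Vt = np.linalg.svd(A - a_mean, full_matrices=False)
    V = Vt[:n_components].T                  # retained components
    scores = (A - a_mean) @ V                # training data in PC space
    # Least-squares regression of concentrations on the scores.
    coef, *_ = np.linalg.lstsq(scores, C - c_mean, rcond=None)
    B = V @ coef     # regression vectors expressed in the signal domain
    return a_mean, c_mean, B

def pcr_predict(A_new, model):
    """Predict concentrations for new signals."""
    a_mean, c_mean, B = model
    return (A_new - a_mean) @ B + c_mean
```

Each column of `B` has the same length as a measured signal, so it can be plotted like one; this is the kind of regression-vector representation that the abstract describes as a check on whether the model is chemically appropriate.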
Toupin, S.; de Senneville, B. Denis; Ozenne, V.; Bour, P.; Lepetit-Coiffe, M.; Boissenin, M.; Jais, P.; Quesson, B.
2017-02-01
The use of magnetic resonance (MR) thermometry for the monitoring of thermal ablation is rapidly expanding. However, this technique remains challenging for the monitoring of the treatment of cardiac arrhythmia by radiofrequency ablation due to the heart displacement with respiration and contraction. Recent studies have addressed this problem by compensating in-plane motion in real-time with optical-flow based tracking technique. However, these algorithms are sensitive to local variation of signal intensity on magnitude images associated with tissue heating. In this study, an optical-flow algorithm was combined with a principal component analysis method to reduce the impact of such effects. The proposed method was integrated to a fully automatic cardiac MR thermometry pipeline, compatible with a future clinical workflow. It was evaluated on nine healthy volunteers under free breathing conditions, on a phantom and in vivo on the left ventricle of a sheep. The results showed that local intensity changes in magnitude images had lower impact on motion estimation with the proposed method. Using this strategy, the temperature mapping accuracy was significantly improved.
Kalinova, Veselina; Rosolowsky, Erik; van de Ven, Glenn; Lyubenova, Mariya; Falcón-Barroso, Jesus; Kannan, Rahul; Läsker, Ronald; Galbany, Lluís; García-Benito, Rubén; Delgado, Rosa González; Sánchez, Sebastian F; Ruiz-Lara, Tomás
2015-01-01
We present a dynamical classification system for galaxies based on the shapes of their circular velocity curves (CVCs). We derive the CVCs of 40 SAURON and 42 CALIFA galaxies across Hubble sequence via a full line-of-sight integration as provided by solutions of the axisymmetric Jeans equations. We use Principal Component Analysis (PCA) applied to the circular curve shapes to find characteristic features and use a k-means classifier to separate circular curves into classes. This objective classification method identifies four different classes, which we name Slow-Rising (SR), Flat (F), Sharp-Peaked (SP) and Round-Peaked (RP) circular curves. SR-CVCs are mostly represented by late-type spiral galaxies (Scd-Sd) with no prominent spheroids in the central parts and slowly rising velocities; F-CVCs span almost all morphological types (E,S0,Sab,Sb-Sbc) with flat velocity profiles at almost all radii; SP-CVCs are represented by early-type and early-type spiral galaxies (E,S0,Sb-Sbc) with prominent spheroids and shar...
Institute of Scientific and Technical Information of China (English)
何宁; 王树青; 谢磊
2004-01-01
Multi-way principal component analysis (MPCA) has been successfully applied to monitoring batch and semi-batch processes in much of the chemical industry. An improved MPCA approach, step-by-step adaptive MPCA (SAMPCA), which uses the process variable trajectories to monitor the batch process, is presented in this paper. It does not need to estimate or fill in the unknown part of the process variable trajectory deviation from the current time until the end. The approach is based on an MPCA method that processes the data in a sequential and adaptive manner. The adaptive rate is easily controlled through a forgetting factor that controls the weight of past data in a summation. This algorithm is used to evaluate industrial streptomycin fermentation process data and is compared with traditional MPCA. The results show that the method is more advantageous than MPCA, especially when monitoring a multi-stage batch process in which the latent vector structure can change at several points during the batch.
Manojlovic, D; Lenhardt, L; Milićević, B; Antonov, M; Miletic, V; Dramićanin, M D
2015-10-09
Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola's ability to stain the composite to a small degree.
Directory of Open Access Journals (Sweden)
Francisco Criado-Aldeanueva
2013-01-01
Two different paradigms of the Mediterranean Oscillation (MO) teleconnection index have been compared in this work: station-based definitions obtained by the difference of some climate variable between two selected points in the eastern and western basins (i.e., Algiers and Cairo, Gibraltar and Israel, Marseille and Jerusalem, or south France and Levantine basin) and the principal component (PC) approach in which the index is obtained as the time series of the first mode of normalised sea level pressure anomalies across the extended Mediterranean region. Interannual to interdecadal precipitation (P), evaporation (E), E-P, and net heat flux have been correlated with the different MO indices to compare their relative importance in the long-term variability of heat and freshwater budgets over the Mediterranean Sea. On an annual basis, the PC paradigm is the most effective tool to assess the effect of the large-scale atmospheric forcing in the Mediterranean Sea because the station-based indices exhibit a very poor correlation with all climatic variables and only influence a reduced fraction of the basin. In winter, the station-based indices greatly improve their ability to represent the atmospheric forcing and results are fairly independent of the paradigm used.
Wang, Xue Z; Yang, Yang; Li, Ruifa; McGuinnes, Catherine; Adamson, Janet; Megson, Ian L; Donaldson, Kenneth
2014-08-01
Structure toxicity relationship analysis was conducted using principal component analysis (PCA) for a panel of nanoparticles that included dry powders of oxides of titanium, zinc, cerium and silicon, dry powders of silvers, suspensions of polystyrene latex beads and dry particles of carbon black, nanotubes and fullerene, as well as diesel exhaust particles. Acute in vitro toxicity was assessed by different measures of cell viability, apoptosis and necrosis, haemolytic effects and the impact on cell morphology, while structural properties were characterised by particle size and size distribution, surface area, morphology, metal content, reactivity, free radical generation and zeta potential. Different acute toxicity measures were processed using PCA that classified the particles and identified four materials with an acute toxicity profile: zinc oxide, polystyrene latex amine, nanotubes and nickel oxide. PCA and contribution plot analysis then focused on identifying the structural properties that could determine the acute cytotoxicity of these four materials. It was found that metal content was an explanatory variable for acute toxicity associated with zinc oxide and nickel oxide, while high aspect ratio appeared the most important feature in nanotubes. Particle charge was considered as a determinant for high toxicity of polystyrene latex amine.
Multiple linear and principal component regressions for modelling ecotoxicity bioassay response.
Gomes, Ana I; Pires, José C M; Figueiredo, Sónia A; Boaventura, Rui A R
2014-01-01
The ecotoxicological response of the living organisms in an aquatic system depends on the physical, chemical and bacteriological variables, as well as the interactions between them. An important challenge to scientists is to understand the interaction and behaviour of factors involved in a multidimensional process such as the ecotoxicological response. With this aim, multiple linear regression (MLR) and principal component regression were applied to the ecotoxicity bioassay response of Chlorella vulgaris and Vibrio fischeri in water collected at seven sites of Leça river during five monitoring campaigns (February, May, June, August and September of 2006). The river water characterization included the analysis of 22 physicochemical and 3 microbiological parameters. The model that best fitted the data was MLR, which shows: (i) a negative correlation with dissolved organic carbon, zinc and manganese, and a positive one with turbidity and arsenic, regarding C. vulgaris toxic response; (ii) a negative correlation with conductivity and turbidity and a positive one with phosphorus, hardness, iron, mercury, arsenic and faecal coliforms, concerning V. fischeri toxic response. This integrated assessment may allow the evaluation of the effect of future pollution abatement measures over the water quality of Leça River.
Foong, Shaohui; Sun, Zhenglong
2016-01-01
In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison. PMID:27529253
Directory of Open Access Journals (Sweden)
S. Prabhu
2014-06-01
A carbon nanotube (CNT) mixed grinding wheel has been used in the electrolytic in-process dressing (ELID) grinding process to analyze the surface characteristics of AISI D2 tool steel. The CNT grinding wheel has excellent thermal conductivity and good mechanical properties, which are used to improve the surface finish of the workpiece. Grey relational analysis coupled with principal component analysis has been used for multiobjective optimization of the process parameters of the ELID grinding process. Based on the Taguchi design of experiments, an L9 orthogonal array table was chosen for the experiments. The confirmation experiment verifies that the proposed grey-based Taguchi method is able to find the optimal process parameters for the multiple quality characteristics of surface roughness and metal removal rate. Analysis of variance (ANOVA) has been used to verify and validate the model. An empirical model for predicting the output parameters has been developed using regression analysis, and the results were compared with and without the CNT grinding wheel in the ELID grinding process.
Principal component analysis of proteolytic profiles as markers of authenticity of PDO cheeses.
Guerreiro, Joana Santos; Barros, Mário; Fernandes, Paulo; Pires, Preciosa; Bardsley, Ronald
2013-02-15
The casein fractions of 13 Portuguese PDO cheeses were analysed using Urea-PAGE and reverse phase-high performance liquid chromatography (RP-HPLC) and then subjected to chemometric evaluation. The chemometric techniques of cluster analysis (CA) and principal component analysis (PCA) were used for the classification studies. Peptide mapping using Urea-PAGE followed by CA revealed two major clusters according to the similarity of the proteolytic profile of the cheeses. PCA results were in accordance with the grouping performed using CA. CA of RP-HPLC results of the matured cheeses revealed the presence of one major cluster comprising samples manufactured with only ovine milk or milk admixtures. When the results of the CA technique were compared with the two PCA approaches performed, it was found that the grouping of the samples was similar. Both approaches revealed the potential of proteolytic profiles (an essential aspect of cheese maturation) as markers of authenticity of PDO cheeses in terms of ripening time and milk admixtures not mentioned on the label.
Xie, Zhongliu; Kitamoto, Asanobu; Tamura, Masaru; Shiroishi, Toshihiko; Gillies, Duncan
2016-03-01
Intensive international efforts are underway towards phenotyping the mouse genome, by knocking out each of its ≈25,000 genes one-by-one for comparative study. With vast amounts of data to analyze, the traditional method using time-consuming histological examination is clearly impractical, leading to an overwhelming demand for a high-throughput phenotyping framework, especially one employing biomedical image informatics to efficiently identify phenotypes concerning morphological abnormality. Existing work has either relied excessively on volumetric analytics, which are insensitive to phenotypes associated with no severe volume variations, or been tailored to specific defects and thus fails to serve a general phenotyping purpose. Furthermore, the prevailing requirement of an atlas for image segmentation, in contrast to its limited availability, further complicates the issue in practice. In this paper we propose a high-throughput general-purpose phenotyping framework that is able to efficiently perform batch-wise anomaly detection without prior knowledge of the phenotype and without the need for atlas-based segmentation. Anomaly detection is centered on the combined use of group-wise non-rigid image registration and robust principal component analysis (RPCA) for feature extraction and decomposition.
Institute of Scientific and Technical Information of China (English)
Ferran Reverter; Esteban Vegas; Pedro Sánchez
2010-01-01
The detection of genes that show similar profiles under different experimental conditions is often an initial step in inferring the biological significance of such genes. Visualization tools are used to identify genes with similar profiles in microarray studies. Given the large number of genes recorded in microarray experiments, gene expression data are generally displayed on a low dimensional plot, based on linear methods. However, microarray data show nonlinearity, due to high-order terms of interaction between genes, so alternative approaches, such as kernel methods, may be more appropriate. We introduce a technique that combines kernel principal component analysis (KPCA) and Biplot to visualize gene expression profiles. Our approach relies on the singular value decomposition of the input matrix and incorporates an additional step that involves KPCA. The main properties of our method are the extraction of nonlinear features and the preservation of the input variables (genes) in the output display. We apply this algorithm to colon tumor, leukemia and lymphoma datasets. Our approach reveals the underlying structure of the gene expression profiles and provides a more intuitive understanding of the gene and sample association.
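The KPCA step on its own can be sketched as follows. This is a minimal illustration under assumptions, not the combined KPCA and Biplot method of the abstract: the Gaussian (RBF) kernel, its `gamma` parameter, and the function names `rbf_kernel` and `kernel_pca` are choices made here for the example.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian kernel matrix between all pairs of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma):
    """Return nonlinear component scores of the samples and their eigenvalues."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Double-centering the kernel matrix centers the data in feature space.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    eigvals, eigvecs = np.linalg.eigh(Kc)            # ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]   # keep the largest
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
    # Sample scores: unit eigenvectors scaled by sqrt(eigenvalue).
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals
```

Plotting the first two columns of `scores` gives a low-dimensional display of the samples; representing the original variables (genes) in that same display is the additional biplot step the abstract refers to, which this sketch does not attempt.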
Blind deconvolution with principal components analysis for wide-field and small-aperture telescopes
Jia, Peng; Sun, Rongyu; Wang, Weinan; Cai, Dongmei; Liu, Huigen
2017-09-01
Telescopes with a wide field of view (greater than 1°) and small apertures (less than 2 m) are workhorses for observations such as sky surveys and fast-moving object detection, and play an important role in time-domain astronomy. However, images captured by these telescopes are contaminated by optical system aberrations, atmospheric turbulence, tracking errors and wind shear. To increase the quality of images and maximize their scientific output, we propose a new blind deconvolution algorithm based on statistical properties of the point spread functions (PSFs) of these telescopes. In this new algorithm, we first construct the PSF feature space through principal component analysis, and then classify PSFs from a different position and time using a self-organizing map. According to the classification results, we divide images of the same PSF types and select these PSFs to construct a prior PSF. The prior PSF is then used to restore these images. To investigate the improvement that this algorithm provides for data reduction, we process images of space debris captured by our small-aperture wide-field telescopes. Comparing the reduced results of the original images and the images processed with the standard Richardson-Lucy method, our method shows a promising improvement in astrometry accuracy.
Principal Component Analysis of computed emission lines from proto-stellar jets
Cerqueira, A H; De Colle, F; Vasconcelos, M J
2015-01-01
A very important issue concerning protostellar jets is the mechanism behind their formation. Obtaining information on the region at the base of a jet can shed light on the subject, and some years ago this was done through a search for a rotational signature in the jet line spectrum. The existence of such signatures, however, remains controversial. In order to contribute to the clarification of this issue, in this paper we show that Principal Component Analysis (PCA) can potentially help to distinguish between rotation and precession effects in protostellar jet images. We apply PCA to synthetic spectro-imaging datacubes generated as an output of numerical simulations of protostellar jets. In this way we generate a benchmark against which PCA diagnostics of real observations can be compared. Using the computed emission line profiles for [O I]6300A and [S II]6716A, we recover and analyze the effects of rotation and precession in tomograms generated by PCA. We show that different combinations of the ...
State and group dynamics of world stock market by principal component analysis
Nobi, Ashadun; Lee, Jae Woo
2016-05-01
We study the dynamic interactions and structural changes of global financial indices in the years 1998-2012 by applying a principal component analysis (PCA) to their cross-correlation coefficients. The variances explained by the first PC increase with time and show a drastic change during the crisis. A sharp change in PC coefficient implies a transition of market state, a situation which occurs frequently in the American and Asian indices. However, the European indices remain stable over time. Using the first two PC coefficients, we identify indices that are similar and more strongly correlated than the others. We observe that the European indices form a robust group over the observation period. The dynamics of the individual indices within the group increase in similarity with time, and the dynamics of indices are more similar during the crises. Furthermore, the group formation of indices changes position in two-dimensional spaces due to crises. Finally, after a financial crisis, the difference of PCs between the European and American indices narrows.
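The quantity tracked in this abstract, the share of variance explained by the first PC of a cross-correlation matrix, can be sketched as follows. The data are synthetic and the single shared "market" factor, the factor strengths and the function name `first_pc_share` are assumptions for illustration, not the authors' data or code.

```python
import numpy as np

def first_pc_share(returns):
    """Fraction of total variance carried by the first PC of the correlation matrix."""
    corr = np.corrcoef(returns, rowvar=False)   # indices are stored as columns
    eigvals = np.linalg.eigvalsh(corr)          # eigenvalues in ascending order
    return eigvals[-1] / eigvals.sum()

rng = np.random.default_rng(1)
n_days, n_indices = 250, 8
common = rng.normal(size=(n_days, 1))           # synthetic shared factor
noise = rng.normal(size=(n_days, n_indices))    # index-specific fluctuations

calm = 0.3 * common + noise     # weak co-movement between indices
crisis = 2.0 * common + noise   # strong co-movement between indices

share_calm = first_pc_share(calm)
share_crisis = first_pc_share(crisis)
```

Under these assumptions `share_crisis` comes out well above `share_calm`, which is the kind of jump in first-PC variance the abstract associates with crisis periods.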
Aggregate eco-efficiency indices for New Zealand--a principal components analysis.
Jollands, Nigel; Lermit, Jonathan; Patterson, Murray
2004-12-01
Eco-efficiency has emerged as a management response to waste issues associated with current production processes. Despite the popularity of the term in both business and government circles, limited attention has been paid to measuring and reporting eco-efficiency to government policy makers. Aggregate measures of eco-efficiency are needed, to complement existing measures and to help highlight important patterns in eco-efficiency data. This paper aims to develop aggregate measures of eco-efficiency for use by policy makers. Specifically, this paper provides a unique analysis by applying principal components analysis (PCA) to eco-efficiency indicators in New Zealand. The study reveals that New Zealand's overall eco-efficiency improved for two out of the five aggregate measures over the period 1994/1995-1997/1998. The worsening of the other aggregate measures reflects, among other things, the relatively poor performance of the primary production and related processing sectors. These results show PCA is an effective approach for aggregating eco-efficiency indicators and assisting decision makers by reducing redundancy in an eco-efficiency indicators matrix.
Foong, Shaohui; Sun, Zhenglong
2016-08-12
In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.
Lost-in-Space Star Identification Using Planar Triangle Principal Component Analysis Algorithm
Directory of Open Access Journals (Sweden)
Fuqiang Zhou
2015-01-01
It is a challenging task for a star sensor to implement star identification and determine the attitude of a spacecraft in the lost-in-space mode. Several algorithms based on the triangle method have been proposed for star identification in this mode. However, these methods are time-consuming and require a large guide star catalog memory, so their star identification performance needs improvement. To address these problems, a star identification algorithm using planar triangle principal component analysis is presented here. A star pattern is generated based on the planar triangle created by stars within the field of view of a star sensor and the projection of the triangle. Since a projection can determine an index for a unique triangle in the catalog, the adoption of the k-vector range search technique makes this algorithm very fast. In addition, a sharing star validation method is constructed to verify the identification results. Simulation results show that the proposed algorithm is more robust than the planar triangle and P-vector algorithms under the same conditions.
Directory of Open Access Journals (Sweden)
Dongdong Song
2015-01-01
To predict the service life of polystyrene (PS) under an aggressive environment, the nondimensional expression Z was established from a data set of multiple properties of PS by principal component analysis (PCA). In this study, PS specimens were exposed to the tropical environment on Xisha Islands in China for two years. Chromatic aberration, gloss, tensile strength, elongation at break, flexural strength, and impact strength were tested to evaluate the aging behavior of PS. Based on different needs of industries, each of the multiple properties could be used to evaluate the service life of PS. However, selecting a single performance variation will inevitably hide some information about the entire aging process. Therefore, finding a comprehensive measure representing the overall aging performance of PS can be highly significant. Herein, PCA was applied to obtain a specific property (Z) which can represent all properties of PS. Z of PS degradation showed a slight decrease for the initial two months of exposure, after which it increased rapidly in the next eight months. Subsequently, a slower increase of the Z value was observed. From the three different stages of the Z value, three stages have been identified for PS service life.
Principal Components of Thermography analyses of the Silk Tomb, Petra (Jordan)
Gomez-Heras, Miguel; Alvarez de Buergo, Monica; Fort, Rafael
2015-04-01
This communication presents the results of an active thermography survey of the Silk Tomb, which belongs to the Royal Tombs compound in the archaeological city of Petra in Jordan. The Silk Tomb is carved in the variegated Palaeozoic Umm Ishrin sandstone and it is heavily backweathered due to surface runoff from the top of the cliff where it is carved. Moreover, the name "Silk Tomb" was given because of the colourful display of the variegated sandstone due to backweathering. A series of infrared images were taken as the façade was heated by sunlight to perform a Principal Component of Thermography analyses with IR view 1.7.5 software. This was related to indirect moisture measurements (percentage of Wood Moisture Equivalent) taken across the façade, by means of a Protimeter portable moisture meter. Results show how moisture retention is deeply controlled by lithological differences across the façade. Research funded by Geomateriales 2 S2013/MIT-2914 and CEI Moncloa (UPM, UCM, CSIC) through a PICATA contract and the equipment from RedLAbPAt Network
SR-FTIR Coupled with Principal Component Analysis Shows Evidence for the Cellular Bystander Effect.
Lipiec, E; Bambery, K R; Lekki, J; Tobin, M J; Vogel, C; Whelan, D R; Wood, B R; Kwiatek, W M
2015-07-01
Synchrotron radiation-Fourier transform infrared (SR-FTIR) microscopy coupled with multivariate data analysis was used as an independent modality to monitor the cellular bystander effect. Single, living prostate cancer PC-3 cells were irradiated with various numbers of protons, ranging from 50-2,000, with an energy of either 1 or 2 MeV using a proton microprobe. SR-FTIR spectra of cells, fixed after exposure to protons and nonirradiated neighboring cells (bystander cells), were recorded. Spectral differences were observed in both the directly targeted and bystander cells and included changes in the DNA backbone and nucleic bases, along with changes in the protein secondary structure. Principal component analysis (PCA) was used to investigate the variance in the entire data set. The percentage of bystander cells relative to the applied number of protons with two different energies was calculated. Of all the applied quantities, the dose of 400 protons at 2 MeV was found to be the most effective for causing significant macromolecular perturbation in bystander PC-3 cells.
Model reduction of cavity nonlinear optics for photonic logic: a quasi-principal components approach
Shi, Zhan; Nurdin, Hendra I.
2016-11-01
Kerr nonlinear cavities displaying optical thresholding have been proposed for the realization of ultra-low power photonic logic gates. In the ultra-low photon number regime, corresponding to energy levels in the attojoule scale, quantum input-output models become important to study the effect of unavoidable quantum fluctuations on the performance of such logic gates. However, being a quantum anharmonic oscillator, a Kerr-cavity has an infinite dimensional Hilbert space spanned by the Fock states of the oscillator. This poses a challenge to simulate and analyze photonic logic gates and circuits composed of multiple Kerr nonlinearities. For simulation, the Hilbert space of the oscillator is typically truncated to the span of only a finite number of Fock states. This paper develops a quasi-principal components approach to identify important subspaces of a Kerr-cavity Hilbert space and exploits it to construct an approximate reduced model of the Kerr-cavity on a smaller Hilbert space. Using this approach, we find a reduced dimension model with a Hilbert space dimension of 15 that can closely match the magnitudes of the mean transmitted and reflected output fields of a conventional truncated Fock state model of dimension 75, when driven by an input coherent field that switches between two levels. For the same input, the reduced model also closely matches the magnitudes of the mean output fields of Kerr-cavity-based AND and NOT gates and a NAND latch obtained from simulation of the full 75 dimension model.
Nogueira, Grazielle V.; Silveira, Landulfo, Jr.; Martin, Airton A.; Zangaro, Renato A.; Pacheco, Marcos T.; Chavantes, Maria C.; Zampieri, Marcelo; Pasqualucci, Carlos A. G.
2004-07-01
FT-Raman Spectroscopy (FT-Raman) could allow identification and evaluation of human atherosclerotic lesions. A Raman spectrum can provide biochemical information on arteries, which can help in identifying the disease status and evolution. This study presents the results of FT-Raman for identification of human carotid arteries in vitro. Fragments of human carotid arteries were analyzed using an FT-Raman spectrometer with a Nd:YAG laser at 1064 nm operating at an excitation power of 300 mW. Spectra were obtained with 250 scans and a spectral resolution of 4 cm⁻¹. Each collection time was approximately 8 min. A total of 75 carotid fragments were spectroscopically scanned and FT-Raman results were compared with histopathology. Principal Components Analysis (PCA) was used to model an algorithm for tissue classification into three categories: normal, atherosclerotic plaque without calcification and atherosclerotic plaque with calcification. Non-atherosclerotic (normal) artery, atherosclerotic plaque and calcified plaque exhibit different spectral signatures related to the biochemicals present in each tissue type, such as bands of collagen and elastin (proteins), cholesterol and its esters, and calcium hydroxyapatite and carbonate apatite, respectively. Results show that there is a 96% match between classifications based on the PCA algorithm and histopathology. The diagnostic applied over all 75 samples had sensitivity and specificity of about 89% and 100%, respectively, for atherosclerotic plaque and 100% and 98% for calcified plaque.
Directory of Open Access Journals (Sweden)
Ida Vajčnerová
2016-01-01
The objective of the paper is to explore possibilities of evaluating the quality of a tourist destination by means of principal components analysis (PCA) and cluster analysis. In the paper both types of analysis are compared on the basis of the results they provide. The aim is to identify the advantages and limits of both methods and to provide methodological suggestions for their further use in tourism research. The analysis is based on primary data from a survey of customers' satisfaction with the key quality factors of a destination. The output of the two statistical methods is the creation of groups or clusters of quality factors that are similar in terms of respondents' evaluations, in order to facilitate the evaluation of the quality of tourist destinations. The results show that both tested methods can be used. The paper was elaborated in the frame of a wider research project aimed at developing a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.
Ciucci, Sara; Ge, Yan; Durán, Claudio; Palladini, Alessandra; Jiménez-Jiménez, Víctor; Martínez-Sánchez, Luisa María; Wang, Yuting; Sales, Susanne; Shevchenko, Andrej; Poser, Steven W.; Herbig, Maik; Otto, Oliver; Androutsellis-Theotokis, Andreas; Guck, Jochen; Gerl, Mathias J.; Cannistraci, Carlo Vittorio
2017-01-01
Omic science is rapidly growing and one of the most employed techniques to explore differential patterns in omic datasets is principal component analysis (PCA). However, a method to highlight the network of omic features that contribute most to the sample separation obtained by PCA is missing. An alternative is to build correlation networks between univariately-selected significant omic features, but this neglects the multivariate unsupervised feature compression responsible for the PCA sample segregation. Biologists and medical researchers often prefer effective methods that offer an immediate interpretation to complicated algorithms that in principle promise an improvement but in practice are difficult to apply and interpret. Here we present PC-corr: a simple algorithm that associates with any PCA segregation a discriminative network of features. Such a network can be inspected in search of functional modules useful in the definition of combinatorial and multiscale biomarkers from multifaceted omic data in systems and precision biomedicine. We offer proofs of PC-corr efficacy on lipidomic, metagenomic, developmental genomic, population genetic, cancer promoteromic and cancer stem-cell mechanomic data. Finally, PC-corr is a general functional network inference approach that can be easily adopted for big data exploration in computer science and analysis of complex systems in physics. PMID:28287094
Warren, J. S.; Hughes, J. P.; Badenes, C.
2005-12-01
We present results from a Principal Component Analysis (PCA) of Tycho's supernova remnant (SNR). PCA is a statistical technique we implemented to characterize X-ray spectra extracted from distinct spatial regions across the entire image of the remnant. We used the PCA to determine the location of the contact discontinuity (CD) in Tycho, which marks the boundary between shocked ejecta and shocked interstellar material, and found an azimuthal-angle-averaged radius of 241". For the average radius of the outer blast wave (BW) we found 251". Taking account of projection effects, the ratio of BW:CD is 1:0.93, which is inconsistent with adiabatic hydrodynamic models of SNR evolution. The BW:CD ratio can be explained if cosmic ray acceleration of ions is occurring at the forward shock. Such a scenario is further supported by evidence from the morphology and spectral nature of the BW emission for the acceleration of cosmic ray electrons. We also present PCA results regarding the ranges in Si and Fe composition in Tycho, and a newly uncovered spectral variation in the form of a low energy excess that has not been previously noted.
Spitzer spectral line mapping of supernova remnants: I. Basic data and principal component analysis
Neufeld, David A; Kaufman, Michael J; Snell, Ronald L; Melnick, Gary J; Bergin, Edwin A; Sonnentrucker, Paule
2007-01-01
We report the results of spectroscopic mapping observations carried out toward small (1 x 1 arcmin) regions within the supernova remnants W44, W28, IC443, and 3C391 using the Infrared Spectrograph of the Spitzer Space Telescope. These observations, covering the 5.2 - 37 micron spectral region, have led to the detection of a total of 15 fine structure transitions of Ne+, Ne++, Si+, P+, S, S++, Cl+, Fe+, and Fe++; the S(0) - S(7) pure rotational lines of molecular hydrogen; and the R(3) and R(4) transitions of hydrogen deuteride. In addition to these 25 spectral lines, the 6.2, 7.7, 8.6, 11.3 and 12.6 micron PAH emission bands were also observed. Most of the detected line transitions have proven strong enough to map in several sources, providing a comprehensive picture of the relative distribution of the various line emissions observable in the Spitzer/IRS bandpass. A principal component analysis of the spectral line maps reveals that the observed emission lines fall into five distinct groups, each of which may...
Schloemer, Sarah A; Thompson, Julie A; Silder, Amy; Thelen, Darryl G; Siston, Robert A
2017-03-01
Age-related increased hip extensor recruitment during gait is a proposed compensation strategy for reduced ankle power generation and may indicate a distal-to-proximal shift in muscle function with age. Extending beyond joint level analyses, identifying age-related changes at the muscle level could capture more closely the underlying mechanisms responsible for movement. The purpose of this study was to characterize and compare muscle forces and induced accelerations during gait in healthy older adults with those of young adults. Simulations of one gait cycle for ten older (73.9 ± 5.3 years) and six young (21.0 ± 2.1 years) adults walking at their self-selected speed were analyzed. Muscle force and induced acceleration waveforms, along with kinematic, kinetic, and muscle activation waveforms, were compared between age-groups using principal component analysis. Simulations of healthy older adults had greater gluteus maximus force and vertical support contribution, but smaller iliacus force, psoas force, and psoas vertical support contribution. There were no age-group differences in distal muscle force, contribution, or ankle torque magnitudes. Later peak dorsiflexion and peak ankle angular velocity in older adults may have contributed to their greater ankle power absorption during stance. These findings reveal the complex interplay between age-related changes in neuromuscular control, kinematics, and muscle function during gait.
Xu, Shanzhi; Wang, Peng; Dong, Yonggui
2016-04-22
In order to measure the impedance variation process in electrolyte solutions, a method of triangular waveform voltage excitation is investigated together with principal component analysis (PCA). Using triangular waveform voltage as the excitation signal, the response current during one duty cycle is sampled to construct a measurement vector. The measurement matrix is then constructed by the measurement vectors obtained from different measurements. After being processed by PCA, the changing information of solution impedance is contained in the loading vectors while the response current and noise information is contained in the score vectors. The measurement results of impedance variation by the proposed signal processing method are independent of the equivalent impedance model. The noise-induced problems encountered during equivalent impedance calculation are therefore avoided, and the real-time variation information of noise in the electrode-electrolyte interface can be extracted at the same time. Planar-interdigitated electrodes are experimentally tested for monitoring the KCl concentration variation process. Experimental results indicate that the measured impedance variation curve reflects the changing process of solution conductivity, and the amplitude distribution of the noise during one duty cycle can be utilized to analyze the contact conditions of the electrode and electrolyte interface.
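The split between loading vectors and score vectors described in this abstract can be sketched with an SVD. The layout below (one column per measurement, one row per sample point in the duty cycle), the triangular `waveform`, the drifting `gain` and the use of uncentred PCA are assumptions chosen for illustration, not details confirmed by the source.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_measurements = 200, 30

# Synthetic response current over one duty cycle (hypothetical waveform shape).
t = np.linspace(0.0, 1.0, n_samples)
waveform = np.where(t < 0.5, t, 1.0 - t)

# Hypothetical slowly drifting gain standing in for the changing conductivity.
gain = np.linspace(1.0, 2.0, n_measurements)

# Measurement matrix: each column is one measurement vector, plus small noise.
M = np.outer(waveform, gain) + 0.01 * rng.normal(size=(n_samples, n_measurements))

# Uncentred PCA through the SVD: M = U S Vt.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
scores = U * S          # one row per sample point in the duty cycle
loadings = Vt.T         # one row per measurement

# Singular-vector signs are arbitrary; flip so the first loading is positive.
sign = np.sign(loadings[:, 0].sum())
first_loading = sign * loadings[:, 0]
first_score = sign * scores[:, 0]
```

With this layout `first_loading` follows the drift across measurements while `first_score` reproduces the current waveform within one cycle, mirroring the separation the abstract describes.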
Clouthier, Allison L; Bohm, Eric R; Rudan, John F; Shay, Barbara L; Rainbow, Michael J; Deluzio, Kevin J
2017-01-01
Multicentre studies are rare in three dimensional motion analyses due to challenges associated with combining waveform data from different centres. Principal component analysis (PCA) is a statistical technique that can be used to quantify variability in waveform data and identify group differences. A correction technique based on PCA is proposed that can be used in post processing to remove nuisance variation introduced by the differences between centres. Using this technique, the waveform bias that exists between the two datasets is corrected such that the means agree. No information is lost in the individual datasets, but the overall variability in the combined data is reduced. The correction is demonstrated on gait kinematics with synthesized crosstalk and on gait data from knee arthroplasty patients collected in two centres. The induced crosstalk was successfully removed from the knee joint angle data. In the second example, the removal of the nuisance variation due to the multicentre data collection allowed significant differences in implant type to be identified. This PCA-based technique can be used to correct for differences between waveform datasets in post processing and has the potential to enable multicentre motion analysis studies.
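One way such a mean-matching correction could work is sketched below: PCs are computed on the pooled waveforms, and each centre's mean offset along the retained PCs is subtracted. This is an assumed reading of the abstract rather than the published procedure; the function name `align_centre_means`, the number of retained components and the synthetic waveforms are hypothetical.

```python
import numpy as np

def align_centre_means(waves_a, waves_b, n_components=3):
    """Remove between-centre mean offsets along the leading pooled PCs."""
    pooled = np.vstack([waves_a, waves_b])
    grand_mean = pooled.mean(axis=0)

    # Principal components of the pooled, mean-centred waveforms.
    _, _, Vt = np.linalg.svd(pooled - grand_mean, full_matrices=False)
    pcs = Vt[:n_components]                     # shape (k, n_points), orthonormal rows

    corrected = []
    for waves in (waves_a, waves_b):
        scores = (waves - grand_mean) @ pcs.T
        offset = scores.mean(axis=0)            # centre-specific bias in PC space
        corrected.append(waves - offset @ pcs)  # constant shift per centre
    return corrected

# Synthetic example: the same underlying waveform recorded at two centres.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 101)
base = 60.0 * np.sin(np.pi * t)
bias = 5.0 * np.sin(2.0 * np.pi * t)            # hypothetical crosstalk-like bias

def simulate(n_subjects, extra):
    amplitude = 1.0 + 0.1 * rng.normal(size=(n_subjects, 1))
    return amplitude * base + extra + 0.5 * rng.normal(size=(n_subjects, t.size))

waves_a = simulate(20, 0.0)
waves_b = simulate(20, bias)
corr_a, corr_b = align_centre_means(waves_a, waves_b)

gap_before = np.linalg.norm(waves_a.mean(axis=0) - waves_b.mean(axis=0))
gap_after = np.linalg.norm(corr_a.mean(axis=0) - corr_b.mean(axis=0))
```

Because each centre is shifted by a single constant waveform, the within-centre variation is left unchanged, which is consistent with the statement that no information is lost in the individual datasets.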
Desdouits, Nathan; Nilges, Michael; Blondel, Arnaud
2015-02-01
Protein conformation has been recognized as the key feature determining biological function, as it determines the position of the essential groups specifically interacting with substrates. Hence, the shape of the cavities or grooves at the protein surface appears to drive those functions. However, only a few studies describe the geometrical evolution of protein cavities during molecular dynamics simulations (MD), usually with a crude representation. To unveil the dynamics of cavity geometry evolution, we developed an approach combining cavity detection and Principal Component Analysis (PCA). This approach was applied to four systems subjected to MD (lysozyme, sperm whale myoglobin, Dengue envelope protein and EF-CaM complex). PCA on cavities allows us to perform efficient analysis and classification of the geometry diversity explored by a cavity. Additionally, it reveals correlations between the evolutions of the cavities and structures, and can even suggest how to modify the protein conformation to induce a given cavity geometry. It also helps to perform fast and consensual clustering of conformations according to cavity geometry. Finally, using this approach, we show that both carbon monoxide (CO) location and transfer among the different xenon sites of myoglobin are correlated with few cavity evolution modes of high amplitude. This correlation illustrates the link between ligand diffusion and the dynamic network of internal cavities.
Joint cluster and non-negative least squares analysis for aerosol mass spectrum data
Energy Technology Data Exchange (ETDEWEB)
Zhang, T; Zhu, W [Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794-3600 (United States); McGraw, R [Environmental Sciences Department, Brookhaven National Laboratory, Upton, NY 11973-5000 (United States)], E-mail: zhu@ams.sunysb.edu
2008-07-15
Aerosol mass spectrum (AMS) data contain hundreds of mass to charge ratios and their corresponding intensities from air collected through the mass spectrometer. The observations are usually taken sequentially in time to monitor the air composition, quality and temporal change in an area of interest. An important goal of AMS data analysis is to reduce the dimensionality of the original data yielding a small set of representing tracers for various atmospheric and climatic models. In this work, we present an approach to jointly apply the cluster analysis and the non-negative least squares method towards this goal. Application to a relevant study demonstrates the effectiveness of this new approach. Comparisons are made to other relevant multivariate statistical techniques including the principal component analysis and the positive matrix factorization method, and guidelines are provided.
Energy Technology Data Exchange (ETDEWEB)
Egan, William; Morgan, Stephen L. [Department of Chemistry and Biochemistry, The University of South Carolina, Columbia, South Carolina 29208 (United States)]; Brewer, William E. [Toxicology Department, South Carolina Law Enforcement Division, 4416 Broad River Road, Columbia, South Carolina 29210 (United States)]
1999-02-01
The forensic determination of carboxyhemoglobin (COHb) in blood was performed by using an improved principal component regression (PCR) technique applied to UV-visible spectra. Calibration data were decomposed into principal components, and the principal components useful for prediction were selected by their correlation with calibration spectra. Cross-validation of prediction results was done by leverage-corrected residuals. Confidence and prediction intervals derived from classical regression theory were found to be reasonable in size. The results compared favorably to a comparison study conducted by using a CO Oximeter method. In analysis of forensic case study samples, the improved PCR method allowed detection of abnormal samples and successfully predicted percentages of COHb and methemoglobin (MetHb), and provided error estimates for those predictions. © 1999 Society for Applied Spectroscopy
Adams, Jamie C; Mellor, Matthew; Joyce, Malcolm J
2011-10-01
A method to determine the depth of buried localized radioactive contamination nonintrusively and nondestructively using principal component analysis is described. The γ-ray spectra from two radionuclides, cesium-137 and cobalt-60, have been analyzed to derive the two principal components that change most significantly as a result of varying the depth of the sources in a bespoke sand-filled phantom. The relationship between depth (d) and the angle (θ) between the first two principal component coefficients has been derived for both cases, viz. d(φ) = x + y ln φ, where x and y are constants dependent on the shielding material and the γ-ray energy spectrum of the radioactivity in question, and φ is a function of θ. The technique enables the depth of a localized radioactive source to be determined nonintrusively in the range 5 to 50 mm with an accuracy of ±1 mm.
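The logarithmic depth relation quoted in this abstract is linear in the logarithm of its argument, so the two constants can be estimated by ordinary least squares. The sketch below uses synthetic calibration points; the values `x_true` and `y_true`, the range of `phi` and the noise level are hypothetical and do not come from the paper.

```python
import numpy as np

# Hypothetical calibration constants (depth in mm); not taken from the source.
x_true, y_true = 2.0, 12.0

rng = np.random.default_rng(4)
phi = np.linspace(1.5, 50.0, 12)                 # synthetic values of the angle-derived quantity
depth = x_true + y_true * np.log(phi) + 0.05 * rng.normal(size=phi.size)

# d = x + y * ln(phi) is a straight line in ln(phi); polyfit returns [slope, intercept].
y_fit, x_fit = np.polyfit(np.log(phi), depth, 1)

def depth_from_phi(value):
    """Estimate depth from the fitted logarithmic relation."""
    return x_fit + y_fit * np.log(value)

estimated = depth_from_phi(10.0)
```

Once `x_fit` and `y_fit` are known for a given shielding material and radionuclide, `depth_from_phi` turns a measured value into a depth estimate.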
Institute of Scientific and Technical Information of China (English)
GUO Zhongyang; DAI Xiaoyan; LI Xiaodong; YE Shufeng
2013-01-01
To reduce typhoon-caused damages, numerical and empirical methods are often used to forecast typhoon storm surge. However, typhoon surge is a complex nonlinear process that is difficult to forecast accurately. We applied a principal component back-propagation neural network (PCBPNN) to predict the deviation in typhoon storm surge, in which data of the typhoon, upstream flood, and historical case studies were involved. With principal component analysis, 15 input factors were reduced to five principal components, and the application of the model was improved. Observation data from Huangpu Park in Shanghai, China were used to test the feasibility of the model. The results indicate that the model is capable of predicting a 12-hour warning before a typhoon surge.
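The dimension-reduction step described here (15 input factors compressed to five principal components before the neural network) can be sketched as follows. The synthetic `factors` matrix, its five hidden drivers and the helper name `pca_reduce` are assumptions for illustration; the back-propagation network itself is not shown.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardise the columns of X and project onto the leading principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)          # variance share of each component
    return Xs @ Vt[:n_components].T, explained[:n_components]

# Synthetic stand-in for 15 correlated input factors driven by 5 hidden variables.
rng = np.random.default_rng(5)
latent = rng.normal(size=(300, 5))
mixing = rng.normal(size=(5, 15))
factors = latent @ mixing + 0.1 * rng.normal(size=(300, 15))

components, explained = pca_reduce(factors, n_components=5)
```

The five columns of `components` would then serve as the network inputs; `explained.sum()` indicates how much of the standardised variance survives the reduction.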
Directory of Open Access Journals (Sweden)
Grahić Jasmin
2013-01-01
In order to analyze morphological characteristics of locally cultivated common bean landraces from Bosnia and Herzegovina (B&H), thirteen quantitative and qualitative traits of 40 P. vulgaris accessions, collected from four geographical regions (Northwest B&H, Northeast B&H, Central B&H and Sarajevo) and maintained at the Gene bank of the Faculty of Agriculture and Food Sciences in Sarajevo, were examined. Principal component analysis (PCA) showed that the proportion of variance retained in the first two principal components was 54.35%. The first principal component had high contributing factor loadings from seed width, seed height and seed weight, whilst the second principal component had high contributing factor loadings from the analyzed traits seed per pod and pod length. PCA plot, based on the first two principal components, displayed a high level of variability among the analyzed material. The discriminant analysis of principal components (DAPC) created 3 discriminant functions (DF), whereby the first two discriminant functions accounted for 90.4% of the variance retained. Based on the retained DFs, DAPC provided group membership probabilities which showed that 70% of the accessions examined were correctly classified between the geographically defined groups. Based on the taxonomic distance, 40 common bean accessions analyzed in this study formed two major clusters, whereas two accessions Acc304 and Acc307 didn’t group in any of those. Acc360 and Acc362, as well as Acc324 and Acc371 displayed a high level of similarity and are probably the same landrace. The present diversity of Bosnia and Herzegovina’s common bean landraces could be useful in future breeding programs.
Soares, Denise Paschoal; de Castro, Marcelo Peduzzi; Mendes, Emilia Assunção; Machado, Leandro
2016-12-01
The alterations in gait pattern of people with transfemoral amputation leave them more susceptible to musculoskeletal injury. Principal component analysis is a method that reduces the amount of gait data and allows analyzing the entire waveform. To use the principal component analysis to compare the ground reaction force and center of pressure displacement waveforms obtained during gait between able-bodied subjects and both limbs of individuals with transfemoral amputation. This is a transversal study with a convenience sample. We used a force plate and pressure plate to record the anterior-posterior, medial-lateral and vertical ground reaction force, and anterior-posterior and medial-lateral center of pressure positions of 12 participants with transfemoral amputation and 20 able-bodied subjects during gait. The principal component analysis was performed to compare the gait waveforms between the participants with transfemoral amputation and the able-bodied individuals. The principal component analysis model explained between 74% and 93% of the data variance. In all ground reaction force and center of pressure waveforms relevant portions were identified; and always at least one principal component presented scores statistically different (p amputation compared to the able-bodied participants. Principal component analysis reduced the amount of data, allowed analyzing the whole waveform, and identified specific sub-phases of gait that were different between the groups. Therefore, this approach seems to be a powerful tool to be used in gait evaluation and following the rehabilitation status of people with transfemoral amputation. © The International Society for Prosthetics and Orthotics 2015.
Directory of Open Access Journals (Sweden)
Amin Babaei Falah
2015-04-01
In this paper, we try to determine stockholders’ preferred approach to financial ratios using a combination of principal component analysis and grey theory. Grey Principal Component Analysis (GPCA) handles poor information, reduces the dimensions of variables, and gives an appropriate score to each company. Here we employ GPCA to identify more appropriate strategies for normalizing data curves to reduce the discrepancy between the GPCA ranking and the return ranking, hence determining the approaches of stockholders of pharmaceutical firms listed on the Tehran Stock Exchange (TSE) regarding financial ratios.
Pittelkow, Yvonne; Wilson, Susan R
2005-06-01
This note is in response to Wouters et al. (2003, Biometrics 59, 1131-1139) who compared three methods for exploring gene expression data. Contrary to their summary that principal component analysis is not very informative, we show that it is possible to determine principal component analyses that are useful for exploratory analysis of microarray data. We also present another biplot representation, the GE-biplot (Gene Expression biplot), that is a useful method for exploring gene expression data with the major advantage of being able to aid interpretation of both the samples and the genes relative to each other.
Hearty, Aine P; Gibney, Michael J
2009-02-01
The aims of the present study were to examine and compare dietary patterns in adults using cluster and factor analyses and to examine the format of the dietary variables on the pattern solutions (i.e. expressed as grams/day (g/d) of each food group or as the percentage contribution to total energy intake). Food intake data were derived from the North/South Ireland Food Consumption Survey 1997-9, which was a randomised cross-sectional study of 7 d recorded food and nutrient intakes of a representative sample of 1379 Irish adults aged 18-64 years. Cluster analysis was performed using the k-means algorithm and principal component analysis (PCA) was used to extract dietary factors. Food data were reduced to thirty-three food groups. For cluster analysis, the most suitable format of the food-group variable was found to be the percentage contribution to energy intake, which produced six clusters: 'Traditional Irish'; 'Continental'; 'Unhealthy foods'; 'Light-meal foods & low-fat milk'; 'Healthy foods'; 'Wholemeal bread & desserts'. For PCA, food groups in the format of g/d were found to be the most suitable format, and this revealed four dietary patterns: 'Unhealthy foods & high alcohol'; 'Traditional Irish'; 'Healthy foods'; 'Sweet convenience foods & low alcohol'. In summary, cluster and PCA identified similar dietary patterns when presented with the same dataset. However, the two dietary pattern methods required a different format of the food-group variable, and the most appropriate format of the input variable should be considered in future studies.
Laclaustra, Martin; Frangi, Alejandro F; Garcia, Daniel; Boisrobert, Loïc; Frangi, Andres G; Pascual, Isaac
2007-03-01
Endothelial dysfunction is associated with cardiovascular diseases and their risk factors (CVRF), and flow-mediated dilation (FMD) is increasingly used to explore it. In this test, artery diameter changes after post-ischaemic hyperaemia are classically quantified using maximum peak vasodilation (FMDc). To obtain more detailed descriptors of FMD we applied principal component analysis (PCA) to diameter-time curves (absolute), vasodilation-time curves (relative) and blood-velocity-time curves. Furthermore, combined PCA of vessel size and blood-velocity curves allowed exploring links between flow and dilation. Vessel diameter data for PCA (post-ischaemic: 140 s) were acquired from brachial ultrasound image sequences of 173 healthy male subjects using a computerized technique previously reported by our team based on image registration (Frangi et al 2003 IEEE Trans. Med. Imaging 22 1458). PCA provides a set of axes (called eigenmodes) that captures the underlying variation present in a database of waveforms so that the first few eigenmodes retain most of the variation. These eigenmodes can be used to synthesize each waveform analysed by means of only a few parameters, as well as potentially any signal of the same type derived from tests of new patients. The eigenmodes obtained seemed related to visual features of the waveform of the FMD process. Subsequently, we used eigenmodes to parameterize our data. Most of the main parameters (13 out of 15) correlated with FMDc. Furthermore, not all parameters correlated with the same CVRF tested, that is, serum lipids (i.e., high LDL-c associated with slow vessel return to a baseline, while low HDL-c associated with a lower vasodilation in response to similar velocity stimulus), thus suggesting that this parameterization allows a more detailed and factored description of the process than FMDc.
Directory of Open Access Journals (Sweden)
Balloux François
2010-10-01
Background. The dramatic progress in sequencing technologies offers unprecedented prospects for deciphering the organization of natural populations in space and time. However, the size of the datasets generated also poses some daunting challenges. In particular, Bayesian clustering algorithms based on pre-defined population genetics models such as the STRUCTURE or BAPS software may not be able to cope with this unprecedented amount of data. Thus, there is a need for less computer-intensive approaches. Multivariate analyses seem particularly appealing as they are specifically devoted to extracting information from large datasets. Unfortunately, currently available multivariate methods still lack some essential features needed to study the genetic structure of natural populations. Results. We introduce the Discriminant Analysis of Principal Components (DAPC), a multivariate method designed to identify and describe clusters of genetically related individuals. When group priors are lacking, DAPC uses sequential K-means and model selection to infer genetic clusters. Our approach allows extracting rich information from genetic data, providing assignment of individuals to groups, a visual assessment of between-population differentiation, and contribution of individual alleles to population structuring. We evaluate the performance of our method using simulated data, which were also analyzed using STRUCTURE as a benchmark. Additionally, we illustrate the method by analyzing microsatellite polymorphism in worldwide human populations and hemagglutinin gene sequence variation in seasonal influenza. Conclusions. Analysis of simulated data revealed that our approach performs generally better than STRUCTURE at characterizing population subdivision. The tools implemented in DAPC for the identification of clusters and graphical representation of between-group structures make it possible to unravel complex population structures. Our approach is also faster than
Energy Technology Data Exchange (ETDEWEB)
Kimura, Y. [Positron Medical Center, Tokyo Metropolitan Institute of Gerontology, Naka, Itabashi, Tokyo (Japan); Senda, M. [Foundation for Biomedical Research and Innovation, 7F Chamber of Commerce, Minatojima-Nakamachi, Chuo, Kobe (Japan); Alpert, N.M. [Division of Nuclear Medicine, Massachusetts General Hospital, Boston, MA (United States)]. E-mail: alpert@pet.mgh.harvard.edu
2002-02-07
Formation of parametric images requires voxel-by-voxel estimation of rate constants, a process sensitive to noise and computationally demanding. A model-based clustering method for a two-parameter model (CAKS) was extended to the FDG three-parameter model. The concept was to average voxels with similar kinetic signatures to reduce noise. Voxel kinetics were categorized by the first two principal components of the tissue time-activity curves for all voxels. k2 and k3 were estimated cluster-by-cluster, and K1 was estimated voxel-by-voxel within clusters. When CAKS was applied to simulated images with noise levels similar to brain FDG scans, estimation bias was well suppressed, and estimation errors were substantially smaller (1.3 times for Ki and 1.5 times for k3) than those of conventional voxel-based estimation. The statistical reliability of voxel-level estimation by CAKS was comparable with ROI analysis including 100 voxels. CAKS was applied to clinical cases with Alzheimer's disease (ALZ) and cortico basal degeneration (CBD). In ALZ, the affected regions had low Ki (= K1 k3/(k2 + k3)) and k3. In CBD, Ki was low, but k3 was preserved. These results were consistent with ROI-based kinetic analysis. Because CAKS decreased the number of invoked estimations, the calculation time was reduced substantially. In conclusion, CAKS has been extended to allow parametric imaging of a three-compartment model. The method is computationally efficient, with low bias and excellent noise properties. (author)
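The macro-parameter used in this abstract combines the three rate constants as Ki = K1 k3/(k2 + k3). The short sketch below simply evaluates that expression; the function name and the numerical rate constants are hypothetical values chosen for illustration, not results from the study.

```python
def net_influx_constant(K1, k2, k3):
    """Return Ki = K1 * k3 / (k2 + k3) for the three-parameter FDG model."""
    if k2 + k3 <= 0:
        raise ValueError("k2 + k3 must be positive")
    return K1 * k3 / (k2 + k3)

# Illustrative (assumed) rate constants
ki_reference = net_influx_constant(K1=0.1, k2=0.15, k3=0.05)
# Halving k3 with K1 and k2 unchanged lowers Ki
ki_low_k3 = net_influx_constant(K1=0.1, k2=0.15, k3=0.025)
```

The second call shows one way Ki can fall (reduced k3); Ki can also be low while k3 is preserved if K1 is reduced or k2 is raised, which is the kind of distinction the abstract draws between the two diseases.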
Applications of gauge duality in robust principal component analysis and semidefinite programming
Ma, ShiQian; Yang, JunFeng
2016-08-01
Gauge duality theory was originated by Freund [Math. Programming, 38(1):47-67, 1987] and was recently further investigated by Friedlander, Macêdo and Pong [SIAM J. Optim., 24(4):1999-2022, 2014]. When solving some matrix optimization problems via gauge dual, one is usually able to avoid full matrix decompositions such as singular value and/or eigenvalue decompositions. In such an approach, a gauge dual problem is solved in the first stage, and then an optimal solution to the primal problem can be recovered from the dual optimal solution obtained in the first stage. Recently, this theory has been applied to a class of semidefinite programming (SDP) problems with promising numerical results [Friedlander and Macêdo, SIAM J. Sci. Comp., to appear, 2016]. In this paper, we establish some theoretical results on applying the gauge duality theory to robust principal component analysis (PCA) and general SDP. For each problem, we present its gauge dual problem, characterize the optimality conditions for the primal-dual gauge pair, and validate a way to recover a primal optimal solution from a dual one. These results are extensions of [Friedlander and Macêdo, SIAM J. Sci. Comp., to appear, 2016] from nuclear norm regularization to robust PCA and from a special class of SDP which requires the coefficient matrix in the linear objective to be positive definite to SDP problems without this restriction. Our results provide further understanding of the potential advantages and disadvantages of the gauge duality theory.
Krishnan, M.; Bhowmik, B.; Tiwari, A. K.; Hazra, B.
2017-08-01
In this paper, a novel baseline free approach for continuous online damage detection of multi degree of freedom vibrating structures using recursive principal component analysis (RPCA) in conjunction with online damage indicators is proposed. In this method, the acceleration data is used to obtain recursive proper orthogonal modes in online using the rank-one perturbation method, and subsequently utilized to detect the change in the dynamic behavior of the vibrating system from its pristine state to contiguous linear/nonlinear-states that indicate damage. The RPCA algorithm iterates the eigenvector and eigenvalue estimates for sample covariance matrices and new data point at each successive time instants, using the rank-one perturbation method. An online condition indicator (CI) based on the L2 norm of the error between actual response and the response projected using recursive eigenvector matrix updates over successive iterations is proposed. This eliminates the need for offline post processing and facilitates online damage detection especially when applied to streaming data. The proposed CI, named recursive residual error, is also adopted for simultaneous spatio-temporal damage detection. Numerical simulations performed on five-degree of freedom nonlinear system under white noise and El Centro excitations, with different levels of nonlinearity simulating the damage scenarios, demonstrate the robustness of the proposed algorithm. Successful results obtained from practical case studies involving experiments performed on a cantilever beam subjected to earthquake excitation, for full sensors and underdetermined cases; and data from recorded responses of the UCLA Factor building (full data and its subset) demonstrate the efficacy of the proposed methodology as an ideal candidate for real-time, reference free structural health monitoring.
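The idea of an online residual-error indicator can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' algorithm: the sample covariance receives a rank-one update at each time step (zero-mean responses are assumed), but the eigenvectors are then recomputed with a full `numpy.linalg.eigh` call instead of the rank-one perturbation eigen-update described in the abstract. The class name, channel count and test signal are hypothetical.

```python
import numpy as np

class RecursiveResidualMonitor:
    """Tracks a sample covariance online and reports a projection residual."""

    def __init__(self, n_channels, n_modes):
        self.cov = np.zeros((n_channels, n_channels))
        self.count = 0
        self.n_modes = n_modes

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.count += 1
        k = self.count
        # Rank-one update of the covariance (zero-mean data assumed)
        self.cov = ((k - 1) / k) * self.cov + np.outer(x, x) / k
        # eigh returns eigenvalues in ascending order, so the dominant
        # modes are the last n_modes columns of the eigenvector matrix
        _, vecs = np.linalg.eigh(self.cov)
        basis = vecs[:, -self.n_modes:]
        # L2 norm of the part of x not explained by the retained modes
        residual = x - basis @ (basis.T @ x)
        return float(np.linalg.norm(residual))

# Synthetic response confined to one direction, followed by one sample
# in an orthogonal direction standing in for a change in behaviour.
monitor = RecursiveResidualMonitor(n_channels=3, n_modes=1)
u = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
healthy = [monitor.update((1.0 + 0.1 * i) * u) for i in range(50)]
changed = monitor.update(np.array([0.0, 0.0, 1.0]))
```

In this toy case the residual stays near zero while the response follows the dominant mode and jumps when a sample leaves that subspace. Recomputing the full eigendecomposition at every step is what a true rank-one perturbation update would avoid.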
Shaffer, John R; Feingold, Eleanor; Wang, Xiaojing; Tcuenco, Karen T; Weeks, Daniel E; DeSensi, Rebecca S; Polk, Deborah E; Wendell, Steve; Weyant, Robert J; Crout, Richard; McNeil, Daniel W; Marazita, Mary L
2012-03-09
Dental caries is the result of a complex interplay among environmental, behavioral, and genetic factors, with distinct patterns of decay likely due to specific etiologies. Therefore, global measures of decay, such as the DMFS index, may not be optimal for identifying risk factors that manifest as specific decay patterns, especially if the risk factors such as genetic susceptibility loci have small individual effects. We used two methods to extract patterns of decay from surface-level caries data in order to generate novel phenotypes with which to explore the genetic regulation of caries. The 128 tooth surfaces of the permanent dentition were scored as carious or not by intra-oral examination for 1,068 participants aged 18 to 75 years from 664 biological families. Principal components analysis (PCA) and factor analysis (FA), two methods of identifying underlying patterns without a priori surface classifications, were applied to our data. The three strongest caries patterns identified by PCA recaptured variation represented by DMFS index (correlation, r = 0.97), pit and fissure surface caries (r = 0.95), and smooth surface caries (r = 0.89). However, together, these three patterns explained only 37% of the variability in the data, indicating that a priori caries measures are insufficient for fully quantifying caries variation. In comparison, the first pattern identified by FA was strongly correlated with pit and fissure surface caries (r = 0.81), but other identified patterns, including a second pattern representing caries of the maxillary incisors, were not representative of any previously defined caries indices. Some patterns identified by PCA and FA were heritable (h² = 30-65%, p = 0.043-0.006), whereas other patterns were not, indicating both genetic and non-genetic etiologies of individual decay patterns. This study demonstrates the use of decay patterns as novel phenotypes to assist in understanding the multifactorial nature of dental caries.
Foulks, Gary N.; Yappert, Marta C.; Milliner, Sarah E.
2012-01-01
Purpose. Nuclear magnetic resonance (NMR) spectroscopy has been used to quantify lipid wax, cholesterol ester terpenoid and glyceride composition, saturation, oxidation, and CH2 and CH3 moiety distribution. This tool was used to measure changes in human meibum composition with meibomian gland dysfunction (MGD). Methods. 1H-NMR spectra of meibum from 39 donors with meibomian gland dysfunction (Md) were compared to meibum from 33 normal donors (Mn). Results. Principal component analysis (PCA) was applied to the CH2/CH3 regions of a set of training NMR spectra of human meibum. PCA discriminated between Mn and Md with an accuracy of 86%. There was a bias toward more accurately predicting normal samples (92%) compared with predicting MGD samples (78%). When the NMR spectra of Md were compared with those of Mn, three statistically significant decreases were observed in the relative amounts of CH3 moieties at 1.26 ppm, the products of lipid oxidation above 7 ppm, and the ═CH moieties at 5.2 ppm associated with terpenoids. Conclusions. Loss of the terpenoids could be deleterious to meibum since they exhibit a plethora of mostly positive biological functions and could account for the lower level of cholesterol esters observed in Md compared with Mn. All three changes could account for the higher degree of lipid order of Md compared with age-matched Mn. In addition to the power of NMR spectroscopy to detect differences in the composition of meibum, it is promising that NMR can be used as a diagnostic tool. PMID:22131391
Directory of Open Access Journals (Sweden)
Seloame T. Nyaku
2016-04-01
U.S. cotton production is suffering from the yield loss caused by the reniform nematode (RN), Rotylenchulus reniformis. Management of this devastating pest is of utmost importance because no upland cotton cultivar exhibits adequate resistance to RN. Nine populations of RN from distinct regions in Alabama and one population from Mississippi were studied, and thirteen morphometric features were measured on 20 male and 20 female nematodes from each population. Highly correlated variables (positive) in female and male RN morphometric parameters were observed for body length (L) and distance of vulva from the lip region (V) (r = 0.7) and tail length (TL) and c′ (r = 0.8), respectively. The first and second principal components for the female and male populations showed distinct clustering into three groups. These results show a pattern of sub-groups within the RN populations in Alabama. A one-way ANOVA on female and male RN populations showed significant differences (p ≤ 0.05) among the variables. Multiple sequence alignment (MSA) of 18S rRNA sequences (421) showed lengths of 653 bp. Sites within the aligned sequences were conserved (53%), parsimony-informative (17%), singletons (28%), and indels (2%), respectively. Neighbor-Joining analysis showed intra- and inter-nematodal variations within the populations, as clone sequences from different nematodes, irrespective of the sex of the nematode isolate, clustered together. Morphologically, the three groups (I, II and III) could not be distinctly associated with the molecular data from the 18S rRNA sequences. The three groups may be identified as being non-geographically contiguous.
Directory of Open Access Journals (Sweden)
Shaffer John R
2012-03-01
Background. Dental caries is the result of a complex interplay among environmental, behavioral, and genetic factors, with distinct patterns of decay likely due to specific etiologies. Therefore, global measures of decay, such as the DMFS index, may not be optimal for identifying risk factors that manifest as specific decay patterns, especially if the risk factors such as genetic susceptibility loci have small individual effects. We used two methods to extract patterns of decay from surface-level caries data in order to generate novel phenotypes with which to explore the genetic regulation of caries. Methods. The 128 tooth surfaces of the permanent dentition were scored as carious or not by intra-oral examination for 1,068 participants aged 18 to 75 years from 664 biological families. Principal components analysis (PCA) and factor analysis (FA), two methods of identifying underlying patterns without a priori surface classifications, were applied to our data. Results. The three strongest caries patterns identified by PCA recaptured variation represented by DMFS index (correlation, r = 0.97), pit and fissure surface caries (r = 0.95), and smooth surface caries (r = 0.89). However, together, these three patterns explained only 37% of the variability in the data, indicating that a priori caries measures are insufficient for fully quantifying caries variation. In comparison, the first pattern identified by FA was strongly correlated with pit and fissure surface caries (r = 0.81), but other identified patterns, including a second pattern representing caries of the maxillary incisors, were not representative of any previously defined caries indices. Some patterns identified by PCA and FA were heritable (h² = 30-65%, p = 0.043-0.006), whereas other patterns were not, indicating both genetic and non-genetic etiologies of individual decay patterns. Conclusions. This study demonstrates the use of decay patterns as novel phenotypes to assist in understanding
Nomoto, Yohei; Yamashita, Kazuhiko; Ohya, Tetsuya; Koyama, Hironori; Kawasumi, Masashi
There is increasing concern in society about preventing falls among the aged. Improving aged people's lower-limb muscular strength, postural control and walking ability is important for quality of life and fall prevention. The aim of this study was to develop multiple evaluation methods in order to advise on the improvement and maintenance of lower limb function in aged and young people. The subjects were 16 healthy young volunteers (mean ± S.D: 19.9 ± 0.6 years) and 10 healthy aged volunteers (mean ± S.D: 80.6 ± 6.1 years). Measurement items related to lower limb function were selected from items we had used previously. The selected measurement items of lower limb function were distance of extroversion of the toe, angle of flexion of the toe, maximum width of step, knee elevation, moving distance of greater trochanter, walking balance, toe-gap force and rotation range of ankle joint. The measurement items were summarized by principal component analysis into lower limb ability evaluation methods comprising walking ability, muscle strength of lower limb and flexibility of ankle. The young group's assessment score was greater than that of the aged group by a factor of 1.6 for walking ability, 1.4 for muscle strength of lower limb, and 1.2 for flexibility of ankle. The results suggested that it was possible to assess the lower limb function of aged and young people numerically and to advise on their foot function.
A principal component analysis to interpret the spectral electrical behaviour of sediments
Inzoli, Silvia; Giudici, Mauro; Huisman, Johan Alexander
2015-04-01
Spectral Induced Polarization (SIP) measurements provide the opportunity to evaluate both conduction and polarization processes occurring in a porous medium. Conduction properties are related to the pore volume (for coarse grained materials) and also to the pore surface (for fine grained materials), whereas polarization properties are mainly controlled by the pore surface. Thus, SIP is a valuable survey method and its applicability ranges from aquifer characterization to organic and inorganic contaminant detection. However, the high number of factors affecting the spectral electrical behaviour still prevents an easy and unambiguous interpretation of SIP measurements. Controlled laboratory experiments by different research groups have shown that the resistivity phase depends on pore/grain size distribution, clay percentage, specific surface area, water saturation/conductivity and packing, among other factors. In the analysis of natural samples, all these variables are often simultaneously unknown and the direct application of the laboratory-derived empirical relationships between geophysical and sedimentological properties is not trivial. In this framework, we performed SIP laboratory measurements on unconsolidated alluvial samples of the Po river and Lambro river depositional units (Northern Italy). These samples were fully saturated with NaCl solutions with increasing electrical conductivity. SIP measurements were analysed using a Debye Decomposition technique and by fitting two Cole-Cole-type models (i.e. the Cole-Cole and the Generalized Cole-Cole). A principal component analysis was then applied separately on the three different subsets of model parameters. The main aims of this analysis were: i) to cluster the samples according to their spectral properties; ii) to evaluate differences and similarities of the fitting models in terms of the most significant combinations of parameters able to describe the overall variability within the dataset; iii) to analyse
Directory of Open Access Journals (Sweden)
Badaruddoza
2015-09-01
The current study aimed to determine significant cardiovascular risk factors through principal component factor analysis (PCFA) among 1827 individuals in three generations, including 911 males (378 from offspring, 439 from parental and 94 from grandparental generations) and 916 females (261 from offspring, 515 from parental and 140 from grandparental generations). The study performed PCFA with orthogonal rotation to reduce 12 inter-correlated variables into groups of independent factors. The factors have been identified as 2 for male grandparents, 3 for male offspring, female parents and female grandparents each, 4 for male parents and 5 for female offspring. This data reduction method identified these factors that explained 72%, 84%, 79%, 69%, 70% and 73% for male and female offspring, male and female parents and male and female grandparents respectively, of the variations in original quantitative traits. Factor 1, accounting for the largest portion of variations, was strongly loaded with factors related to obesity (body mass index (BMI), waist circumference (WC), waist to hip ratio (WHR), and thickness of skinfolds) among all generations with both sexes, which has been known to be an independent predictor for cardiovascular morbidity and mortality. The second largest components, factor 2 and factor 3, for almost all generations reflected traits of blood pressure phenotypes; however, in the male offspring generation, factor 2 was loaded with blood pressure phenotypes as well as obesity. This study not only confirmed but also extended prior work by developing a cumulative risk scale from factor scores. To date, such a cumulative and extensive scale has not been used in any Indian studies with individuals of three generations. These findings highlight the importance of a global approach for assessing the risk and the need for studies that elucidate how these different cardiovascular risk factors
Algorithms for Sparse Non-negative Tucker Decompositions
DEFF Research Database (Denmark)
Mørup, Morten; Hansen, Lars Kai
2008-01-01
There is an increasing interest in analysis of large scale multi-way data. The concept of multi-way data refers to arrays of data with more than two dimensions, i.e., taking the form of tensors. To analyze such data, decomposition techniques are widely used. The two most common decompositions...... decompositions). To reduce ambiguities of this type of decomposition we develop updates that can impose sparseness in any combination of modalities, hence, proposed algorithms for sparse non-negative Tucker decompositions (SN-TUCKER). We demonstrate how the proposed algorithms are superior to existing algorithms...... for Tucker decompositions when indeed the data and interactions can be considered non-negative. We further illustrate how sparse coding can help identify what model (PARAFAC or Tucker) is the most appropriate for the data as well as to select the number of components by turning off excess components...
Bellomarino, S A; Parker, R M; Conlan, X A; Barnett, N W; Adams, M J
2010-09-23
HPLC with acidic potassium permanganate chemiluminescence detection was employed to analyse 17 Cabernet Sauvignon wines across a range of vintages (1971-2003). Partial least squares regression analysis and principal components analysis were used in order to investigate the relationship between wine composition and vintage. Tartaric acid, vanillic acid, catechin, sinapic acid, ethyl gallate, myricetin, procyanidin B and resveratrol were found to be important components in terms of differences between the vintages.
Doosti, Elham; Shahlaei, Mohsen
2015-01-01
Quantitative relationships between structures of a set of p38 MAP kinase inhibitors and their activities were investigated by principal component regression (PCR) and principal component-artificial neural network (PC-ANN). Latent variables (called components) generated by the principal component analysis procedure were applied as the input of the developed quantitative structure-activity relationship (QSAR) models. A close study of the predictability of PCR and PC-ANN showed that the latter model has a much higher ability to calculate the biological activity of the investigated molecules. Also, experimental and estimated biological activities of compounds used in the model development step showed a good correlation. The results show that a non-linear model explaining the relationship between the pIC50s and the calculated principal components (extracted from structural descriptors of the studied molecules) is superior to the linear model. Some typical figures of merit for QSAR studies explaining the accuracy and predictability of the suggested models were calculated. Therefore, to design novel inhibitors of p38 MAP kinase with high potency and low undesired effects, the developed QSAR models were used to estimate the biological pIC50 of the studied compounds.
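Principal component regression, the linear model named in this abstract, can be sketched with NumPy as follows. This is a generic illustration assuming a descriptor matrix X (one row per compound) and an activity vector y; the function names and the synthetic data are hypothetical, and the sketch does not reproduce the descriptors, component selection or validation used in the study.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal component scores of X."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    # Ordinary least squares on the component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    # Map the coefficients back to the original descriptor space
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def pcr_predict(X, beta, intercept):
    return X @ beta + intercept

# Synthetic descriptors: the third column is the sum of the first two,
# so the matrix is collinear and only two components carry information.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 0.0])
X = np.column_stack([x1, x2, x1 + x2])
y = 2.0 * x1 - x2 + 1.0
beta, intercept = pcr_fit(X, y, n_components=2)
y_hat = pcr_predict(X, beta, intercept)
```

Regressing on component scores instead of raw descriptors sidesteps the collinearity that would make ordinary least squares on X ill-posed. A PC-ANN would feed the same scores into a neural network in place of the linear step.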
Watermarking Based on Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
王朔中
2000-01-01
A new watermarking scheme using principal component analysis (PCA) is described. The proposed method inserts highly robust watermarks into still images without degrading their visual quality. Experimental results are presented, showing that the PCA-based watermarks can resist malicious attacks including lowpass filtering, re-scaling, and compression coding.
Ceulemans, Eva; Kiers, Henk A.L.
2006-01-01
Several three-mode principal component models can be considered for the modelling of three-way, three-mode data, including the Candecomp/Parafac, Tucker3, Tucker2, and Tucker I models. The following question then may be raised: given a specific data set, which of these models should be selected, and
CSIR Research Space (South Africa)
Nel, W
2009-10-01
... to estimate the 3-D position of scatterers as a by-product of the analysis. The technique is based on principal component analysis of accurate scatterer range histories and is shown only in simulation. Future research should focus on practical application....
Buzanskas, M E; Savegnago, R P; Grossi, D A; Venturini, G C; Queiroz, S A; Silva, L O C; Júnior, R A A Torres; Munari, D P; Alencar, M M
2013-01-01
Phenotypic data from female Canchim beef cattle were used to obtain estimates of genetic parameters for reproduction and growth traits using a linear animal mixed model. In addition, relationships among animal estimated breeding values (EBVs) for these traits were explored using principal component analysis. The traits studied in female Canchim cattle were age at first calving (AFC), age at second calving (ASC), calving interval (CI), and bodyweight at 420 days of age (BW420). The heritability estimates for AFC, ASC, CI and BW420 were 0.03±0.01, 0.07±0.01, 0.06±0.02, and 0.24±0.02, respectively. The genetic correlations for AFC with ASC, AFC with CI, AFC with BW420, ASC with CI, ASC with BW420, and CI with BW420 were 0.87±0.07, 0.23±0.02, -0.15±0.01, 0.67±0.13, -0.07±0.13, and 0.02±0.14, respectively. Standardised EBVs for AFC, ASC and CI exhibited a high association with the first principal component, whereas the standardised EBV for BW420 was closely associated with the second principal component. The heritability estimates for AFC, ASC and CI suggest that these traits would respond slowly to selection. However, selection response could be enhanced by constructing selection indices based on the principal components.
Directory of Open Access Journals (Sweden)
Yunfeng Dong
2017-01-01
The weighted sum and genetic algorithm-based hybrid method (WSGA-based HM), which has been applied to multiobjective orbit optimizations, is negatively influenced by human factors through the artificial choice of the weight coefficients in the weighted sum method and by the slow convergence of the GA. To address these two problems, a cluster and principal component analysis-based optimization method (CPC-based OM) is proposed, in which many candidate orbits are gradually randomly generated until the optimal orbit is obtained using a data mining method, that is, cluster analysis based on principal components. Then, a second cluster analysis of the orbital elements is introduced into CPC-based OM to improve the convergence, yielding a novel double cluster and principal component analysis-based optimization method (DCPC-based OM). In DCPC-based OM, the cluster analysis based on principal components has the advantage of reducing the human influences, and the cluster analysis based on six orbital elements can reduce the search space to effectively accelerate convergence. The test results from a multiobjective numerical benchmark function and the orbit design results of an Earth observation satellite show that DCPC-based OM converges more efficiently than WSGA-based HM. DCPC-based OM also, to some degree, reduces the influence of human factors present in WSGA-based HM.
Directory of Open Access Journals (Sweden)
Shuai Sun
2014-06-01
Due to the scarcity of resources of Ziziphi spinosae semen (ZSS), many inferior goods and even adulterants are generally found in medicine markets. To strengthen the quality control, the HPLC fingerprint common pattern established in this paper showed three main bioactive compounds in one chromatogram simultaneously. Principal component analysis based on DAD signals could discriminate adulterants and inferior samples. Principal component analysis indicated that all samples could be mainly regrouped into two main clusters according to the first principal component (PC1, redefined as Vicenin II) and the second principal component (PC2, redefined as zizyphusine). PC1 and PC2 could explain 91.42% of the variance. The content of zizyphusine fluctuated more greatly than that of spinosin, and this result was also confirmed by the HPTLC result. Samples with low content of jujubosides and two common adulterants could not be used equivalently with authenticated ones in the clinic, while one reference standard extract could substitute for the crude drug in pharmaceutical production. Giving special consideration to the well-known bioactive saponins, which have a low response by end absorption, a fast and cheap HPTLC method for quality control of ZSS was developed, and the result obtained agreed well with that of HPLC analysis. Samples having fingerprints similar to the HPTLC common pattern targeting saponins could be regarded as authenticated ones. This work provided a faster and cheaper way for quality control of ZSS and laid a foundation for establishing a more effective quality control method for ZSS.
2010-01-01
NAMIC), funded by the National Institutes of Health through the NIH Roadmap for Medical Research, Grant U54 EB005149. REFERENCES [1] Ekman, P., [Emotion ... principal component analysis 1. INTRODUCTION The human face is a rich medium through which people communicate their emotions. Researchers have identified
Kernel Principal Component Analysis for dimensionality reduction in fMRI-based diagnosis of ADHD
Directory of Open Access Journals (Sweden)
Gagan S Sidhu
2012-11-01
This article explores various preprocessing tools that select/create features to help a learner produce a classifier that can use fMRI data to effectively discriminate Attention-Deficit Hyperactivity Disorder (ADHD) patients from healthy controls. We consider four different learning tasks: predicting either two (ADHD vs control) or three classes (ADHD-1 vs ADHD-3 vs control), where each uses either the imaging data only, or the phenotypic and imaging data. After averaging, BOLD-signal normalization, and masking of the fMRI images, we considered applying Fast Fourier Transform (FFT), possibly followed by some Principal Component Analysis (PCA) variant (over time: PCA-t; over space and time: PCA-st) or the kernelized variant, kPCA-st, to produce inputs to a learner, to determine which learned classifier performs the best – or at least better than the baseline of 64.2%, which is the proportion of the majority class (here, controls). In the two-class setting, PCA-t and PCA-st did not perform statistically better than baseline, whereas FFT and kPCA-st did (FFT, 68.4%; kPCA-st, 70.3%); when combined with the phenotypic data, which by itself produces 72.9% accuracy, all methods performed statistically better than the baseline, but none did better than using the phenotypic data. In the three-class setting, neither the PCA variants nor the phenotypic data classifiers performed statistically better than the baseline. We next used the FFT output as input to the PCA variants. In the two-class setting, the PCA variants performed statistically better than the baseline using either the FFTed waveforms only (FFT+PCA-t, 69.6%; FFT+PCA-st, 69.3%; FFT+kPCA-st, 68.7%), or combining them with the phenotypic data (FFT+PCA-t, 70.6%; FFT+PCA-st, 70.6%; kPCA-st, 76%). In both settings, combining FFT+kPCA-st's features with the phenotypic data was better than using only the phenotypic data, with the result in the two-class setting being statistically better.
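The FFT-then-kernel-PCA feature pipeline described above can be sketched roughly as follows. This is an assumption-laden illustration rather than the study's code: the RBF kernel, the bandwidth `gamma`, the random stand-in signals, and the helper names `fft_features` and `kernel_pca` are all hypothetical choices.

```python
import numpy as np

def fft_features(signals):
    """Magnitude spectrum of each row (one time series per row)."""
    return np.abs(np.fft.rfft(signals, axis=1))

def kernel_pca(X, n_components, gamma):
    """RBF kernel PCA; returns the projections of the training samples."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T     # squared distances
    K = np.exp(-gamma * np.maximum(d2, 0.0))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one         # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)                    # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0))       # scores per component

# Hypothetical stand-in for preprocessed BOLD time courses: 20 subjects, 64 time points.
rng = np.random.default_rng(0)
signals = rng.normal(size=(20, 64))
feats = fft_features(signals)                          # shape (20, 33)
Z = kernel_pca(feats, n_components=3, gamma=1e-3)      # shape (20, 3)
```

The rows of `Z` would then be passed, alone or concatenated with phenotypic variables, to whatever classifier is being evaluated.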
Richman, Michael B.; Gong, Xiaofeng
1999-06-01
When applying eigenanalysis, one decision analysts make is the determination of what magnitude an eigenvector coefficient (e.g., principal component (PC) loading) must achieve to be considered as physically important. Such coefficients can be displayed on maps or in a time series or tables to gain a fuller understanding of a large array of multivariate data. Previously, such a decision on what value of loading designates a useful signal (hereafter called the loading 'cutoff') for each eigenvector has been purely subjective. The importance of selecting such a cutoff is apparent since those loading elements in the range of zero to the cutoff are ignored in the interpretation and naming of PCs since only the absolute values of loadings greater than the cutoff are physically analyzed. This research sets out to objectify the problem of best identifying the cutoff by application of matching between known correlation/covariance structures and their corresponding eigenpatterns, as this cutoff point (known as the hyperplane width) is varied. A Monte Carlo framework is used to resample at five sample sizes. Fourteen different hyperplane cutoff widths are tested, bootstrap resampled 50 times to obtain stable results. The key findings are that the location of an optimal hyperplane cutoff width (one which maximized the information content match between the eigenvector and the parent dispersion matrix from which it was derived) is a well-behaved unimodal function. On an individual eigenvector, this enables the unique determination of a hyperplane cutoff value to be used to separate those loadings that best reflect the relationships from those that do not. The effects of sample size on the matching accuracy are dramatic as the values for all solutions (i.e., unrotated, rotated) rose steadily from 25 through 250 observations and then weakly thereafter. The specific matching coefficients are useful to assess the penalties incurred when one analyzes eigenvector coefficients of a
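The mechanics of applying a loading cutoff can be illustrated with a short sketch. It shows only the thresholding step, not the Monte Carlo matching procedure the abstract describes; the correlation matrix, the cutoff value of 0.4, and the function names are hypothetical.

```python
import numpy as np

def pc_loadings(R, n_components):
    """Loadings = eigenvectors of R scaled by the square root of their eigenvalues."""
    vals, vecs = np.linalg.eigh(R)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(vals[order])

def apply_cutoff(loadings, cutoff):
    """Zero every loading whose absolute value falls inside the hyperplane width."""
    return np.where(np.abs(loadings) >= cutoff, loadings, 0.0)

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
loadings = pc_loadings(R, n_components=2)
salient = apply_cutoff(loadings, cutoff=0.4)
```

Scanning `cutoff` over a grid and scoring each thresholded pattern against the known structure would be one way to look for the optimum the authors report.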
Improving Cross-Day EEG-Based Emotion Classification Using Robust Principal Component Analysis
Directory of Open Access Journals (Sweden)
Yuan-Pin Lin
2017-07-01
Constructing a robust emotion-aware analytical framework using non-invasively recorded electroencephalogram (EEG) signals has attracted intensive attention. However, when moving a laboratory-oriented proof-of-concept study toward real-world applications, researchers now face an ecological challenge: the EEG patterns recorded in real life substantially change across days (i.e., day-to-day variability), arguably making the pre-defined predictive model vulnerable to the given EEG signals of a separate day. The present work addressed how to mitigate the inter-day EEG variability of emotional responses with an attempt to facilitate cross-day emotion classification, which has received less attention in the literature. This study proposed a robust principal component analysis (RPCA)-based signal filtering strategy and validated its neurophysiological validity and machine-learning practicability on a binary emotion classification task (happiness vs. sadness) using a five-day EEG dataset of 12 subjects who participated in a music-listening task. The empirical results showed that the RPCA-decomposed sparse signals (RPCA-S) filtered out the background EEG activity that contributed more to the inter-day variability, and predominantly captured the EEG oscillations of emotional responses that behaved relatively consistently across days. Through applying a realistic add-day-in classification validation scheme, the RPCA-S progressively exploited more informative features (from 12.67 ± 5.99 to 20.83 ± 7.18) and improved the cross-day binary emotion-classification accuracy (from 58.31 ± 12.33% to 64.03 ± 8.40%) when trained on the EEG signals from one to four recording days and tested against one unseen subsequent day. The original EEG features (prior to RPCA processing) neither achieved the cross-day classification (the accuracy was around chance level) nor replicated the encouraging improvement due to the inter-day EEG variability. This result
Improving Cross-Day EEG-Based Emotion Classification Using Robust Principal Component Analysis.
Lin, Yuan-Pin; Jao, Ping-Keng; Yang, Yi-Hsuan
2017-01-01
Constructing a robust emotion-aware analytical framework using non-invasively recorded electroencephalogram (EEG) signals has attracted intensive attention. However, when moving a laboratory-oriented proof-of-concept study toward real-world applications, researchers now face an ecological challenge: the EEG patterns recorded in real life substantially change across days (i.e., day-to-day variability), arguably making the pre-defined predictive model vulnerable to the given EEG signals of a separate day. The present work addressed how to mitigate the inter-day EEG variability of emotional responses with an attempt to facilitate cross-day emotion classification, which has received less attention in the literature. This study proposed a robust principal component analysis (RPCA)-based signal filtering strategy and validated its neurophysiological validity and machine-learning practicability on a binary emotion classification task (happiness vs. sadness) using a five-day EEG dataset of 12 subjects who participated in a music-listening task. The empirical results showed that the RPCA-decomposed sparse signals (RPCA-S) filtered out the background EEG activity that contributed more to the inter-day variability, and predominantly captured the EEG oscillations of emotional responses that behaved relatively consistently across days. Through applying a realistic add-day-in classification validation scheme, the RPCA-S progressively exploited more informative features (from 12.67 ± 5.99 to 20.83 ± 7.18) and improved the cross-day binary emotion-classification accuracy (from 58.31 ± 12.33% to 64.03 ± 8.40%) when trained on the EEG signals from one to four recording days and tested against one unseen subsequent day. The original EEG features (prior to RPCA processing) neither achieved the cross-day classification (the accuracy was around chance level) nor replicated the encouraging improvement due to the inter-day EEG variability. This result demonstrated the
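The low-rank plus sparse split that RPCA performs can be sketched with the principal component pursuit formulation summarised in the Candes, Ma and Wright abstract earlier in this listing. The solver below is a generic augmented-Lagrangian iteration with commonly used default parameters, not the code used in the EEG study; the synthetic matrix and the names `rpca`, `shrink`, and `svd_threshold` are assumptions for illustration.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, max_iter=1000, tol=1e-7):
    """Split M into low-rank L and sparse S by minimising ||L||_* + lam * ||S||_1."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))            # weight on the sparse term
    mu = m * n / (4.0 * np.abs(M).sum())      # penalty parameter (a common default)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                      # Lagrange multipliers
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        R = M - L - S
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S

# Synthetic example: rank-2 background plus 5% large sparse entries.
rng = np.random.default_rng(0)
L0 = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))
S0 = np.zeros((40, 30))
mask = rng.random((40, 30)) < 0.05
S0[mask] = 10.0 * rng.choice([-1.0, 1.0], size=mask.sum())
M = L0 + S0

L_hat, S_hat = rpca(M)
rel_residual = np.linalg.norm(M - L_hat - S_hat) / np.linalg.norm(M)
```

In the setting of the abstract, the sparse part `S_hat` would play the role of RPCA-S, the component retained for classification, while `L_hat` would absorb the slowly varying background activity.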
Directory of Open Access Journals (Sweden)
Yen Ching-Ho
2010-11-01
Background: Metabolic syndrome (MS) is an important current public health problem faced worldwide. To prevent an "epidemic" of this syndrome, it is important to develop an easy single-parameter screening technique (such as waist circumference (WC) determination recommended by the International Diabetes Federation). Previous studies proved that age is a chief factor corresponding to central obesity. We intended to present a new index based on the linear combination of body mass index and age, which could enhance the area under the receiver operating characteristic curves (AUCs) for assessing the risk of MS. Methods: The labour law of the Association of Labor Standard Law, Taiwan, states that employers and employees are respectively obligated to offer and receive routine health examinations periodically. Secondary data analysis and subjects' biomarkers among five high-tech factories were used in this study between 2007 and 2008 in northern Taiwan. The subjects included 4712 males and 4196 females. The first principal component score (FPCS) and equal-weighted average (EWA) were determined by statistical analysis. Results: Most of the metabolic and clinical characteristics were significantly higher in males than in females, except high-density lipoprotein cholesterol level. The older group (>45 years) had significantly lower values for height and high-density lipoprotein cholesterol level than the younger group. The AUCs of FPCS and EWA were significantly larger than those of WC and waist-to-height ratio. The low specificities of EWA and FPCS were compensated for by their substantially high sensitivities. FPCS ≥ 0.914 (15.4%) and EWA ≥ 8.8 (6.3%) were found to be the most prevalent cutoff points in males and females, respectively. Conclusions: The Bureau of Health Promotion, Department of Health, Taiwan, had recommended the use of WC ≥ 90 cm for males and ≥ 80 cm for females as singular criteria for the determination of central obesity instead
Lovell, D P
1996-10-01
The multivariate statistical method Principal Component Analysis (PCA) has been applied to a set of data from the ECETOC reference chemical data bank. PCA is a multivariate method that can be used to explore a complex data set. The results of the analysis show that most of the variability in the values for tissue damage scores for the 55 chemicals can be described by a single principal component which explains nearly 80% of the variability. This component is derived by giving approximately equal weight to each of the 18 individual measures made on the tissues over the 24-, 48- and 72-hr observation period. The principal component scores on the first component (PC I) are very highly correlated with the maximum individual weighted Draize scores or total Draize scores (TDS) derived using the Draize scoring method. A second principal component, describing about 7% of the variability, contrasts damage measured on the iris and cornea with that measured on the conjunctiva. Plots of principal component scores show the overall pattern of responses. In general, low measures of the TDS and a positive (PC I) score are associated with iris and conjunctival damage (damage to the iris was never recorded in the absence of damage to the conjunctiva). High TDS and negative PC I scores are associated with corneal and/or iris and conjunctiva damage. Plots of the principal component scores identify some chemicals that appear to cause unusual patterns of damage and identify some individual animals as having outlying or idiosyncratic responses. However, the analysis suggests that (i) there is only limited evidence for differential responses of different tissues and (ii) that attempts to identify alternative tests which predict specific types of tissue damage based on the results collected in a Draize test are likely to be unsuccessful. It indicates that further refinement of the results of the in vivo Draize test will not arise from more detailed analysis of the tissue scores but by
Early detection of dental fluorosis using Raman spectroscopy and principal component analysis.
González-Solís, José Luis; Martínez-Cano, Evelia; Magaña-López, Yolanda
2015-08-01
Raman spectroscopy has the potential to provide vibrational spectra of minerals by analyzing scattered light caused by monochromatic laser excitation. In this paper, recent applications of Raman spectroscopy in the study of dental hard tissues are reported. Special attention is given to mineral components in enamel and to calcium fluoride formed in/on enamel. The dental hard tissue samples were classified according to the Dean Index (DI), which comprises healthy or control, mild, moderate, and severe categories, indicating the amount of dental fluorosis observed on enamel. A total of 39 dental samples (9 control, 9 mild, 10 moderate, and 11 severe) were analyzed in the study. Dental samples were positioned under an Olympus microscope, and around 10 points were chosen for Raman measurement. All spectra were collected by a Horiba Jobin-Yvon LabRAM HR800 Raman Spectrometer with a laser of 830-nm and 17-mW power irradiation. Raw spectra were processed by carrying out baseline correction, smoothing, and normalization to remove noise, fluorescence, and shot noise and then analyzed using principal component analysis (PCA). In the spectra of dental samples, we observed the main bands as the broad band due to $\mathrm{CO_3^{2-}}$ (240-300 cm$^{-1}$), CaF$_2$ (322 cm$^{-1}$), $\mathrm{PO_4^{3-}}$ vibrations (437 and 450 cm$^{-1}$), $\mathrm{PO_4^{3-}}$ vibrations (582, 598, and 609 cm$^{-1}$), $\mathrm{PO_4^{3-}}$ vibrations (960 cm$^{-1}$), $\mathrm{PO_4^{3-}}$ vibrations (1,045 cm$^{-1}$), and $\mathrm{CO_3^{2-}}$ vibration (1,073 cm$^{-1}$). Nevertheless, the intensity of the band at 960 cm$^{-1}$, associated with the symmetric stretch of phosphate, $\mathrm{PO_4^{3-}}$, decreases as the amount of dental fluorosis increases, suggesting that the intensity of this band could be used to quantitatively measure the level of fluorosis on a dental sample. On the other hand, PCA allowed the identification of two large clusters discriminating between control, and severe and moderate samples
Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi
2012-07-01
The present study deals with daily total ozone concentration time series over four metro cities of India, namely Kolkata, Mumbai, Chennai, and New Delhi, in the multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration is suitable for principal component analysis. Subsequently, by introducing a rotated component matrix for the principal components, the predictors suitable for generating an artificial neural network (ANN) for daily total ozone prediction are identified. The multicollinearity is removed in this way. Models of ANN in the form of multilayer perceptrons trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics like Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.
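The Kaiser-Meyer-Olkin suitability check mentioned above compares squared correlations with squared partial correlations. A minimal sketch of the overall KMO statistic, assuming NumPy and using synthetic data with one shared factor in place of the ozone series (the function name `kmo` and the data are hypothetical):

```python
import numpy as np

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure for the columns of X."""
    R = np.corrcoef(X, rowvar=False)
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    partial = -P / np.outer(d, d)             # partial correlations given all other variables
    off = ~np.eye(R.shape[0], dtype=bool)     # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

# Synthetic example: four variables driven by one common factor plus noise.
rng = np.random.default_rng(0)
factor = rng.normal(size=(200, 1))
X = factor + 0.5 * rng.normal(size=(200, 4))
kmo_value = kmo(X)
```

Values near 1 indicate that correlations are largely shared among variables, which is the situation in which PCA compresses the data well; values near 0.5 or below are usually read as a warning.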
DEFF Research Database (Denmark)
Vitanis, Viton; Manka, Robert; Giese, Daniel;
2011-01-01
-t principal component analysis reconstructions. Comparison of the two methods based on rest and stress three-dimensional perfusion data acquired with 2.3 × 2.3 × 10 mm(3) during a 225 msec acquisition window in patients confirms the findings and demonstrates the potential of compartment-based k-t principal...... permits three-dimensional perfusion imaging at 10-fold nominal acceleration. Using numerical simulations, it is shown that the compartment-based method results in accurate representations of dynamic signal intensity changes with significant improvements of temporal fidelity in comparison to conventional k...
Wojciechowski, Adam
2017-04-01
In order to assess ecodiversity, understood as a comprehensive natural landscape factor (Jedicke 2001), it is necessary to apply research methods which recognize the environment in a holistic way. Principal component analysis may be considered one such method, as it allows the main factors determining landscape diversity to be distinguished on the one hand, and reveals regularities shaping the relationships between various elements of the environment under study on the other. The procedure adopted to assess ecodiversity with the use of principal component analysis involves: a) determining and selecting appropriate factors of the assessed environment qualities (hypsometric, geological, hydrographic, plant, and others); b) calculating the absolute value of individual qualities for the basic areas under analysis (e.g. river length, forest area, altitude differences, etc.); c) principal components analysis and obtaining factor maps (maps of selected components); d) generating a resultant, detailed map and isolating several classes of ecodiversity. An assessment of ecodiversity with the use of principal component analysis was conducted in the test area of 299.67 km² in Debnica Kaszubska commune. The whole commune is situated in the Weichselian glaciation area of high hypsometric and morphological diversity as well as high geo- and biodiversity. The analysis was based on topographical maps of the commune area at a scale of 1:25000 and maps of forest habitats. Consequently, nine factors reflecting basic environment elements were calculated: maximum height (m), minimum height (m), average height (m), the length of watercourses (km), the area of water reservoirs (m²), total forest area (ha), coniferous forest habitats area (ha), deciduous forest habitats area (ha), alder habitats area (ha). The values for individual factors were analysed for 358 grid cells of 1 km². Based on the principal components analysis, four major factors affecting commune ecodiversity
Friend, Jennifer; Watson, Robert
2014-01-01
This article examines comparative survey results for 16 principal preparation programs located in the Midwestern state of Missouri across a four-year time period from 2008 to 2012. The authors are founding members of a statewide Higher Education Evaluation Committee (HEEC), which has been meeting on a monthly basis since 2005, comprised of faculty…
Bills, Andrew; Giles, David; Rogers, Bev
2017-01-01
Purpose: The research seeks to capture the "special character" of schools as seen through the eyes of the Principal and to introduce alternative understandings of 'ideological praxis' to challenge and unsettle the dominant ideology and logics of secondary schooling with consequent school design implications in South Australia.…
Directory of Open Access Journals (Sweden)
Il-Youp Kwak
2016-01-01
We previously proposed a simple regression-based method to map quantitative trait loci underlying function-valued phenotypes. In order to better handle the case of noisy phenotype measurements and accommodate the correlation structure among time points, we propose an alternative approach that maintains much of the simplicity and speed of the regression-based method. We overcome noisy measurements by replacing the observed data with a smooth approximation. We then apply functional principal component analysis, replacing the smoothed phenotype data with a small number of principal components. Quantitative trait locus mapping is applied to these dimension-reduced data, either with a multi-trait method or by considering the traits individually and then taking the average or maximum LOD score across traits. We apply these approaches to root gravitropism data on Arabidopsis recombinant inbred lines and further investigate their performance in computer simulations. Our methods have been implemented in the R package, funqtl.
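The smooth-then-reduce step described above can be sketched in a few lines. This is not the funqtl implementation (which is an R package): the moving-average smoother stands in for whatever smooth approximation is used, and the simulated curves and function names are hypothetical.

```python
import numpy as np

def smooth_rows(Y, window=5):
    """Moving-average smoothing of each row; window is assumed odd."""
    kernel = np.ones(window) / window
    pad = window // 2
    Ypad = np.pad(Y, ((0, 0), (pad, pad)), mode="edge")
    return np.array([np.convolve(row, kernel, mode="valid") for row in Ypad])

def pc_scores(Y, n_components):
    """Scores on the leading principal components and their variance shares."""
    Yc = Y - Y.mean(axis=0)
    U, s, _ = np.linalg.svd(Yc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Hypothetical function-valued phenotype: 30 individuals measured at 50 time points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
a = rng.normal(size=(30, 1))
b = rng.normal(size=(30, 1))
Y = a * t + b * t ** 2 + 0.1 * rng.normal(size=(30, 50))

smoothed = smooth_rows(Y)
scores, explained = pc_scores(smoothed, n_components=2)
```

Each column of `scores` can then be treated as an ordinary quantitative trait in a QTL scan, with results combined across columns as the abstract describes.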
Segil, Jacob L; Weir, Richard F ff
2014-03-01
An ideal myoelectric prosthetic hand should have the ability to continuously morph between any posture like an anatomical hand. This paper describes the design and validation of a morphing myoelectric hand controller based on principal component analysis of human grasping. The controller commands continuously morphing hand postures including functional grasps using between two and four surface electromyography (EMG) electrode pairs. Four unique maps were developed to transform the EMG control signals in the principal component domain. A preliminary validation experiment was performed with 10 nonamputee subjects to determine the map with the highest performance. The subjects used the myoelectric controller to morph a virtual hand between functional grasps in a series of randomized trials. The number of joints controlled accurately was evaluated to characterize the performance of each map. Additional metrics were studied including completion rate, time to completion, and path efficiency. The highest performing map controlled over 13 out of 15 joints accurately.
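The idea of steering a 15-joint hand from a low-dimensional principal component domain can be sketched as follows. The grasp data are simulated, the use of exactly two components is an assumption, and the mapping from EMG to the control point (the four maps in the abstract) is not modelled; `control_to_posture` and `posture_to_control` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
n_joints = 15

# Hypothetical grasp recordings: 200 postures of 15 joint angles (degrees)
# generated from two underlying synergies plus measurement noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, n_joints))
postures = 30.0 + 10.0 * latent @ mixing + rng.normal(scale=0.5, size=(200, n_joints))

mean_posture = postures.mean(axis=0)
_, _, Vt = np.linalg.svd(postures - mean_posture, full_matrices=False)
pcs = Vt[:2]                                  # first two principal components, shape (2, 15)

def control_to_posture(c):
    """Map a 2-D control point in the PC domain to 15 joint angles."""
    return mean_posture + np.asarray(c, dtype=float) @ pcs

def posture_to_control(p):
    """Project a posture back onto the 2-D PC domain."""
    return (np.asarray(p, dtype=float) - mean_posture) @ pcs.T

origin = control_to_posture([0.0, 0.0])       # the mean grasp posture
target = control_to_posture([3.0, -1.5])
```

Moving the control point along a straight line between two grasp locations in this plane yields a continuous morph of all 15 joints at once, which is what makes a two- to four-channel EMG input sufficient.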