Hallin, M.; Hörmann, S.; Piegorsch, W.; El Shaarawi, A.
2012-01-01
Principal Components are probably the best known and most widely used of all multivariate analysis techniques. The essential idea consists in performing a linear transformation of the observed k-dimensional variables in such a way that the new variables are vectors of k mutually orthogonal
Multiscale principal component analysis
International Nuclear Information System (INIS)
Akinduko, A A; Gorban, A N
2014-01-01
Principal component analysis (PCA) is an important tool in exploring data. The conventional approach to PCA leads to a solution which favours the structures with large variances. This is sensitive to outliers and could obfuscate interesting underlying structures. One of the equivalent definitions of PCA is that it seeks the subspaces that maximize the sum of squared pairwise distances between data projections. This definition opens up more flexibility in the analysis of principal components, which is useful in enhancing PCA. In this paper we introduce scales into PCA by maximizing only the sum of pairwise distances between projections for pairs of datapoints with distances within a chosen interval of values [l,u]. The resulting principal component decompositions in Multiscale PCA depend on the point (l,u) in the plane, and for each point we define projectors onto principal components. Cluster analysis of these projectors reveals the structures in the data at various scales. Each structure is described by the eigenvectors at the medoid point of the cluster which represents the structure. We also use the distortion of projections as a criterion for choosing an appropriate scale, especially for data with outliers. This method was tested on both artificial distributions of data and real data. For data with multiscale structures, the method was able to reveal the different structures of the data and also to reduce the effect of outliers in the principal component analysis.
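The pairwise-distance formulation in this abstract can be illustrated with a minimal Python sketch. It assumes the squared form of the criterion, so that the optimal directions are the leading eigenvectors of the summed outer products of the selected pairwise differences. The function name `multiscale_pca` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def multiscale_pca(X, l, u, n_components=2):
    """Sketch: PCA restricted to pairs of points whose distance lies in [l, u].

    Hypothetical helper, not the authors' code. For a unit vector v, the sum
    of squared projected pairwise distances equals v^T S v, where S is the sum
    of outer products of the selected difference vectors, so the maximizers
    are the leading eigenvectors of S.
    """
    X = np.asarray(X, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]        # all pairwise differences, shape (n, n, d)
    dists = np.linalg.norm(diffs, axis=2)        # pairwise Euclidean distances
    i, j = np.triu_indices(len(X), k=1)          # each unordered pair once
    keep = (dists[i, j] >= l) & (dists[i, j] <= u)
    D = diffs[i[keep], j[keep]]                  # selected differences, shape (m, d)
    S = D.T @ D                                  # sum of outer products
    eigvals, eigvecs = np.linalg.eigh(S)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]
```

With l = 0 and u = infinity every pair is kept, and the directions coincide (up to sign) with those of ordinary PCA, which is a convenient sanity check for the sketch.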
Euler principal component analysis
Liwicki, Stephan; Tzimiropoulos, Georgios; Zafeiriou, Stefanos; Pantic, Maja
Principal Component Analysis (PCA) is perhaps the most prominent learning tool for dimensionality reduction in pattern recognition and computer vision. However, the ℓ2-norm employed by standard PCA is not robust to outliers. In this paper, we propose a kernel PCA method for fast and robust PCA,
Putilov, Arcady A; Donskaya, Olga G
2016-01-01
Age-associated changes in different bandwidths of the human electroencephalographic (EEG) spectrum are well documented, but their functional significance is poorly understood. This spectrum seems to represent summation of simultaneous influences of several sleep-wake regulatory processes. Scoring of its orthogonal (uncorrelated) principal components can help in separation of the brain signatures of these processes. In particular, the opposite age-associated changes were documented for scores on the two largest (1st and 2nd) principal components of the sleep EEG spectrum. A decrease of the first score and an increase of the second score can reflect, respectively, the weakening of the sleep drive and disinhibition of the opposing wake drive with age. In order to support the suggestion of age-associated disinhibition of the wake drive from the antagonistic influence of the sleep drive, we analyzed principal component scores of the resting EEG spectra obtained in sleep deprivation experiments with 81 healthy young adults aged between 19 and 26 and 40 healthy older adults aged between 45 and 66 years. On the second day of the sleep deprivation experiments, frontal scores on the 1st principal component of the EEG spectrum demonstrated an age-associated reduction of response to eyes-closed relaxation. Scores on the 2nd principal component were either initially increased during wakefulness or less responsive to such sleep-provoking conditions (frontal and occipital scores, respectively). These results are in line with the suggestion of disinhibition of the wake drive with age. They provide an explanation of why older adults are less vulnerable to sleep deprivation than young adults.
2014-01-01
Background The chemical composition of aerosols and particle size distributions are the most significant factors affecting air quality. In particular, exposure to finer particles can cause short- and long-term effects on human health. In the present paper, PM10 (particulate matter with aerodynamic diameter lower than 10 μm), CO, NOx (NO and NO2), Benzene and Toluene trends monitored in six monitoring stations of Bari province are shown. The data set used was composed of bi-hourly means for all parameters (12 bi-hourly means per day for each parameter) and covers the period from January 2005 to May 2007. The main aim of the paper is to provide a clear illustration of how large data sets from monitoring stations can give information about the number and nature of the pollutant sources, and mainly to assess the contribution of the traffic source to PM10 concentration level by using multivariate statistical techniques such as Principal Component Analysis (PCA) and Absolute Principal Component Scores (APCS). Results Comparing the night and day mean concentrations (per day) for each parameter shows that some parameters, such as CO, Benzene and Toluene, behave differently between night and day than PM10 does. This suggests that CO, Benzene and Toluene concentrations are mainly connected with transport systems, whereas PM10 is mostly influenced by different factors. The statistical techniques identified three recurrent sources, associated with vehicular traffic and particulate transport, covering over 90% of variance. The simultaneous analysis of gases and PM10 made it possible to highlight the differences between the sources of these pollutants. Conclusions The analysis of the pollutant trends from large data sets and the application of multivariate statistical techniques such as PCA and APCS can give useful information about air quality and pollutant sources. This knowledge can provide useful advice for environmental policies in
Qu, Mingkai; Wang, Yan; Huang, Biao; Zhao, Yongcun
2018-06-01
The traditional source apportionment models, such as absolute principal component scores-multiple linear regression (APCS-MLR), are usually susceptible to outliers, which may be widely present in the regional geochemical dataset. Furthermore, the models are merely built on variable space instead of geographical space and thus cannot effectively capture the local spatial characteristics of each source's contributions. To overcome these limitations, a new receptor model, robust absolute principal component scores-robust geographically weighted regression (RAPCS-RGWR), was proposed based on the traditional APCS-MLR model. Then, the new method was applied to the source apportionment of soil metal elements in a region of Wuhan City, China as a case study. Evaluations revealed that: (i) the RAPCS-RGWR model had better performance than the APCS-MLR model in the identification of the major sources of soil metal elements, and (ii) source contributions estimated by the RAPCS-RGWR model were closer to the true soil metal concentrations than those estimated by the APCS-MLR model. It is shown that the proposed RAPCS-RGWR model is a more effective source apportionment method than APCS-MLR (i.e., a non-robust and global model) in dealing with the regional geochemical dataset. Copyright © 2018 Elsevier B.V. All rights reserved.
Interpretable functional principal component analysis.
Lin, Zhenhua; Wang, Liangliang; Cao, Jiguo
2016-09-01
Functional principal component analysis (FPCA) is a popular approach to explore major sources of variation in a sample of random curves. These major sources of variation are represented by functional principal components (FPCs). The intervals where the values of FPCs are significant are interpreted as where sample curves have major variations. However, these intervals are often hard for naïve users to identify, because of the vague definition of "significant values". In this article, we develop a novel penalty-based method to derive FPCs that are only nonzero precisely in the intervals where the values of FPCs are significant, whence the derived FPCs possess better interpretability than the FPCs derived from existing methods. To compute the proposed FPCs, we devise an efficient algorithm based on projection deflation techniques. We show that the proposed interpretable FPCs are strongly consistent and asymptotically normal under mild conditions. Simulation studies confirm that with a competitive performance in explaining variations of sample curves, the proposed FPCs are more interpretable than the traditional counterparts. This advantage is demonstrated by analyzing two real datasets, namely, electroencephalography data and Canadian weather data. © 2015, The International Biometric Society.
On Bayesian Principal Component Analysis
Czech Academy of Sciences Publication Activity Database
Šmídl, Václav; Quinn, A.
2007-01-01
Vol. 51, No. 9 (2007), pp. 4101-4123 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords: Principal component analysis (PCA) * Variational Bayes (VB) * von Mises–Fisher distribution Subject RIV: BC - Control Systems Theory Impact factor: 1.029, year: 2007 http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V8V-4MYD60N-6&_user=10&_coverDate=05%2F15%2F2007&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=b8ea629d48df926fe18f9e5724c9003a
Teaching Principal Components Using Correlations.
Westfall, Peter H; Arias, Andrea L; Fulton, Lawrence V
2017-01-01
Introducing principal components (PCs) to students is difficult. First, the matrix algebra and mathematical maximization lemmas are daunting, especially for students in the social and behavioral sciences. Second, the standard motivation involving variance maximization subject to unit length constraint does not directly connect to the "variance explained" interpretation. Third, the unit length and uncorrelatedness constraints of the standard motivation do not allow re-scaling or oblique rotations, which are common in practice. Instead, we propose to motivate the subject in terms of optimizing (weighted) average proportions of variance explained in the original variables; this approach may be more intuitive, and hence easier to understand because it links directly to the familiar "R-squared" statistic. It also removes the need for unit length and uncorrelatedness constraints, provides a direct interpretation of "variance explained," and provides a direct answer to the question of whether to use covariance-based or correlation-based PCs. Furthermore, the presentation can be made without matrix algebra or optimization proofs. Modern tools from data science, including heat maps and text mining, provide further help in the interpretation and application of PCs; examples are given. Together, these techniques may be used to revise currently used methods for teaching and learning PCs in the behavioral sciences.
A principal components model of soundscape perception.
Axelsson, Östen; Nilsson, Mats E; Berglund, Birgitta
2010-11-01
There is a need for a model that identifies underlying dimensions of soundscape perception, and which may guide measurement and improvement of soundscape quality. With the purpose to develop such a model, a listening experiment was conducted. One hundred listeners measured 50 excerpts of binaural recordings of urban outdoor soundscapes on 116 attribute scales. The average attribute scale values were subjected to principal components analysis, resulting in three components: Pleasantness, eventfulness, and familiarity, explaining 50, 18 and 6% of the total variance, respectively. The principal-component scores were correlated with physical soundscape properties, including categories of dominant sounds and acoustic variables. Soundscape excerpts dominated by technological sounds were found to be unpleasant, whereas soundscape excerpts dominated by natural sounds were pleasant, and soundscape excerpts dominated by human sounds were eventful. These relationships remained after controlling for the overall soundscape loudness (Zwicker's N(10)), which shows that 'informational' properties are substantial contributors to the perception of soundscape. The proposed principal components model provides a framework for future soundscape research and practice. In particular, it suggests which basic dimensions are necessary to measure, how to measure them by a defined set of attribute scales, and how to promote high-quality soundscapes.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnosis, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out the analysis with SPSS makes it simpler and faster while keeping the results accurate.
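The principle of principal component regression mentioned in this abstract can be sketched outside SPSS. The Python example below is an illustration under stated assumptions, not the paper's SPSS procedure: predictors are standardized, the response is regressed on the first few component scores, and the coefficients are mapped back to the original variables. The function name `pcr_fit` is hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Sketch of principal component regression (hypothetical helper).

    Returns an intercept and coefficients on the original predictor scale.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd                              # standardized predictors
    # Right singular vectors of Z are the principal axes of the correlation matrix.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T                        # loadings of the retained components
    T = Z @ V                                      # component scores (mutually uncorrelated)
    gamma, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
    beta = (V @ gamma) / sd                        # back to the original variable scale
    intercept = y.mean() - mu @ beta
    return intercept, beta
```

Keeping all components reproduces ordinary least squares; dropping the components with the smallest variance is what stabilizes the estimates when predictors are nearly collinear.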
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Integrating Data Transformation in Principal Components Analysis
Maadooliat, Mehdi; Huang, Jianhua Z.; Hu, Jianhua
2015-01-01
Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to students and researchers starting their work on climate and agriculture, mainly focusing on estimation models. The flow of chapters guides readers smoothly through climate, weather and the impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
COPD phenotype description using principal components analysis
DEFF Research Database (Denmark)
Roy, Kay; Smith, Jacky; Kolsum, Umme
2009-01-01
BACKGROUND: Airway inflammation in COPD can be measured using biomarkers such as induced sputum and Fe(NO). This study set out to explore the heterogeneity of COPD using biomarkers of airway and systemic inflammation and pulmonary function by principal components analysis (PCA). SUBJECTS...... AND METHODS: In 127 COPD patients (mean FEV1 61%), pulmonary function, Fe(NO), plasma CRP and TNF-alpha, sputum differential cell counts and sputum IL8 (pg/ml) were measured. Principal components analysis as well as multivariate analysis was performed. RESULTS: PCA identified four main components (% variance...... associations between the variables within components 1 and 2. CONCLUSION: COPD is a multi dimensional disease. Unrelated components of disease were identified, including neutrophilic airway inflammation which was associated with systemic inflammation, and sputum eosinophils which were related to increased Fe...
Constrained principal component analysis and related techniques
Takane, Yoshio
2013-01-01
In multivariate data analysis, regression techniques predict one set of variables from another while principal component analysis (PCA) finds a subspace of minimal dimensionality that captures the largest variability in the data. How can regression analysis and PCA be combined in a beneficial way? Why and when is it a good idea to combine them? What kind of benefits are we getting from them? Addressing these questions, Constrained Principal Component Analysis and Related Techniques shows how constrained PCA (CPCA) offers a unified framework for these approaches. The book begins with four concre
Probabilistic Principal Component Analysis for Metabolomic Data.
LENUS (Irish Health Repository)
Nyamundanda, Gift
2010-11-23
Abstract Background Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model. Results Here, probabilistic principal component analysis (PPCA) which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data. Conclusions The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight to the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
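To make the contrast between PCA and its model-based counterpart concrete, the sketch below computes the standard closed-form maximum likelihood estimates of the PPCA model (Tipping and Bishop), in which the noise variance is the mean of the discarded covariance eigenvalues. It is an illustration only: it is not the MetabolAnalyze implementation, it does not cover the PPCCA or mixture extensions, and the function name `ppca_ml` is hypothetical.

```python
import numpy as np

def ppca_ml(X, q):
    """Sketch: closed-form ML fit of PPCA with q latent dimensions (q < d).

    Hypothetical helper. Model: x = W z + mu + noise, noise ~ N(0, sigma2 * I).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    mu = X.mean(axis=0)
    S = (X - mu).T @ (X - mu) / n                 # ML (biased) sample covariance
    lam, U = np.linalg.eigh(S)                    # ascending order
    lam, U = lam[::-1], U[:, ::-1]                # reorder to descending
    sigma2 = lam[q:].mean()                       # average discarded variance
    # Loadings: leading eigenvectors scaled by sqrt(eigenvalue - sigma2).
    W = U[:, :q] * np.sqrt(np.maximum(lam[:q] - sigma2, 0.0))
    return mu, W, sigma2
```

The implied model covariance W Wᵀ + σ²I matches the sample covariance on the leading q eigen-directions and replaces the remaining eigenvalues by their average, which is what gives PPCA a proper likelihood for criteria such as BIC.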
Experimental and principal component analysis of waste ...
African Journals Online (AJOL)
The present study is aimed at determining through principal component analysis the most important variables affecting bacterial degradation in ponds. Data were collected from literature. In addition, samples were also collected from the waste stabilization ponds at the University of Nigeria, Nsukka and analyzed to ...
Principal Component Analysis as an Efficient Performance ...
African Journals Online (AJOL)
This paper uses the principal component analysis (PCA) to examine the possibility of using few explanatory variables (X's) to explain the variation in Y. It applied PCA to assess the performance of students in Abia State Polytechnic, Aba, Nigeria. This was done by estimating the coefficients of eight explanatory variables in a ...
Principal component analysis of psoriasis lesions images
DEFF Research Database (Denmark)
Maletti, Gabriela Mariel; Ersbøll, Bjarne Kjær
2003-01-01
A set of RGB images of psoriasis lesions is used. By visual examination of these images, there seems to be no common pattern that could be used to find and align the lesions within and between sessions. It is expected that the principal components of the original images could be useful during future...
Principal components analysis in clinical studies.
Zhang, Zhongheng; Castelló, Adela
2017-09-01
In multivariate analysis, independent variables are usually correlated with each other, which can introduce multicollinearity in the regression models. One approach to solve this problem is to apply principal components analysis (PCA) over these variables. This method uses orthogonal transformation to represent sets of potentially correlated variables with principal components (PC) that are linearly uncorrelated. PCs are ordered so that the first PC has the largest possible variance and only some components are selected to represent the correlated variables. As a result, the dimension of the variable space is reduced. This tutorial illustrates how to perform PCA in the R environment; the example is a simulated dataset in which two PCs are responsible for the majority of the variance in the data. Furthermore, the visualization of PCA is highlighted.
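The kind of simulated example described in this abstract can be sketched as follows. The tutorial itself uses R; this is a Python analogue with made-up simulation settings (500 samples, six variables, two latent factors), none of which come from the tutorial.

```python
import numpy as np

# Simulate six correlated variables driven by two hidden factors plus noise.
# All sizes and scales here are illustrative assumptions.
rng = np.random.default_rng(42)
n = 500
latent = rng.normal(size=(n, 2)) * np.array([3.0, 2.0])
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.3 * rng.normal(size=(n, 6))

# PCA via eigendecomposition of the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)            # ascending order
order = np.argsort(eigvals)[::-1]                 # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()               # proportion of variance per PC
scores = Xc @ eigvecs                             # uncorrelated PC scores

print("proportion of variance:", np.round(explained, 3))
```

Because the data are generated from two factors, the first two entries of `explained` should account for most of the variance, and the covariance matrix of `scores` is diagonal, which is the "linearly uncorrelated" property described above.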
PCA: Principal Component Analysis for spectra modeling
Hurley, Peter D.; Oliver, Seb; Farrah, Duncan; Wang, Lingyu; Efstathiou, Andreas
2012-07-01
The mid-infrared spectra of ultraluminous infrared galaxies (ULIRGs) contain a variety of spectral features that can be used as diagnostics to characterize the spectra. However, such diagnostics are biased by our prior prejudices on the origin of the features. Moreover, by using only part of the spectrum they do not utilize the full information content of the spectra. Blind statistical techniques such as principal component analysis (PCA) consider the whole spectrum, find correlated features and separate them out into distinct components. This code, written in IDL, classifies principal components of IRS spectra to define a new classification scheme using 5D Gaussian mixtures modelling. The five PCs and average spectra for the four classifications to classify objects are made available with the code.
A Genealogical Interpretation of Principal Components Analysis
McVean, Gil
2009-01-01
Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557
Radar fall detection using principal component analysis
Jokanovic, Branka; Amin, Moeness; Ahmad, Fauzia; Boashash, Boualem
2016-05-01
Falls are a major cause of fatal and nonfatal injuries in people aged 65 years and older. Radar has the potential to become one of the leading technologies for fall detection, thereby enabling the elderly to live independently. Existing techniques for fall detection using radar are based on manual feature extraction and require significant parameter tuning in order to provide successful detections. In this paper, we employ principal component analysis for fall detection, wherein eigen images of observed motions are employed for classification. Using real data, we demonstrate that the PCA based technique provides performance improvement over the conventional feature extraction methods.
Integrating Data Transformation in Principal Components Analysis
Maadooliat, Mehdi
2015-01-02
Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples.
Nonlinear principal component analysis and its applications
Mori, Yuichi; Makino, Naomichi
2016-01-01
This book expounds the principle and related applications of nonlinear principal component analysis (PCA), which is a useful method for analyzing data with mixed measurement levels. In the part dealing with the principle, after a brief introduction of ordinary PCA, a PCA for categorical data (nominal and ordinal) is introduced as nonlinear PCA, in which an optimal scaling technique is used to quantify the categorical variables. Alternating least squares (ALS) is the main algorithm in the method. Multiple correspondence analysis (MCA), a special case of nonlinear PCA, is also introduced. All formulations in these methods are integrated in the same manner as matrix operations. Because data of any measurement level can be treated consistently as numerical data and ALS is a very powerful tool for estimation, the methods can be utilized in a variety of fields such as biometrics, econometrics, psychometrics, and sociology. In the applications part of the book, four applications are introduced: variable selection for mixed...
Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo
2017-01-01
This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.
Principal component analysis of FDG PET in amnestic MCI
International Nuclear Information System (INIS)
Nobili, Flavio; Girtler, Nicola; Brugnolo, Andrea; Dessi, Barbara; Rodriguez, Guido; Salmaso, Dario; Morbelli, Silvia; Piccardo, Arnoldo; Larsson, Stig A.; Pagani, Marco
2008-01-01
The purpose of the study is to evaluate the combined accuracy of episodic memory performance and 18F-FDG PET in identifying patients with amnestic mild cognitive impairment (aMCI) converting to Alzheimer's disease (AD), aMCI non-converters, and controls. Thirty-three patients with aMCI and 15 controls (CTR) were followed up for a mean of 21 months. Eleven patients developed AD (MCI/AD) and 22 remained with aMCI (MCI/MCI). 18F-FDG PET volumetric regions of interest underwent principal component analysis (PCA) that identified 12 principal components (PC), expressed by coarse component scores (CCS). Discriminant analysis was performed using the significant PCs and episodic memory scores. PCA highlighted relative hypometabolism in PC5, including bilateral posterior cingulate and left temporal pole, and in PC7, including the bilateral orbitofrontal cortex, both in MCI/MCI and MCI/AD vs CTR. PC5 itself plus PC12, including the left lateral frontal cortex (LFC: BAs 44, 45, 46, 47), were significantly different between MCI/AD and MCI/MCI. By a three-group discriminant analysis, CTR were more accurately identified by PET-CCS + delayed recall score (100%), MCI/MCI by PET-CCS + either immediate or delayed recall scores (91%), while MCI/AD was identified by PET-CCS alone (82%). PET increased by 25% the correct allocations achieved by memory scores, while memory scores increased by 15% the correct allocations achieved by PET. Combining memory performance and 18F-FDG PET yielded a higher accuracy than each single tool in identifying CTR and MCI/MCI. The PC containing bilateral posterior cingulate and left temporal pole was the hallmark of MCI/MCI patients, while the PC including the left LFC was the hallmark of conversion to AD. (orig.)
Principal component analysis of FDG PET in amnestic MCI
Energy Technology Data Exchange (ETDEWEB)
Nobili, Flavio; Girtler, Nicola; Brugnolo, Andrea; Dessi, Barbara; Rodriguez, Guido [University of Genoa, Clinical Neurophysiology, Department of Endocrinological and Medical Sciences, Genoa (Italy); S. Martino Hospital, Alzheimer Evaluation Unit, Genoa (Italy); S. Martino Hospital, Head-Neck Department, Genoa (Italy); Salmaso, Dario [CNR, Institute of Cognitive Sciences and Technologies, Rome (Italy); CNR, Institute of Cognitive Sciences and Technologies, Padua (Italy); Morbelli, Silvia [University of Genoa, Nuclear Medicine Unit, Department of Internal Medicine, Genoa (Italy); Piccardo, Arnoldo [Galliera Hospital, Nuclear Medicine Unit, Department of Imaging Diagnostics, Genoa (Italy); Larsson, Stig A. [Karolinska Hospital, Department of Nuclear Medicine, Stockholm (Sweden); Pagani, Marco [CNR, Institute of Cognitive Sciences and Technologies, Rome (Italy); CNR, Institute of Cognitive Sciences and Technologies, Padua (Italy); Karolinska Hospital, Department of Nuclear Medicine, Stockholm (Sweden)
2008-12-15
The purpose of the study is to evaluate the combined accuracy of episodic memory performance and 18F-FDG PET in identifying patients with amnestic mild cognitive impairment (aMCI) converting to Alzheimer's disease (AD), aMCI non-converters, and controls. Thirty-three patients with aMCI and 15 controls (CTR) were followed up for a mean of 21 months. Eleven patients developed AD (MCI/AD) and 22 remained with aMCI (MCI/MCI). 18F-FDG PET volumetric regions of interest underwent principal component analysis (PCA) that identified 12 principal components (PC), expressed by coarse component scores (CCS). Discriminant analysis was performed using the significant PCs and episodic memory scores. PCA highlighted relative hypometabolism in PC5, including bilateral posterior cingulate and left temporal pole, and in PC7, including the bilateral orbitofrontal cortex, both in MCI/MCI and MCI/AD vs CTR. PC5 itself plus PC12, including the left lateral frontal cortex (LFC: BAs 44, 45, 46, 47), were significantly different between MCI/AD and MCI/MCI. By a three-group discriminant analysis, CTR were more accurately identified by PET-CCS + delayed recall score (100%), MCI/MCI by PET-CCS + either immediate or delayed recall scores (91%), while MCI/AD was identified by PET-CCS alone (82%). PET increased by 25% the correct allocations achieved by memory scores, while memory scores increased by 15% the correct allocations achieved by PET. Combining memory performance and 18F-FDG PET yielded a higher accuracy than each single tool in identifying CTR and MCI/MCI. The PC containing bilateral posterior cingulate and left temporal pole was the hallmark of MCI/MCI patients, while the PC including the left LFC was the hallmark of conversion to AD. (orig.)
Multistage principal component analysis based method for abdominal ECG decomposition
International Nuclear Information System (INIS)
Petrolis, Robertas; Krisciukaitis, Algimantas; Gintautas, Vladas
2015-01-01
Reflection of fetal heart electrical activity is present in registered abdominal ECG signals. However, this signal component has noticeably less energy than concurrent signals, especially maternal ECG. Therefore the traditionally recommended independent component analysis fails to separate these two ECG signals. Multistage principal component analysis (PCA) is proposed for step-by-step extraction of abdominal ECG signal components. Truncated representation and subsequent subtraction of cardio cycles of maternal ECG are the first steps. The energy of the fetal ECG component then becomes comparable to or even exceeds the energy of other components in the remaining signal. Second stage PCA concentrates the energy of the sought signal in one principal component, assuring its maximal amplitude regardless of the orientation of the fetus in multilead recordings. Third stage PCA is performed on signal excerpts representing detected fetal heart beats in order to perform their truncated representation, reconstructing their shape for further analysis. The algorithm was tested with PhysioNet Challenge 2013 signals and signals recorded in the Department of Obstetrics and Gynecology, Lithuanian University of Health Sciences. Results of our method in PhysioNet Challenge 2013 on the open data set were: average score 341.503 bpm² and 32.81 ms. (paper)
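The "truncated representation and subsequent subtraction" step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic beat template, the noise level, the number of retained components and the function name truncated_representation are all assumptions made for illustration. Each row of the matrix stands for one time-aligned cycle of the dominant signal; the rows are approximated with the leading principal component and the approximation is subtracted, leaving the weaker remainder.

```python
import numpy as np

def truncated_representation(cycles, n_components):
    """Approximate each row (one aligned cycle) using the leading principal components."""
    mean = cycles.mean(axis=0)
    centred = cycles - mean
    # Rows of vt are the principal directions of the centred cycle matrix.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]
    return mean + (centred @ basis.T) @ basis

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
template = np.exp(-((t - 0.5) / 0.03) ** 2)        # hypothetical dominant beat shape
amplitudes = 1.0 + 0.1 * rng.standard_normal(40)   # beat-to-beat amplitude variation
weak = 0.05 * rng.standard_normal((40, 200))       # stand-in for the weaker remainder
cycles = amplitudes[:, None] * template + weak

dominant = truncated_representation(cycles, n_components=1)
residual = cycles - dominant                       # what is left after subtraction
```

In this toy setting most of the energy of cycles sits in the dominant beat, so residual carries only a small fraction of it; in a real recording the later stages would then operate on that residual.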
Use of Sparse Principal Component Analysis (SPCA) for Fault Detection
DEFF Research Database (Denmark)
Gajjar, Shriram; Kulahci, Murat; Palazoglu, Ahmet
2016-01-01
Principal component analysis (PCA) has been widely used for data dimension reduction and process fault detection. However, interpreting the principal components and the outcomes of PCA-based monitoring techniques is a challenging task since each principal component is a linear combination of the ...
Mapping ash properties using principal components analysis
Pereira, Paulo; Brevik, Eric; Cerda, Artemi; Ubeda, Xavier; Novara, Agata; Francos, Marcos; Rodrigo-Comino, Jesus; Bogunovic, Igor; Khaledian, Yones
2017-04-01
In post-fire environments ash has important benefits for soils, such as protection and a source of nutrients, which are crucial for vegetation recuperation (Jordan et al., 2016; Pereira et al., 2015a; 2016a,b). The thickness and distribution of ash are fundamental aspects for soil protection (Cerdà and Doerr, 2008; Pereira et al., 2015b), and the severity at which it was produced is important for the type and amount of elements released into the soil solution (Bodi et al., 2014). Ash is a very mobile material, and it is important where it will be deposited. Until the first rainfalls it is very mobile. Afterwards, it binds to the soil surface and is harder to erode. Mapping ash properties in the immediate period after fire is complex, since ash is constantly moving (Pereira et al., 2015b). However, it is an important task, since from the amount and type of ash produced we can identify the degree of soil protection and the nutrients that will be dissolved. The objective of this work is to map ash properties (CaCO3, pH, and select extractable elements) using a principal component analysis (PCA) in the immediate period after the fire. Four days after the fire we established a grid in a 9x27 m area and took ash samples every 3 meters for a total of 40 sampling points (Pereira et al., 2017). The PCA identified 5 different factors. Factor 1 had high loadings in electrical conductivity, calcium, and magnesium and negative loadings in aluminum and iron, while Factor 2 had high positive loadings in total phosphorus and silica. Factor 3 showed high positive loadings in sodium and potassium, factor 4 high negative loadings in CaCO3 and pH, and factor 5 high loadings in sodium and potassium. The experimental variograms of the extracted factors showed that the Gaussian model was the most precise to model factor 1, the linear model to model factor 2 and the wave hole effect model to model factors 3, 4 and 5. The maps produced confirm the patterns observed in the experimental variograms. Factor 1 and 2
Incremental Tensor Principal Component Analysis for Handwritten Digit Recognition
Directory of Open Access Journals (Sweden)
Chang Liu
2014-01-01
To overcome the shortcomings of traditional dimensionality reduction algorithms, an incremental tensor principal component analysis (ITPCA) algorithm based on the updated-SVD technique is proposed in this paper. This paper proves the relationship between PCA, 2DPCA, MPCA, and the graph embedding framework theoretically and derives the incremental learning procedure to add single sample and multiple samples in detail. The experiments on handwritten digit recognition have demonstrated that ITPCA has achieved better recognition performance than that of vector-based principal component analysis (PCA), incremental principal component analysis (IPCA), and multilinear principal component analysis (MPCA) algorithms. At the same time, ITPCA also has lower time and space complexity.
An Introductory Application of Principal Components to Cricket Data
Manage, Ananda B. W.; Scariano, Stephen M.
2013-01-01
Principal Component Analysis is widely used in applied multivariate data analysis, and this article shows how to motivate student interest in this topic using cricket sports data. Here, principal component analysis is successfully used to rank the cricket batsmen and bowlers who played in the 2012 Indian Premier League (IPL) competition. In…
Foch, Eric; Milner, Clare E
2014-01-03
Iliotibial band syndrome (ITBS) is a common knee overuse injury among female runners. Atypical discrete trunk and lower extremity biomechanics during running may be associated with the etiology of ITBS. Examining discrete data points limits the interpretation of a waveform to a single value. Characterizing entire kinematic and kinetic waveforms may provide additional insight into biomechanical factors associated with ITBS. Therefore, the purpose of this cross-sectional investigation was to determine whether female runners with previous ITBS exhibited differences in kinematics and kinetics compared to controls using a principal components analysis (PCA) approach. Forty participants comprised two groups: previous ITBS and controls. Principal component scores were retained for the first three principal components and were analyzed using independent t-tests. The retained principal components accounted for 93-99% of the total variance within each waveform. Runners with previous ITBS exhibited low principal component one scores for frontal plane hip angle. Principal component one accounted for the overall magnitude in hip adduction which indicated that runners with previous ITBS assumed less hip adduction throughout stance. No differences in the remaining retained principal component scores for the waveforms were detected among groups. A smaller hip adduction angle throughout the stance phase of running may be a compensatory strategy to limit iliotibial band strain. This running strategy may have persisted after ITBS symptoms subsided. © 2013 Published by Elsevier Ltd.
Principal Component Analysis In Radar Polarimetry
Directory of Open Access Journals (Sweden)
A. Danklmayer
2005-01-01
Second order moments of multivariate (often Gaussian) joint probability density functions can be described by the covariance or normalised correlation matrices or by the Kennaugh matrix (Kronecker matrix). In Radar Polarimetry the application of the covariance matrix is known as target decomposition theory, which is a special application of the extremely versatile Principal Component Analysis (PCA). The basic idea of PCA is to convert a data set consisting of correlated random variables into a new set of uncorrelated variables and to order the new variables according to the value of their variances. It is important to stress that uncorrelatedness does not necessarily mean independence, which is used in the much stronger concept of Independent Component Analysis (ICA). Both concepts agree for multivariate Gaussian distribution functions, representing the most random and least structured distribution. In this contribution, we propose a new approach in applying the concept of PCA to Radar Polarimetry. To this end, new uncorrelated random variables will be introduced by means of linear transformations with well determined loading coefficients. This, in turn, will allow the decomposition of the original random backscattering target variables into three point targets with new random uncorrelated variables whose variances agree with the eigenvalues of the covariance matrix. This allows a new interpretation of existing decomposition theorems.
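The basic idea stated in this abstract (correlated variables are linearly transformed into uncorrelated ones whose variances equal the eigenvalues of the covariance matrix) can be checked numerically. The sketch below uses real-valued synthetic data and a made-up mixing matrix; it is a generic illustration, not the polarimetric decomposition proposed in the paper, and the function name pca_decorrelate is an assumption.

```python
import numpy as np

def pca_decorrelate(X):
    """Rotate the columns of X onto the eigenvectors of their covariance matrix."""
    Xc = X - X.mean(axis=0)                 # centre each variable
    cov = np.cov(Xc, rowvar=False)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # the new, uncorrelated variables
    return scores, eigvals, eigvecs

rng = np.random.default_rng(0)
# Synthetic correlated data: three variables mixed from independent sources.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
X = rng.standard_normal((500, 3)) @ mixing.T
scores, eigvals, eigvecs = pca_decorrelate(X)
score_cov = np.cov(scores, rowvar=False)    # diagonal, with eigvals on the diagonal
```

The columns of eigvecs play the role of the loading coefficients, and score_cov is diagonal up to rounding error, which is the "uncorrelated, ordered by variance" property in matrix form.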
Principal Components as a Data Reduction and Noise Reduction Technique
Imhoff, M. L.; Campbell, W. J.
1982-01-01
The potential of principal components as a pipeline data reduction technique for thematic mapper data was assessed and principal components analysis and its transformation as a noise reduction technique was examined. Two primary factors were considered: (1) how might data reduction and noise reduction using the principal components transformation affect the extraction of accurate spectral classifications; and (2) what are the real savings in terms of computer processing and storage costs of using reduced data over the full 7-band TM complement. An area in central Pennsylvania was chosen for a study area. The image data for the project were collected using the Earth Resources Laboratory's thematic mapper simulator (TMS) instrument.
Longitudinal functional principal component modelling via Stochastic Approximation Monte Carlo
Martinez, Josue G.; Liang, Faming; Zhou, Lan; Carroll, Raymond J.
2010-01-01
model averaging using a Bayesian formulation. A relatively straightforward reversible jump Markov Chain Monte Carlo formulation has poor mixing properties and in simulated data often becomes trapped at the wrong number of principal components. In order
Nonparametric inference in nonlinear principal components analysis : exploration and beyond
Linting, Mariëlle
2007-01-01
In the social and behavioral sciences, data sets often do not meet the assumptions of traditional analysis methods. Therefore, nonlinear alternatives to traditional methods have been developed. This thesis starts with a didactic discussion of nonlinear principal components analysis (NLPCA),
Assessment of drinking water quality using principal component ...
African Journals Online (AJOL)
Assessment of drinking water quality using principal component analysis and partial least square discriminant analysis: a case study at water treatment plants, ... water and to detect the source of pollution for the most revealing parameters.
Water quality of the Chhoti Gandak River using principal component ...
Indian Academy of Sciences (India)
; therefore water samples were collected to analyse its quality along the entire length of Chhoti Gandak River. The principal components of water quality are controlled by lithology, gentle slope gradient, poor drainage, long residence of water, ...
EXAFS and principal component analysis : a new shell game
International Nuclear Information System (INIS)
Wasserman, S.
1998-01-01
The use of principal component (factor) analysis in the analysis of EXAFS spectra is described. The components derived from EXAFS spectra share mathematical properties with the original spectra. As a result, the abstract components can be analyzed using standard EXAFS methodology to yield the bond distances and other coordination parameters. The number of components that must be analyzed is usually less than the number of original spectra. The method is demonstrated using a series of spectra from aqueous solutions of uranyl ions
Principal Component Analysis of Body Measurements In Three ...
African Journals Online (AJOL)
This study was conducted to explore the relationship among body measurements in 3 strains of broiler chickens (Arbor Acre, Marshal and Ross) using principal component analysis with the view of identifying those components that define body conformation in broilers. A total of 180 birds were used, 60 per strain.
Topographical characteristics and principal component structure of the hypnagogic EEG.
Tanaka, H; Hayashi, M; Hori, T
1997-07-01
The purpose of the present study was to identify the dominant topographic components of electroencephalographs (EEG) and their behavior during the waking-sleeping transition period. Somnography of nocturnal sleep was recorded on 10 male subjects. Each recording, from "lights-off" to 5 minutes after the appearance of the first sleep spindle, was analyzed. The typical EEG patterns during hypnagogic period were classified into nine EEG stages. Topographic maps demonstrated that the dominant areas of alpha-band activity moved from the posterior areas to anterior areas along the midline of the scalp. In delta-, theta-, and sigma-band activities, the differences of EEG amplitude between the focus areas (the dominant areas) and the surrounding areas increased as a function of EEG stage. To identify the dominant topographic components, a principal component analysis was carried out on a 12-channel EEG data set for each of six frequency bands. The dominant areas of alpha 2- (9.6-11.4 Hz) and alpha 3- (11.6-13.4 Hz) band activities moved from the posterior to anterior areas, respectively. The distribution of alpha 2-band activity on the scalp clearly changed just after EEG stage 3 (alpha intermittent, < 50%). On the other hand, alpha 3-band activity became dominant in anterior areas after the appearance of vertex sharp-wave bursts (EEG stage 7). For the sigma band, the amplitude of extensive areas from the frontal pole to the parietal showed a rapid rise after the onset of stage 7 (the appearance of vertex sharp-wave bursts). Based on the results, sleep onset process probably started before the onset of sleep stage 1 in standard criteria. On the other hand, the basic sleep process may start before the onset of sleep stage 2 or the manually scored spindles.
Longitudinal functional principal component modelling via Stochastic Approximation Monte Carlo
Martinez, Josue G.
2010-06-01
The authors consider the analysis of hierarchical longitudinal functional data based upon a functional principal components approach. In contrast to standard frequentist approaches to selecting the number of principal components, the authors do model averaging using a Bayesian formulation. A relatively straightforward reversible jump Markov Chain Monte Carlo formulation has poor mixing properties and in simulated data often becomes trapped at the wrong number of principal components. In order to overcome this, the authors show how to apply Stochastic Approximation Monte Carlo (SAMC) to this problem, a method that has the potential to explore the entire space and does not become trapped in local extrema. The combination of reversible jump methods and SAMC in hierarchical longitudinal functional data is simplified by a polar coordinate representation of the principal components. The approach is easy to implement and does well in simulated data in determining the distribution of the number of principal components, and in terms of its frequentist estimation properties. Empirical applications are also presented.
Sparse logistic principal components analysis for binary data
Lee, Seokho
2010-09-01
We develop a new principal components analysis (PCA) type dimension reduction method for binary data. Different from the standard PCA which is defined on the observed data, the proposed PCA is defined on the logit transform of the success probabilities of the binary observations. Sparsity is introduced to the principal component (PC) loading vectors for enhanced interpretability and more stable extraction of the principal components. Our sparse PCA is formulated as solving an optimization problem with a criterion function motivated from a penalized Bernoulli likelihood. A Majorization-Minimization algorithm is developed to efficiently solve the optimization problem. The effectiveness of the proposed sparse logistic PCA method is illustrated by application to a single nucleotide polymorphism data set and a simulation study. © Institute of Mathematical Statistics, 2010.
Sparse Principal Component Analysis in Medical Shape Modeling
DEFF Research Database (Denmark)
Sjöstrand, Karl; Stegmann, Mikkel Bille; Larsen, Rasmus
2006-01-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims...... analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of sufficiently small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA...
Jones, James M.
2013-01-01
Principal leadership studies have indicated that leadership can play an important role in augmenting students' achievement scores. One significant influence that can affect achievement scores is the leadership style of the principal. This study focuses on fourth-grade achievement scores within urban elementary schools and explores the relationship…
Fault Localization for Synchrophasor Data using Kernel Principal Component Analysis
Directory of Open Access Journals (Sweden)
CHEN, R.
2017-11-01
In this paper, based on Kernel Principal Component Analysis (KPCA) of Phasor Measurement Units (PMU) data, a nonlinear method is proposed for fault location in complex power systems. Resorting to the scaling factor, the derivative for a polynomial kernel is obtained. Then, the contribution of each variable to the T² statistic is derived to determine whether a bus is the fault component. Compared to the previous Principal Component Analysis (PCA) based methods, the novel version can cope with strong nonlinearity and provide precise identification of the fault location. Computer simulations are conducted to demonstrate the improved performance in recognizing the fault component and evaluating its propagation across the system based on the proposed method.
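For readers unfamiliar with kernel PCA, the following is a minimal sketch of the generic technique with a polynomial kernel, written in NumPy. It does not reproduce the paper's scaling factor, kernel derivative or T² contribution analysis; the kernel parameters (degree, coef0), the random stand-in data and the function name polynomial_kernel_pca are assumptions for illustration only.

```python
import numpy as np

def polynomial_kernel_pca(X, n_components, degree=2, coef0=1.0):
    """Kernel PCA scores of the training samples for k(x, y) = (x.y + coef0)**degree."""
    n = X.shape[0]
    K = (X @ X.T + coef0) ** degree
    # Centre the kernel matrix, which centres the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Score of sample i on component j is sqrt(lambda_j) * v_ij.
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals

rng = np.random.default_rng(7)
X = rng.standard_normal((60, 2))            # stand-in for measurement vectors
scores, eigvals = polynomial_kernel_pca(X, n_components=2)
```

The score columns are mutually orthogonal and their squared norms equal the retained eigenvalues, which is the kernel-space counterpart of the usual PCA property and the natural starting point for a monitoring statistic.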
The analysis of multivariate group differences using common principal components
Bechger, T.M.; Blanca, M.J.; Maris, G.
2014-01-01
Although it is simple to determine whether multivariate group differences are statistically significant or not, such differences are often difficult to interpret. This article is about common principal components analysis as a tool for the exploratory investigation of multivariate group differences
Principal Component Analysis: Most Favourite Tool in Chemometrics
Indian Academy of Sciences (India)
Abstract. Principal component analysis (PCA) is the most commonly used chemometric technique. It is an unsupervised pattern recognition technique. PCA has found applications in chemistry, biology, medicine and economics. The present work attempts to understand how PCA works and how we can interpret its results.
Scalable Robust Principal Component Analysis Using Grassmann Averages
DEFF Research Database (Denmark)
Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi
2016-01-01
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortu...
Principal Component Surface (2011) for Fish Bay, St. John
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas inside Fish Bay, St. John in the U.S. Virgin Islands (USVI). It was...
Principal Component Surface (2011) for Coral Bay, St. John
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas inside Coral Bay, St. John in the U.S. Virgin Islands (USVI). It was...
Incremental principal component pursuit for video background modeling
Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt
2017-03-14
An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
Group-wise Principal Component Analysis for Exploratory Data Analysis
Camacho, J.; Rodriquez-Gomez, Rafael A.; Saccenti, E.
2017-01-01
In this paper, we propose a new framework for matrix factorization based on Principal Component Analysis (PCA) where sparsity is imposed. The structure to impose sparsity is defined in terms of groups of correlated variables found in correlation matrices or maps. The framework is based on three new
Principal component analysis of image gradient orientations for face recognition
Tzimiropoulos, Georgios; Zafeiriou, Stefanos; Pantic, Maja
We introduce the notion of Principal Component Analysis (PCA) of image gradient orientations. As image data is typically noisy, but noise is substantially different from Gaussian, traditional PCA of pixel intensities very often fails to estimate reliably the low-dimensional subspace of a given data
Efficacy of the Principal Components Analysis Techniques Using ...
African Journals Online (AJOL)
Second, the paper reports results of principal components analysis after the artificial data were submitted to three commonly used procedures; scree plot, Kaiser rule, and modified Horn's parallel analysis, and demonstrate the pedagogical utility of using artificial data in teaching advanced quantitative concepts. The results ...
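Two of the retention rules named above can be tried on artificial data with a known structure. The sketch below is an assumption-laden illustration, not the paper's procedure: the two-factor data generator, the number of simulations and the 95th-percentile threshold (one common variant of Horn's parallel analysis) are choices made here, and the function names are hypothetical. The scree plot is omitted because it is a visual judgement rather than a computation.

```python
import numpy as np

def correlation_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, largest first."""
    corr = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def kaiser_rule(eigvals):
    """Retain components whose eigenvalue exceeds 1."""
    return int(np.sum(eigvals > 1.0))

def parallel_analysis(X, n_sims=200, quantile=0.95, seed=0):
    """Retain leading components whose eigenvalues beat those of random normal data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = correlation_eigenvalues(X)
    simulated = np.array([correlation_eigenvalues(rng.standard_normal((n, p)))
                          for _ in range(n_sims)])
    threshold = np.quantile(simulated, quantile, axis=0)
    exceeds = observed > threshold
    # Count leading components only: stop at the first one that fails.
    return p if exceeds.all() else int(np.argmin(exceeds))

# Artificial data with two underlying factors, four indicators each.
rng = np.random.default_rng(42)
factors = rng.standard_normal((300, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
X = factors @ loadings + 0.6 * rng.standard_normal((300, 8))

n_kaiser = kaiser_rule(correlation_eigenvalues(X))
n_parallel = parallel_analysis(X)
```

Because the generating structure is known, both rules can be checked against the true answer of two components, which is the pedagogical point of using artificial data.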
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Development of Face Recognition Software Using Principal Components Analysis
Directory of Open Access Journals (Sweden)
Kartika Gunadi
2001-01-01
Face recognition is an important research area, and many applications now implement it. Through the development of techniques such as Principal Components Analysis (PCA), computers can now outperform humans in many face recognition tasks, particularly those in which a large database of faces must be searched. Principal Components Analysis was used to reduce the facial image dimension to fewer variables, which are easier to observe and handle. Those variables were then fed into artificial neural networks trained with the backpropagation method to recognise the given facial image. The test results show that PCA can provide high face recognition accuracy. For the training faces, a correct identification of 100% could be obtained. Among the network combinations that were tested, the best average correct identification was 91.11% for the test faces, while the worst average result was 46.67%.
Principal Component Analysis - A Powerful Tool in Computing Marketing Information
Directory of Open Access Journals (Sweden)
Constantin C.
2014-12-01
This paper presents instrumental research on a powerful multivariate data analysis method that researchers can use to obtain valuable information for decision makers who need to solve the marketing problems a company faces. The literature stresses the need to avoid the multicollinearity phenomenon in multivariate analysis and the ability of Principal Component Analysis (PCA) to reduce a number of variables that could be correlated with each other to a small number of principal components that are uncorrelated. In this respect, the paper presents step by step the process of applying PCA in marketing research when we use a large number of variables that are naturally collinear.
Aeromagnetic Compensation Algorithm Based on Principal Component Analysis
Directory of Open Access Journals (Sweden)
Peilin Wu
2018-01-01
Aeromagnetic exploration is an important exploration method in geophysics. The data are typically measured by an optically pumped magnetometer mounted on an aircraft. However, any aircraft produces significant levels of magnetic interference, so aeromagnetic compensation is important in aeromagnetic exploration. Multicollinearity of the aeromagnetic compensation model degrades the performance of the compensation. To address this issue, a novel aeromagnetic compensation method based on principal component analysis is proposed. Using the algorithm, the correlation in the feature matrix is eliminated and the principal components are used to construct the hyperplane to compensate the platform-generated magnetic fields. The algorithm was tested using a helicopter, and the obtained improvement ratio is 9.86. The compensation quality is almost the same as or slightly better than that of ridge regression. The validity of the proposed method was experimentally demonstrated.
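The general way PCA sidesteps multicollinearity in a linear model is principal component regression: fit the response on a few uncorrelated component scores instead of on the collinear features. The sketch below shows that generic technique on synthetic data. It is not the paper's compensation model; the feature matrix A, the response y, the number of retained components and the function name pcr_fit are illustrative assumptions.

```python
import numpy as np

def pcr_fit(A, y, n_components):
    """Principal component regression: regress y on the leading PC scores of A."""
    mean_A, mean_y = A.mean(axis=0), y.mean()
    Ac = A - mean_A
    _, s, vt = np.linalg.svd(Ac, full_matrices=False)
    V = vt[:n_components].T                  # leading principal directions
    scores = Ac @ V                          # uncorrelated regressors
    # scores.T @ scores is diag(s**2), so least squares reduces to a division.
    gamma = (scores.T @ (y - mean_y)) / s[:n_components] ** 2
    coef = V @ gamma                         # coefficients in the original feature space
    intercept = mean_y - mean_A @ coef
    return coef, intercept

rng = np.random.default_rng(3)
base = rng.standard_normal(400)
# Columns 0 and 1 are almost identical, so the feature matrix is nearly collinear.
A = np.column_stack([base,
                     base + 1e-3 * rng.standard_normal(400),
                     rng.standard_normal(400)])
y = A[:, 0] + A[:, 1] + 0.5 * A[:, 2] + 0.01 * rng.standard_normal(400)

coef, intercept = pcr_fit(A, y, n_components=2)
prediction = A @ coef + intercept
```

Dropping the near-zero-variance direction keeps the coefficients of the two collinear columns close to each other instead of letting them blow up in opposite directions, which is the same goal ridge regression pursues by shrinkage.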
Fast principal component analysis for stacking seismic data
Wu, Juan; Bai, Min
2018-04-01
Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data without being sensitive to noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data set are used to demonstrate the performance of the presented method.
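One simple reading of "PCA stacking" is to replace the plain average over traces by an average over the best rank-one approximation of the gather. The sketch below takes that reading and should be treated as an assumption: it uses an uncentred SVD rather than the authors' fast algorithm, and the synthetic wavelet, noise level and function name pca_stack are hypothetical.

```python
import numpy as np

def pca_stack(gather):
    """Stack the traces (rows of `gather`) through the leading singular component."""
    u, s, vt = np.linalg.svd(gather, full_matrices=False)
    rank_one = s[0] * np.outer(u[:, 0], vt[0])   # best rank-1 approximation
    return rank_one.mean(axis=0)                 # average of the denoised traces

rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 300)
signal = np.exp(-((t - 0.5) / 0.05) ** 2) * np.cos(2 * np.pi * 15 * (t - 0.5))
gather = signal + 0.5 * rng.standard_normal((100, 300))   # 100 noisy copies of one event

mean_stack = gather.mean(axis=0)   # conventional average-based stack
pc_stack = pca_stack(gather)       # stack through the first principal component
```

In this idealised gather, with a perfectly aligned event of constant amplitude, the two stacks should closely resemble each other; the interest of the PCA variant lies in less benign cases, which this toy example does not attempt to reproduce.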
Research on Air Quality Evaluation based on Principal Component Analysis
Wang, Xing; Wang, Zilin; Guo, Min; Chen, Wei; Zhang, Huan
2018-01-01
Economic growth has led to a decline in environmental capacity and the deterioration of air quality. Air quality evaluation, as a foundation of environmental monitoring and air pollution control, has become increasingly important. Based on principal component analysis (PCA), this paper evaluates the air quality of a large city in the Beijing-Tianjin-Hebei area over the most recent 10 years and identifies influencing factors, in order to provide a reference for air quality management and air pollution control.
Extraction of Independent Structural Images for Principal Component Thermography
Directory of Open Access Journals (Sweden)
Dmitry Gavrilov
2018-03-01
Thermography is a powerful tool for non-destructive testing of a wide range of materials. Thermography has a number of approaches differing in both experiment setup and the way the collected data are processed. Among such approaches is the Principal Component Thermography (PCT) method, which is based on the statistical processing of raw thermal images collected by a thermal camera. The processed images (principal components or empirical orthogonal functions) form an orthonormal basis, and often look like a superposition of all possible structural features found in the object under inspection, i.e., surface heating non-uniformity, internal defects and material structure. At the same time, from a practical point of view it is desirable to have images representing independent structural features. The work presented in this paper proposes an approach for separation of independent image patterns (archetypes) from a set of principal component images. The approach is demonstrated in the application of inspection of composite materials as well as the non-invasive analysis of works of art.
Selecting the Number of Principal Components in Functional Data
Li, Yehua
2013-12-01
Functional principal component analysis (FPCA) has become the most widely used dimension reduction tool for functional data analysis. We consider functional data measured at random, subject-specific time points, contaminated with measurement error, allowing for both sparse and dense functional data, and propose novel information criteria to select the number of principal components in such data. We propose a Bayesian information criterion based on marginal modeling that can consistently select the number of principal components for both sparse and dense functional data. For dense functional data, we also develop an Akaike information criterion based on the expected Kullback-Leibler information under a Gaussian assumption. In connecting with the time series literature, we also consider a class of information criteria proposed for factor analysis of multivariate time series and show that they are still consistent for dense functional data, if a prescribed undersmoothing scheme is undertaken in the FPCA algorithm. We perform intensive simulation studies and show that the proposed information criteria vastly outperform existing methods for this type of data. Surprisingly, our empirical evidence shows that our information criteria proposed for dense functional data also perform well for sparse functional data. An empirical example using colon carcinogenesis data is also provided to illustrate the results. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Principal modes of rupture encountered in expertise of advanced components
International Nuclear Information System (INIS)
Tavassoli, A.A.; Bougault, A.
1986-10-01
Failure of many metallic components investigated can be classified into two categories: intergranular or transgranular according to their principal mode of rupture. Intergranular ruptures are often provoked by segregation of impurities at the grain boundaries. Three examples are cited where this phenomenon occurred, one of them is a steel (A 508 cl 3) used for PWR vessel. Intergranular failures are in general induced by fatigue in the advanced components operating under thermal or load transients. One example concerning a sodium mixer which was subjected to thermal loadings is presented. Examples of stress corrosion and intergranular sensitization failures are cited. These examples show the importance of fractography for the determination of rupture causes [fr]
Functional Principal Components Analysis of Shanghai Stock Exchange 50 Index
Directory of Open Access Journals (Sweden)
Zhiliang Wang
2014-01-01
The main purpose of this paper is to explore the principal components of the Shanghai stock exchange 50 index by means of functional principal component analysis (FPCA). Functional data analysis (FDA) deals with random variables (or processes) with realizations in a smooth function space. One of the most popular FDA techniques is functional principal component analysis, which was introduced for the statistical analysis of a set of financial time series from an explorative point of view. FPCA is the functional analogue of the well-known dimension reduction technique in multivariate statistical analysis, searching for linear transformations of the random vector with maximal variance. In this paper, we studied the monthly return volatility of the Shanghai stock exchange 50 index (SSE50). Using FPCA to reduce dimension to a finite level, we extracted the most significant components of the data and some relevant statistical features of such related datasets. The calculated results show that regarding the samples as random functions is rational. Compared with ordinary principal component analysis, FPCA can solve the problem of different dimensions in the samples. FPCA is also a convenient approach to extract the main variance factors.
Directory of Open Access Journals (Sweden)
Peterson Mark D
2012-11-01
Background: The purpose of this study was to determine the sex-specific pattern of pediatric cardiometabolic risk with principal component analysis, using several biological, behavioral and parental variables in a large cohort (n = 2866) of 6th grade students. Methods: Cardiometabolic risk components included waist circumference, fasting glucose, blood pressure, plasma triglycerides levels and HDL-cholesterol. Principal components analysis was used to determine the pattern of risk clustering and to derive a continuous aggregate score (MetScore). Stratified risk components and MetScore were analyzed for association with age, body mass index (BMI), cardiorespiratory fitness (CRF), physical activity (PA), and parental factors. Results: In both boys and girls, BMI and CRF were associated with multiple risk components, and overall MetScore. Maternal smoking was associated with multiple risk components in girls and boys, as well as MetScore in boys, even after controlling for children’s BMI. Paternal family history of early cardiovascular disease (CVD) and parental age were associated with increased blood pressure and MetScore for girls. Children’s PA levels, maternal history of early CVD, and paternal BMI were also indicative for various risk components, but not MetScore. Conclusions: Several biological and behavioral factors were independently associated with children’s cardiometabolic disease risk, and thus represent a unique gender-specific risk profile. These data serve to bolster the independent contribution of CRF, PA, and family-oriented healthy lifestyles for improving children’s health.
The Use of Principal Components in Long-Range Forecasting
Chern, Jonq-Gong
Large-scale modes of the global sea surface temperatures (SST) and the Northern Hemisphere tropospheric circulation are described by principal component analysis. The first and the second SST components describe the El Nino episodes well, and the El Nino index (ENI) suggested in this study is consistent with the winter Southern Oscillation index (SOI), where this ENI is a composite component of the weighted first and second SST components. The large-scale interactive modes of the coupled ocean-atmosphere system are identified by cross-correlation analysis. The result shows that the first SST component is strongly correlated with the first component of geopotential height at a lead time of 6 months. In the El Nino-Southern Oscillation (ENSO) evolution, the El Nino mode strongly influences the winter tropospheric circulation in the mid-latitudes for up to three leading seasons. The regional long-range variation of climate is investigated with these major components of the SST and the tropospheric circulation. In the mid-latitudes, the climate of the central United States shows a weak linkage with these large-scale circulations, and the climate of the western United States appears to be consistently associated with the ENSO modes. These El Nino modes also show a dominant influence on Eastern Asia, as evidenced in Taiwan Mei-Yu patterns. Possible regional long-range forecasting schemes, utilizing the complementary characteristics of the winter El Nino mode and SST anomalies, are examined with the Taiwan Mei-Yu.
Principal and secondary luminescence lifetime components in annealed natural quartz
International Nuclear Information System (INIS)
Chithambo, M.L.; Ogundare, F.O.; Feathers, J.
2008-01-01
Time-resolved luminescence spectra from quartz can be separated into components with distinct principal and secondary lifetimes depending on certain combinations of annealing and measurement temperature. The influence of annealing on properties of the lifetimes related to irradiation dose and temperature of measurement has been investigated in sedimentary quartz annealed at various temperatures up to 900 deg. C. Time-resolved luminescence for use in the analysis was pulse stimulated from samples at 470 nm between 20 and 200 deg. C. Luminescence lifetimes decrease with measurement temperature due to increasing thermal effect on the associated luminescence with an activation energy of thermal quenching equal to 0.68±0.01 eV for the secondary lifetime but only qualitatively so for the principal lifetime component. Concerning the influence of annealing temperature, luminescence lifetimes measured at 20 deg. C are constant at about 33 μs for annealing temperatures up to 600 deg. C but decrease to about 29 μs when the annealing temperature is increased to 900 deg. C. In addition, it was found that lifetime components in samples annealed at 800 deg. C are independent of radiation dose in the range 85-1340 Gy investigated. The dependence of lifetimes on both the annealing temperature and magnitude of radiation dose is described as being due to the increasing importance of a particular recombination centre in the luminescence emission process as a result of dynamic hole transfer between non-radiative and radiative luminescence centres
Source Signals Separation and Reconstruction Following Principal Component Analysis
Directory of Open Access Journals (Sweden)
WANG Cheng
2014-02-01
For the problem of separating and reconstructing source signals from observed signals, the physical significance of the blind source separation model and of independent component analysis is not very clear, and the solution is not unique. To address these disadvantages, a new linear, instantaneous mixing model and a novel method for separating and reconstructing source signals from observed signals, based on principal component analysis (PCA), are put forward. The new model assumes that the source signals are statistically uncorrelated rather than independent, which differs from the traditional blind source separation model. A one-to-one relationship between the linear instantaneous mixing matrix of the new model and the linear compound matrix of PCA, and a one-to-one relationship between the uncorrelated source signals and the principal components, are demonstrated using the concept of a linear separation matrix and the uncorrelatedness of the source signals. Based on this theoretical link, the source signal separation and reconstruction problem is converted into PCA of the observed signals. The theoretical derivation and numerical simulation results show that, despite Gaussian measurement noise, both the waveform and amplitude information of the uncorrelated source signals can be separated and reconstructed by PCA when the linear mixing matrix is column orthogonal and normalized; only the waveform information can be separated and reconstructed by PCA when the linear mixing matrix is column orthogonal but not normalized; and the uncorrelated source signals cannot be separated and reconstructed by PCA when the mixing matrix is not column orthogonal or not linear.
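The orthonormal-mixing case described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `pca_separate`, the synthetic sources, their variances, and the random orthonormal mixing matrix are all assumptions chosen for illustration, and measurement noise is omitted.

```python
import numpy as np

def pca_separate(X):
    """PCA of observed signals X (channels x samples).

    Returns the principal components (rows, decreasing variance)
    and the matrix of eigenvectors (columns).
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / (Xc.shape[1] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending variance
    W = eigvecs[:, order]
    return W.T @ Xc, W

rng = np.random.default_rng(0)
n_samples = 5000

# Hypothetical uncorrelated sources with clearly different variances.
scales = np.array([3.0, 1.5, 0.5])
S = scales[:, None] * rng.standard_normal((3, n_samples))

# Hypothetical mixing matrix with orthonormal columns (via QR).
A, _ = np.linalg.qr(rng.standard_normal((3, 3)))
X = A @ S                                       # observed mixtures

Y, W = pca_separate(X)                          # Y[i] should match S[i] up to sign
```

Under these assumptions each principal component matches one source up to sign, in both waveform and amplitude. If the columns of `A` were orthogonal but not unit length, only the waveform would be expected to survive, as the abstract states.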
Quality Aware Compression of Electrocardiogram Using Principal Component Analysis.
Gupta, Rajarshi
2016-05-01
Electrocardiogram (ECG) compression finds wide application in various patient monitoring purposes. Quality control in ECG compression ensures reconstruction quality and its clinical acceptance for diagnostic decision making. In this paper, a quality aware compression method of single lead ECG is described using principal component analysis (PCA). After pre-processing, beat extraction and PCA decomposition, two independent quality criteria, namely, bit rate control (BRC) or error control (EC) criteria were set to select optimal principal components, eigenvectors and their quantization level to achieve desired bit rate or error measure. The selected principal components and eigenvectors were finally compressed using a modified delta and Huffman encoder. The algorithms were validated with 32 sets of MIT Arrhythmia data and 60 normal and 30 sets of diagnostic ECG data from PTB Diagnostic ECG data ptbdb, all at 1 kHz sampling. For BRC with a CR threshold of 40, an average Compression Ratio (CR), percentage root mean squared difference normalized (PRDN) and maximum absolute error (MAE) of 50.74, 16.22 and 0.243 mV respectively were obtained. For EC with an upper limit of 5 % PRDN and 0.1 mV MAE, the average CR, PRDN and MAE of 9.48, 4.13 and 0.049 mV respectively were obtained. For mitdb data 117, the reconstruction quality could be preserved up to CR of 68.96 by extending the BRC threshold. The proposed method yields better results than recently published works on quality controlled ECG compression.
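The error-control idea (keep only as many principal components as needed to stay under a PRDN limit) can be sketched as follows. This is a simplified illustration, not the published algorithm: quantization and the delta/Huffman encoding stage are left out, the beats are synthetic, PRDN is computed with its commonly used definition, and the names `prdn` and `pca_compress` are hypothetical.

```python
import numpy as np

def prdn(x, x_rec):
    """Percentage RMS difference, normalized by the mean-removed reference."""
    x = np.asarray(x, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    return 100.0 * np.sqrt(np.sum((x - x_rec) ** 2) / np.sum((x - x.mean()) ** 2))

def pca_compress(beats, prdn_limit=5.0):
    """Keep the fewest principal components whose reconstruction meets prdn_limit.

    beats: array of shape (n_beats, n_samples_per_beat).
    """
    mean_beat = beats.mean(axis=0)
    centered = beats - mean_beat
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    for k in range(1, len(s) + 1):
        scores = U[:, :k] * s[:k]               # principal components per beat
        rec = scores @ Vt[:k] + mean_beat       # reconstruction from k components
        err = prdn(beats, rec)
        if err <= prdn_limit:
            break
    return k, scores, Vt[:k], mean_beat, err

# Synthetic "beats": random mixtures of three bump-shaped templates plus noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
templates = np.vstack([np.exp(-((t - c) / w) ** 2)
                       for c, w in [(0.3, 0.02), (0.5, 0.05), (0.7, 0.08)]])
weights = rng.uniform(0.5, 1.5, size=(40, 3))
beats = weights @ templates + 0.001 * rng.standard_normal((40, 200))

k, scores, components, mean_beat, err = pca_compress(beats, prdn_limit=5.0)
```

Only `scores`, `components` and `mean_beat` would need to be stored or transmitted; a bit-rate-controlled variant would instead fix the storage budget and report the resulting error.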
Principal Component Analysis Based Measure of Structural Holes
Deng, Shiguo; Zhang, Wenqing; Yang, Huijie
2013-02-01
Based upon principal component analysis, a new measure called the compressibility coefficient is proposed to evaluate structural holes in networks. This measure incorporates a new effect from identical patterns in networks. It is found that the compressibility coefficient for Watts-Strogatz small-world networks increases monotonically with the rewiring probability and saturates to that for the corresponding shuffled networks. By contrast, the compressibility coefficient for extended Barabasi-Albert scale-free networks decreases monotonically with the preferential effect and is significantly large compared with that for the corresponding shuffled networks. This measure is helpful in diverse research fields to evaluate the global efficiency of networks.
PRINCIPAL COMPONENT ANALYSIS (PCA) DAN APLIKASINYA DENGAN SPSS
Directory of Open Access Journals (Sweden)
Hermita Bus Umar
2009-03-01
PCA (Principal Component Analysis) is a statistical technique applied to a single set of variables when the researcher is interested in discovering which variables in the set form coherent subsets that are relatively independent of one another. Variables that are correlated with one another but largely independent of other subsets of variables are combined into factors. The goal of PCA is to determine the extent to which each variable is explained by each dimension. Steps in PCA include selecting and measuring a set of variables, preparing the correlation matrix, extracting a set of factors from the correlation matrix, rotating the factors to increase interpretability, and interpreting the results.
Efficient training of multilayer perceptrons using principal component analysis
International Nuclear Information System (INIS)
Bunzmann, Christoph; Urbanczik, Robert; Biehl, Michael
2005-01-01
A training algorithm for multilayer perceptrons is discussed and studied in detail, which relates to the technique of principal component analysis. The latter is performed with respect to a correlation matrix computed from the example inputs and their target outputs. Typical properties of the training procedure are investigated by means of a statistical physics analysis in models of learning regression and classification tasks. We demonstrate that the procedure requires by far fewer examples for good generalization than traditional online training. For networks with a large number of hidden units we derive the training prescription which achieves, within our model, the optimal generalization behavior
APPLICATION OF PRINCIPAL COMPONENT ANALYSIS TO RELAXOGRAPHIC IMAGES
International Nuclear Information System (INIS)
STOYANOVA, R.S.; OCHS, M.F.; BROWN, T.R.; ROONEY, W.D.; LI, X.; LEE, J.H.; SPRINGER, C.S.
1999-01-01
Standard analysis methods for processing inversion recovery MR images traditionally have used single pixel techniques. In these techniques each pixel is independently fit to an exponential recovery, and spatial correlations in the data set are ignored. By analyzing the image as a complete dataset, improved error analysis and automatic segmentation can be achieved. Here, the authors apply principal component analysis (PCA) to a series of relaxographic images. This procedure decomposes the 3-dimensional data set into three separate images and corresponding recovery times. They attribute the 3 images to be spatial representations of gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) content
Nonlinear Principal Component Analysis Using Strong Tracking Filter
Institute of Scientific and Technical Information of China (English)
(none)
2007-01-01
The paper analyzes the problem of blind source separation (BSS) based on the nonlinear principal component analysis (NPCA) criterion. An adaptive strong tracking filter (STF) based algorithm was developed, which is immune to system model mismatches. Simulations demonstrate that the algorithm converges quickly and has satisfactory steady-state accuracy. The Kalman filtering algorithm and the recursive least-squares type algorithm are shown to be special cases of the STF algorithm. Since the forgetting factor is adaptively updated by adjustment of the Kalman gain, the STF scheme provides more powerful tracking capability than the Kalman filtering algorithm and recursive least-squares algorithm.
Application of principal component and factor analyses in electron spectroscopy
International Nuclear Information System (INIS)
Siuda, R.; Balcerowska, G.
1998-01-01
Fundamentals of two methods, taken from multivariate analysis and known as principal component analysis (PCA) and factor analysis (FA), are presented. Both methods are well known in chemometrics. Since 1979, when application of the methods to electron spectroscopy was first reported, they have become increasingly popular in different branches of electron spectroscopy. The paper presents examples of standard applications of the methods to Auger electron spectroscopy (AES), X-ray photoelectron spectroscopy (XPS), and electron energy loss spectroscopy (EELS). The advantages of applying the methods, their potential, and their limitations are pointed out. (author)
Principal Component Analysis of Working Memory Variables during Child and Adolescent Development.
Barriga-Paulino, Catarina I; Rodríguez-Martínez, Elena I; Rojas-Benjumea, María Ángeles; Gómez, Carlos M
2016-10-03
Correlation and Principal Component Analysis (PCA) of behavioral measures from two experimental tasks (Delayed Match-to-Sample and Oddball), and standard scores from a neuropsychological test battery (Working Memory Test Battery for Children) was performed on data from participants between 6-18 years old. The correlation analysis (p 1), the scores of the first extracted component were significantly correlated (p < .05) to most behavioral measures, suggesting some commonalities of the processes of age-related changes in the measured variables. The results suggest that this first component would be related to age but also to individual differences during the cognitive maturation process across childhood and adolescence stages. The fourth component would represent the speed-accuracy trade-off phenomenon as it presents loading components with different signs for reaction times and errors.
Manisera, M.; Kooij, A.J. van der; Dusseldorp, E.
2010-01-01
The component structure of 14 Likert-type items measuring different aspects of job satisfaction was investigated using nonlinear Principal Components Analysis (NLPCA). NLPCA allows for analyzing these items at an ordinal or interval level. The participants were 2066 workers from five types of social
ANOVA-principal component analysis and ANOVA-simultaneous component analysis: a comparison.
Zwanenburg, G.; Hoefsloot, H.C.J.; Westerhuis, J.A.; Jansen, J.J.; Smilde, A.K.
2011-01-01
ANOVA-simultaneous component analysis (ASCA) is a recently developed tool to analyze multivariate data. In this paper, we enhance the explorative capability of ASCA by introducing a projection of the observations on the principal component subspace to visualize the variation among the measurements.
Demixed principal component analysis of neural population data.
Kobak, Dmitry; Brendel, Wieland; Constantinidis, Christos; Feierstein, Claudia E; Kepecs, Adam; Mainen, Zachary F; Qi, Xue-Lian; Romo, Ranulfo; Uchida, Naoshige; Machens, Christian K
2016-04-12
Neurons in higher cortical areas, such as the prefrontal cortex, are often tuned to a variety of sensory and motor variables, and are therefore said to display mixed selectivity. This complexity of single neuron responses can obscure what information these areas represent and how it is represented. Here we demonstrate the advantages of a new dimensionality reduction technique, demixed principal component analysis (dPCA), that decomposes population activity into a few components. In addition to systematically capturing the majority of the variance of the data, dPCA also exposes the dependence of the neural representation on task parameters such as stimuli, decisions, or rewards. To illustrate our method we reanalyze population data from four datasets comprising different species, different cortical areas and different experimental tasks. In each case, dPCA provides a concise way of visualizing the data that summarizes the task-dependent features of the population response in a single figure.
Water reuse systems: A review of the principal components
Lucchetti, G.; Gray, G.A.
1988-01-01
Principal components of water reuse systems include ammonia removal, disease control, temperature control, aeration, and particulate filtration. Effective ammonia removal techniques include air stripping, ion exchange, and biofiltration. Selection of a particular technique largely depends on site-specific requirements (e.g., space, existing water quality, and fish densities). Disease control, although often overlooked, is a major problem in reuse systems. Pathogens can be controlled most effectively with ultraviolet radiation, ozone, or chlorine. Simple and inexpensive methods are available to increase oxygen concentration and eliminate gas supersaturation; these include commercial aerators, air injectors, and packed columns. Temperature control is a major advantage of reuse systems, but the equipment required can be expensive, particularly if water temperature must be rigidly controlled and ambient air temperature fluctuates. Filtration can be readily accomplished with a hydrocyclone or sand filter that increases overall system efficiency. Based on criteria of adaptability, efficiency, and reasonable cost, we recommend components for a small water reuse system.
Fast grasping of unknown objects using principal component analysis
Lei, Qujiang; Chen, Guangming; Wisse, Martijn
2017-09-01
Fast grasping of unknown objects has a crucial impact on the efficiency of robot manipulation, especially in unfamiliar environments. In order to accelerate the grasping of unknown objects, principal component analysis is utilized to direct the grasping process. In particular, a single-view partial point cloud is constructed and grasp candidates are allocated along the principal axis. Force balance optimization is employed to analyze possible graspable areas. The graspable area with the minimal resultant force is the best zone for the final grasping execution. It is shown that an unknown object can be grasped more quickly provided that the principal axis is determined from the single-view partial point cloud. To cope with grasp uncertainty, robot motion is used to obtain a new viewpoint. Virtual exploration and experimental tests are carried out to verify this fast grasping algorithm. Both simulation and experimental tests demonstrated excellent performance in grasping a series of unknown objects. To minimize grasping uncertainty, robot hardware with two 3D cameras can be used to supplement the partial point cloud. As a result of utilizing the robot hardware, grasping reliability is highly enhanced. Therefore, this research demonstrates practical significance for increasing grasping speed and thus increasing robot efficiency in unpredictable environments.
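The step of finding a principal axis from a point cloud and spacing grasp candidates along it can be sketched as below. This is only an illustration under stated assumptions: the point cloud is synthetic, the function names `principal_axes` and `grasp_candidates` are hypothetical, and the force balance optimization from the abstract is not modelled.

```python
import numpy as np

def principal_axes(points):
    """Centroid and orthonormal axes (rows, decreasing variance) of an (N, 3) cloud."""
    centroid = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, Vt

def grasp_candidates(points, n_candidates=5):
    """Place evenly spaced candidate positions along the first principal axis."""
    centroid, axes = principal_axes(points)
    main_axis = axes[0]
    proj = (points - centroid) @ main_axis      # coordinate of each point along the axis
    # Interior positions only, so candidates stay away from the object's ends.
    offsets = np.linspace(proj.min(), proj.max(), n_candidates + 2)[1:-1]
    return centroid + offsets[:, None] * main_axis, main_axis

# Synthetic elongated object, rotated 45 degrees about the z axis and shifted.
rng = np.random.default_rng(2)
local = rng.standard_normal((500, 3)) * np.array([5.0, 0.5, 0.2])
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
points = local @ R.T + np.array([1.0, 2.0, 0.5])
expected_dir = R @ np.array([1.0, 0.0, 0.0])    # true long axis of the object

candidates, main_axis = grasp_candidates(points, n_candidates=5)
```

The sign of `main_axis` is arbitrary, so any comparison with a known direction should use the absolute value of the dot product.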
Sparse principal component analysis in medical shape modeling
Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus
2006-03-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
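The baseline mentioned in this abstract, sparse loadings obtained by simple thresholding of small PCA loadings, can be sketched as follows, together with an ordering of the resulting modes by decreasing variance. The sketch does not reproduce the SPCA algorithm studied in the article; the threshold value, the synthetic data, and the function name `thresholded_loadings` are assumptions.

```python
import numpy as np

def thresholded_loadings(X, n_components=2, threshold=0.2):
    """Sparse loadings by zeroing small PCA loadings, ordered by decreasing variance.

    X: data matrix of shape (n_observations, n_variables).
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].copy()
    loadings[np.abs(loadings) < threshold] = 0.0        # simple thresholding
    norms = np.linalg.norm(loadings, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0                            # avoid dividing by zero
    loadings /= norms                                    # renormalize to unit length
    variances = np.var(Xc @ loadings.T, axis=0, ddof=1)  # variance along each sparse mode
    order = np.argsort(variances)[::-1]
    return loadings[order], variances[order]

# Synthetic data: variables 0-2 follow one factor, variables 3-5 another.
rng = np.random.default_rng(3)
f1 = 3.0 * rng.standard_normal(1000)
f2 = 1.0 * rng.standard_normal(1000)
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + 0.1 * rng.standard_normal((1000, 6))

loadings, variances = thresholded_loadings(X, n_components=2, threshold=0.2)
```

After thresholding, the modes are no longer guaranteed to be orthogonal or uncorrelated, which is one reason an explicit ordering criterion is needed.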
Characteristic gene selection via weighting principal components by singular values.
Directory of Open Access Journals (Sweden)
Jin-Xing Liu
Conventional gene selection methods based on principal component analysis (PCA) use only the first principal component (PC) of PCA or sparse PCA to select characteristic genes. These methods indeed assume that the first PC plays a dominant role in gene selection. However, in a number of cases this assumption is not satisfied, so the conventional PCA-based methods usually provide poor selection results. In order to improve the performance of the PCA-based gene selection method, we put forward the gene selection method via weighting PCs by singular values (WPCS). Because different PCs have different importance, the singular values are exploited as the weights to represent the influence on gene selection of different PCs. The ROC curves and AUC statistics on artificial data show that our method outperforms the state-of-the-art methods. Moreover, experimental results on real gene expression data sets show that our method can extract more characteristic genes in response to abiotic stresses than conventional gene selection methods.
Nonlinear Process Fault Diagnosis Based on Serial Principal Component Analysis.
Deng, Xiaogang; Tian, Xuemin; Chen, Sheng; Harris, Chris J
2018-03-01
Many industrial processes contain both linear and nonlinear parts, and kernel principal component analysis (KPCA), widely used in nonlinear process monitoring, may not offer the most effective means for dealing with these nonlinear processes. This paper proposes a new hybrid linear-nonlinear statistical modeling approach for nonlinear process monitoring by closely integrating linear principal component analysis (PCA) and nonlinear KPCA using a serial model structure, which we refer to as serial PCA (SPCA). Specifically, PCA is first applied to extract PCs as linear features, and to decompose the data into the PC subspace and residual subspace (RS). Then, KPCA is performed in the RS to extract the nonlinear PCs as nonlinear features. Two monitoring statistics are constructed for fault detection, based on both the linear and nonlinear features extracted by the proposed SPCA. To effectively perform fault identification after a fault is detected, an SPCA similarity factor method is built for fault recognition, which fuses both the linear and nonlinear features. Unlike PCA and KPCA, the proposed method takes into account both linear and nonlinear PCs simultaneously, and therefore, it can better exploit the underlying process's structure to enhance fault diagnosis performance. Two case studies involving a simulated nonlinear process and the benchmark Tennessee Eastman process demonstrate that the proposed SPCA approach is more effective than the existing state-of-the-art approach based on KPCA alone, in terms of nonlinear process fault detection and identification.
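The serial structure described above (linear PCA first, then kernel PCA on the residual subspace) can be sketched in NumPy. This is a rough illustration, not the authors' implementation: the RBF kernel, its `width` parameter, the synthetic data, and all function names are assumptions, and the monitoring statistics and similarity factor are not included.

```python
import numpy as np

def linear_pca(X, n_pc):
    """Linear scores T, residual data E, and loading matrix P."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pc].T                  # loadings, shape (n_variables, n_pc)
    T = Xc @ P                       # linear features (PC scores)
    E = Xc - T @ P.T                 # part of the data left in the residual subspace
    return T, E, P

def rbf_kernel_pca(E, n_pc, width):
    """Kernel PCA scores of the training samples with an RBF kernel."""
    sq = np.sum(E ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * E @ E.T, 0.0)
    K = np.exp(-d2 / width)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # center the kernel in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_pc]
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0)) # nonlinear features

def serial_pca(X, n_linear, n_nonlinear, width):
    T, E, P = linear_pca(X, n_linear)
    Z = rbf_kernel_pca(E, n_nonlinear, width)
    return T, Z, E, P

# Synthetic process data with linear and nonlinear dependencies on one driver.
rng = np.random.default_rng(4)
x1 = rng.uniform(-1.0, 1.0, 100)
noise = 0.05 * rng.standard_normal((100, 5))
X = np.column_stack([x1, 2.0 * x1, x1 ** 2, np.sin(3.0 * x1), np.zeros(100)]) + noise

T, Z, E, P = serial_pca(X, n_linear=2, n_nonlinear=3, width=5.0)
```

Fault detection would then require statistics computed on `T` and `Z` with control limits estimated from normal operating data, which this sketch leaves out.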
CMB constraints on principal components of the inflaton potential
International Nuclear Information System (INIS)
Dvorkin, Cora; Hu, Wayne
2010-01-01
We place functional constraints on the shape of the inflaton potential from the cosmic microwave background through a variant of the generalized slow-roll approximation that allows large amplitude, rapidly changing deviations from scale-free conditions. Employing a principal component decomposition of the source function $G' \cong 3(V'/V)^2 - 2V''/V$ and keeping only those measured to better than 10% results in 5 nearly independent Gaussian constraints that may be used to test any single-field inflationary model where such deviations are expected. The first component implies <3% variations at the 100 Mpc scale. One component shows a 95% CL preference for deviations around the 300 Mpc scale at the ∼10% level but the global significance is reduced considering the 5 components examined. This deviation also requires a change in the cold dark matter density which in a flat ΛCDM model is disfavored by current supernova and Hubble constant data and can be tested with future polarization or high multipole temperature data. Its impact resembles a local running of the tilt from multipoles 30-800 but is only marginally consistent with a constant running beyond this range. For this analysis, we have implemented a ∼40x faster WMAP7 likelihood method which we have made publicly available.
Principal semantic components of language and the measurement of meaning.
Samsonovich, Alexei V; Ascoli, Giorgio A
2010-06-11
Metric systems for semantics, or semantic cognitive maps, are allocations of words or other representations in a metric space based on their meaning. Existing methods for semantic mapping, such as Latent Semantic Analysis and Latent Dirichlet Allocation, are based on paradigms involving dissimilarity metrics. They typically do not take into account relations of antonymy and yield a large number of domain-specific semantic dimensions. Here, using a novel self-organization approach, we construct a low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy. Emergent semantics of the map principal components are clearly identifiable: the first three correspond to the meanings of "good/bad" (valence), "calm/excited" (arousal), and "open/closed" (freedom), respectively. The semantic map is sufficiently robust to allow the automated extraction of synonyms and antonyms not originally in the dictionaries used to construct the map and to predict connotation from their coordinates. The map geometric characteristics include a limited number (approximately 4) of statistically significant dimensions, a bimodal distribution of the first component, increasing kurtosis of subsequent (unimodal) components, and a U-shaped maximum-spread planar projection. Both the semantic content and the main geometric features of the map are consistent between dictionaries (Microsoft Word and Princeton's WordNet), among Western languages (English, French, German, and Spanish), and with previously established psychometric measures. By defining the semantics of its dimensions, the constructed map provides a foundational metric system for the quantitative analysis of word meaning. Language can be viewed as a cumulative product of human experiences. Therefore, the extracted principal semantic dimensions may be useful to characterize the general semantic dimensions of the content of mental states. This is a fundamental step toward a
Principal semantic components of language and the measurement of meaning.
Directory of Open Access Journals (Sweden)
Alexei V Samsonovich
Metric systems for semantics, or semantic cognitive maps, are allocations of words or other representations in a metric space based on their meaning. Existing methods for semantic mapping, such as Latent Semantic Analysis and Latent Dirichlet Allocation, are based on paradigms involving dissimilarity metrics. They typically do not take into account relations of antonymy and yield a large number of domain-specific semantic dimensions. Here, using a novel self-organization approach, we construct a low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy. Emergent semantics of the map principal components are clearly identifiable: the first three correspond to the meanings of "good/bad" (valence), "calm/excited" (arousal), and "open/closed" (freedom), respectively. The semantic map is sufficiently robust to allow the automated extraction of synonyms and antonyms not originally in the dictionaries used to construct the map and to predict connotation from their coordinates. The map geometric characteristics include a limited number (approximately 4) of statistically significant dimensions, a bimodal distribution of the first component, increasing kurtosis of subsequent (unimodal) components, and a U-shaped maximum-spread planar projection. Both the semantic content and the main geometric features of the map are consistent between dictionaries (Microsoft Word and Princeton's WordNet), among Western languages (English, French, German, and Spanish), and with previously established psychometric measures. By defining the semantics of its dimensions, the constructed map provides a foundational metric system for the quantitative analysis of word meaning. Language can be viewed as a cumulative product of human experiences. Therefore, the extracted principal semantic dimensions may be useful to characterize the general semantic dimensions of the content of mental states. This is a fundamental step
Priority of VHS Development Based in Potential Area using Principal Component Analysis
Meirawan, D.; Ana, A.; Saripudin, S.
2018-02-01
The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on the development of regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis using the principle of secondary data reduction component. The method used is Principal Component Analysis (PCA) analysis with Minitab Statistics Software tool. The results of this study indicate the value of the lowest requirement is a priority of the construction of development VHS with a program of majors in accordance with the development of regional potential. Based on the PCA score found that the main priority in the development of VHS in Bandung is in Saguling, which has the lowest PCA value of 416.92 in area 1, Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.
Salvatore, Stefania; Røislien, Jo; Baz-Lomba, Jose A; Bramness, Jørgen G
2017-03-01
Wastewater-based epidemiology is an alternative method for estimating the collective drug use in a community. We applied functional data analysis, a statistical framework developed for analysing curve data, to investigate weekly temporal patterns in wastewater measurements of three prescription drugs with known abuse potential: methadone, oxazepam and methylphenidate, comparing them to positive and negative control drugs. Sewage samples were collected in February 2014 from a wastewater treatment plant in Oslo, Norway. The weekly pattern of each drug was extracted by fitting of generalized additive models, using trigonometric functions to model the cyclic behaviour. From the weekly component, the main temporal features were then extracted using functional principal component analysis. Results are presented through the functional principal components (FPCs) and corresponding FPC scores. Clinically, the most important weekly feature of the wastewater-based epidemiology data was the second FPC, representing the difference between average midweek level and a peak during the weekend, representing possible recreational use of a drug in the weekend. Estimated scores on this FPC indicated recreational use of methylphenidate, with a high weekend peak, but not for methadone and oxazepam. The functional principal component analysis uncovered clinically important temporal features of the weekly patterns of the use of prescription drugs detected from wastewater analysis. This may be used as a post-marketing surveillance method to monitor prescription drugs with abuse potential. Copyright © 2016 John Wiley & Sons, Ltd.
Recursive Principal Components Analysis Using Eigenvector Matrix Perturbation
Directory of Open Access Journals (Sweden)
Deniz Erdogmus
2004-10-01
Principal components analysis is an important and well-studied subject in statistics and signal processing. The literature has an abundance of algorithms for solving this problem, where most of these algorithms could be grouped into one of the following three approaches: adaptation based on Hebbian updates and deflation, optimization of a second-order statistical criterion (like reconstruction error or output variance), and fixed point update rules with deflation. In this paper, we take a completely different approach that avoids deflation and the optimization of a cost function using gradients. The proposed method updates the eigenvector and eigenvalue matrices simultaneously with every new sample such that the estimates approximately track their true values as would be calculated from the current sample estimate of the data covariance matrix. The performance of this algorithm is compared with that of traditional methods like Sanger's rule and APEX, as well as a structurally similar matrix perturbation-based method.
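To make concrete what "the current sample estimate of the data covariance matrix" means in a sample-by-sample setting, the following is a minimal Python sketch of the brute-force reference: it updates the running mean and covariance with each new sample and recomputes a full eigendecomposition on request. It is not the perturbation-based update proposed in the paper; the class name RunningPCA, its methods, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

class RunningPCA:
    """Hypothetical helper: running covariance plus a full eigendecomposition.

    This is the brute-force baseline that a recursive PCA is meant to track,
    not the eigenvector-perturbation update described in the abstract.
    """

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros((dim, dim))  # running sum of outer products of deviations

    def update(self, x):
        # Welford-style rank-one update of the mean and the scatter matrix.
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta_old = x - self.mean
        self.mean = self.mean + delta_old / self.n
        delta_new = x - self.mean
        self.m2 += np.outer(delta_old, delta_new)

    def covariance(self):
        # Population (divide-by-n) covariance of the samples seen so far.
        return self.m2 / self.n

    def components(self):
        # Eigenvalues in descending order with matching eigenvector columns.
        eigvals, eigvecs = np.linalg.eigh(self.covariance())
        order = np.argsort(eigvals)[::-1]
        return eigvals[order], eigvecs[:, order]

# Synthetic data (an assumption for the demo): three axes with different spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.3])

model = RunningPCA(dim=3)
for sample in X:
    model.update(sample)

eigvals, eigvecs = model.components()
```

Recomputing the eigendecomposition for every sample costs on the order of the cube of the dimension, which is the expense that recursive schemes such as the one in the abstract try to avoid.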
Preliminary study of soil permeability properties using principal component analysis
Yulianti, M.; Sudriani, Y.; Rustini, H. A.
2018-02-01
Soil permeability measurement is undoubtedly important in carrying out soil-water research such as rainfall-runoff modelling, irrigation water distribution systems, etc. It is also known that acquiring reliable soil permeability data is rather laborious, time-consuming, and costly. Therefore, it is desirable to develop a prediction model. Many researchers have derived empirical equations for predicting permeability. These studies derived their models from areas whose soil characteristics differ from those of Indonesian soil, which suggests that these permeability models may be site-specific. The purpose of this study is to identify which soil parameters correspond strongly to soil permeability and to propose a preliminary model for permeability prediction. Principal component analysis (PCA) was applied to 16 parameters analysed from 91 samples taken at 37 sites in the Batanghari Watershed. The findings indicate five variables that correlate strongly with soil permeability, and we recommend a preliminary permeability model that has potential for further development.
Development of motion image prediction method using principal component analysis
International Nuclear Information System (INIS)
Chhatkuli, Ritu Bhusal; Demachi, Kazuyuki; Kawai, Masaki; Sakakibara, Hiroshi; Kamiaka, Kazuma
2012-01-01
Respiratory motion can limit the accuracy of the irradiated area during lung cancer radiation therapy. Many methods have been introduced to minimize the irradiation of healthy tissue caused by lung tumor motion. The purpose of this research is to develop an algorithm for the improvement of image guided radiation therapy by the prediction of motion images. We predict the motion images by using principal component analysis (PCA) and the multi-channel singular spectral analysis (MSSA) method. The images/movies were successfully predicted and verified using the developed algorithm. With the proposed prediction method it is possible to forecast the tumor images over the next breathing period. The implementation of this method in real time is believed to be significant for a higher level of tumor tracking, including the detection of sudden abdominal changes during radiation therapy. (author)
Principal Component Analysis for Normal-Distribution-Valued Symbolic Data.
Wang, Huiwen; Chen, Meiling; Shi, Xiaojun; Li, Nan
2016-02-01
This paper puts forward a new approach to principal component analysis (PCA) for normal-distribution-valued symbolic data, which has a vast potential of applications in the economic and management field. We derive a full set of numerical characteristics and variance-covariance structure for such data, which forms the foundation for our analytical PCA approach. Unlike the prevailing representative-type approach in the literature, which only uses centers, vertices, etc., our approach is able to use all of the variance information in the original data. The paper also provides an accurate approach to constructing the observations in a PC space based on the linear additivity property of normal distribution. The effectiveness of the proposed method is illustrated by simulated numerical experiments. Finally, our method is applied to explain the puzzle of risk-return tradeoff in China's stock market.
Iris recognition based on robust principal component analysis
Karn, Pradeep; He, Xiao Hai; Yang, Shuai; Wu, Xiao Hong
2014-11-01
Iris images acquired under different conditions often suffer from blur, occlusion due to eyelids and eyelashes, specular reflection, and other artifacts. Existing iris recognition systems do not perform well on these types of images. To overcome these problems, we propose an iris recognition method based on robust principal component analysis. The proposed method decomposes all training images into a low-rank matrix and a sparse error matrix, where the low-rank matrix is used for feature extraction. The sparsity concentration index approach is then applied to validate the recognition result. Experimental results using CASIA V4 and IIT Delhi V1 iris image databases showed that the proposed method achieved competitive performances in both recognition accuracy and computational efficiency.
A novel principal component analysis for spatially misaligned multivariate air pollution data.
Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A
2017-01-01
We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.
International Nuclear Information System (INIS)
Waheed, S.; Rahman, S.; Siddique, N.
2013-01-01
Different types of Ca supplements are available in the local markets of Pakistan. It is sometimes difficult to classify these with respect to their composition. In the present work principal component analysis (PCA) technique was applied to classify different Ca supplements on the basis of their elemental data obtained using instrumental neutron activation analysis (INAA) and atomic absorption spectrometry (AAS) techniques. The graphical representation of principal component analysis (PCA) scores utilizing intricate analytical data successfully generated four different types of Ca supplements with compatible samples grouped together. These included Ca supplements with CaCO₃ as Ca source along with vitamin C, supplements with CaCO₃ as Ca source along with vitamin D, supplements with Ca from bone meal, and supplements with chelated calcium. (author)
PRINCIPAL COMPONENT ANALYSIS STUDIES OF TURBULENCE IN OPTICALLY THICK GAS
Energy Technology Data Exchange (ETDEWEB)
Correia, C.; Medeiros, J. R. De [Departamento de Física Teórica e Experimental, Universidade Federal do Rio Grande do Norte, 59072-970, Natal (Brazil); Lazarian, A. [Astronomy Department, University of Wisconsin, Madison, 475 N. Charter St., WI 53711 (United States); Burkhart, B. [Harvard-Smithsonian Center for Astrophysics, 60 Garden St, MS-20, Cambridge, MA 02138 (United States); Pogosyan, D., E-mail: caioftc@dfte.ufrn.br [Canadian Institute for Theoretical Astrophysics, University of Toronto, Toronto, ON (Canada)
2016-02-20
In this work we investigate the sensitivity of principal component analysis (PCA) to the velocity power spectrum in high-opacity regimes of the interstellar medium (ISM). For our analysis we use synthetic position–position–velocity (PPV) cubes of fractional Brownian motion and magnetohydrodynamics (MHD) simulations, post-processed to include radiative transfer effects from CO. We find that PCA analysis is very different from the tools based on the traditional power spectrum of PPV data cubes. Our major finding is that PCA is also sensitive to the phase information of PPV cubes and this allows PCA to detect the changes of the underlying velocity and density spectra at high opacities, where the spectral analysis of the maps provides the universal −3 spectrum in accordance with the predictions of the Lazarian and Pogosyan theory. This makes PCA a potentially valuable tool for studies of turbulence at high opacities, provided that proper gauging of the PCA index is made. However, we found the latter to not be easy, as the PCA results change in an irregular way for data with high sonic Mach numbers. This is in contrast to synthetic Brownian noise data used for velocity and density fields that show monotonic PCA behavior. We attribute this difference to the PCA's sensitivity to Fourier phase information.
Integrative sparse principal component analysis of gene expression data.
Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge
2017-12-01
In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.
A Principal Component Analysis of 39 Scientific Impact Measures
Bollen, Johan; Van de Sompel, Herbert
2009-01-01
Background: The impact of scientific publications has traditionally been expressed in terms of citation counts. However, scientific activity has moved online over the past decade. To better capture scientific impact in the digital era, a variety of new impact measures has been proposed on the basis of social network analysis and usage log data. Here we investigate how these new measures relate to each other, and how accurately and completely they express scientific impact. Methodology: We performed a principal component analysis of the rankings produced by 39 existing and proposed measures of scholarly impact that were calculated on the basis of both citation and usage log data. Conclusions: Our results indicate that the notion of scientific impact is a multi-dimensional construct that cannot be adequately measured by any single indicator, although some measures are more suitable than others. The commonly used citation Impact Factor is not positioned at the core of this construct, but at its periphery, and should thus be used with caution. PMID:19562078
Principal component analysis of 1/f^α noise
International Nuclear Information System (INIS)
Gao, J.B.; Cao Yinhe; Lee, J.-M.
2003-01-01
Principal component analysis (PCA) is a popular data analysis method. One of the motivations for using PCA in practice is to reduce the dimension of the original data by projecting the raw data onto a few dominant eigenvectors with large variance (energy). Due to the ubiquity of 1/f^α noise in science and engineering, in this Letter we use PCA to study the prototypical stochastic model for 1/f^α processes, the fractional Brownian motion (fBm) processes, and find that the eigenvalues from PCA of fBm processes follow a power law, with the exponent being the key parameter defining the fBm processes. We also study random-walk-type processes constructed from DNA sequences, and find that the eigenvalue spectrum from PCA of those random-walk processes also follows power-law relations, with the exponent characterizing the correlation structures of the DNA sequence. In fact, it is observed that PCA can automatically remove linear trends induced by patchiness in the DNA sequence; hence, PCA has a capability similar to that of detrended fluctuation analysis. Implications of the power-law distributed eigenvalue spectrum are discussed.
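The power-law decay of the eigenvalue spectrum can be checked numerically with a short Python sketch. As a simplifying assumption, it diagonalizes the exact fBm covariance matrix sampled at integer times rather than running PCA on simulated or embedded trajectories, so it is not the Letter's procedure; the function names fbm_covariance and eigenvalue_slope and the fitting window are hypothetical.

```python
import numpy as np

def fbm_covariance(n, hurst):
    """Exact fBm covariance at times 1..n:
    0.5 * (s^(2H) + t^(2H) - |s - t|^(2H))."""
    t = np.arange(1, n + 1, dtype=float)
    s, u = np.meshgrid(t, t, indexing="ij")
    two_h = 2.0 * hurst
    return 0.5 * (s ** two_h + u ** two_h - np.abs(s - u) ** two_h)

def eigenvalue_slope(n, hurst, k_min=5, k_max=50):
    """Least-squares slope of log(eigenvalue) against log(rank k)
    over the window k_min..k_max (an arbitrary choice for this demo)."""
    eigvals = np.linalg.eigvalsh(fbm_covariance(n, hurst))[::-1]  # descending
    k = np.arange(k_min, k_max + 1)
    slope, _intercept = np.polyfit(np.log(k), np.log(eigvals[k - 1]), 1)
    return slope

# Ordinary Brownian motion corresponds to H = 0.5 (covariance min(s, t)).
slope_bm = eigenvalue_slope(n=500, hurst=0.5)
```

For ordinary Brownian motion the fitted slope comes out close to -2, so the leading eigenvalues fall off roughly as the inverse square of their rank. The first few eigenvalues are left out of the fit because they deviate most from the asymptotic power law.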
Sensor Failure Detection of FASSIP System using Principal Component Analysis
Sudarno; Juarsa, Mulya; Santosa, Kussigit; Deswandri; Sunaryo, Geni Rina
2018-02-01
In the Fukushima Daiichi nuclear reactor accident in Japan, the damage to the core and pressure vessel was caused by the failure of the active cooling system (the diesel generator was inundated by the tsunami). Research on passive cooling systems for nuclear power plants is therefore performed to improve the safety of nuclear reactors. The FASSIP system (Passive System Simulation Facility) is an installation used to study the characteristics of passive cooling systems at nuclear power plants. The accuracy of sensor measurements in the FASSIP system is essential, because they are the basis for determining the characteristics of a passive cooling system. In this research, a sensor failure detection method for the FASSIP system is developed so that indications of sensor failure can be detected early. The method uses Principal Component Analysis (PCA) to reduce the dimension of the sensor data, with the Squared Prediction Error (SPE) and Hotelling statistic criteria for detecting indications of sensor failure. The results show that the PCA method is capable of detecting the occurrence of a failure at any sensor.
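The two monitoring statistics named in the abstract can be illustrated with a minimal Python sketch. It is a generic PCA monitor on synthetic data, not the FASSIP implementation: the six simulated sensors, the fixed mixing matrix, the choice of two components, and the function names fit_pca_monitor and spe_and_t2 are all assumptions made for the example.

```python
import numpy as np

def fit_pca_monitor(X, n_components):
    """Fit a PCA model on fault-free training data (rows = samples)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                        # standardize each sensor
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T              # sensors x components
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, std, loadings, score_var

def spe_and_t2(x, mean, std, loadings, score_var):
    """Squared prediction error and Hotelling T^2 for one or more samples."""
    z = (np.atleast_2d(x) - mean) / std
    scores = z @ loadings
    residual = z - scores @ loadings.T          # part not explained by the PCs
    spe = np.sum(residual ** 2, axis=1)
    t2 = np.sum(scores ** 2 / score_var, axis=1)
    return spe, t2

# Synthetic fault-free data: six sensors driven by two latent signals plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.8, 0.6, 0.4, 0.2, 0.0],
                   [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]])
X_train = latent @ mixing + 0.05 * rng.normal(size=(200, 6))

mean, std, loadings, score_var = fit_pca_monitor(X_train, n_components=2)
spe_train, t2_train = spe_and_t2(X_train, mean, std, loadings, score_var)

# Simulated failure: a large bias added to the sensor at index 3 of one sample.
x_fault = X_train[0].copy()
x_fault[3] += 5.0
spe_fault, t2_fault = spe_and_t2(x_fault, mean, std, loadings, score_var)

# Empirical control limit (an assumption; analytical limits are also common).
spe_limit = np.percentile(spe_train, 99)
fault_detected = spe_fault[0] > spe_limit
```

A biased sensor breaks the correlation pattern learned from fault-free data, so it shows up mainly in the SPE, whereas T² reacts to unusually large movements within the retained components.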
Principal Component Analysis of Process Datasets with Missing Values
Directory of Open Access Journals (Sweden)
Kristen A. Severson
2017-07-01
Datasets with missing values arising from causes such as sensor failure, inconsistent sampling rates, and merging data from different systems are common in the process industry. Methods for handling missing data typically operate during data pre-processing, but can also occur during model building. This article considers missing data within the context of principal component analysis (PCA), which is a method originally developed for complete data that has widespread industrial application in multivariate statistical process control. Due to the prevalence of missing data and the success of PCA for handling complete data, several PCA algorithms that can act on incomplete data have been proposed. Here, algorithms for applying PCA to datasets with missing values are reviewed. A case study is presented to demonstrate the performance of the algorithms and suggestions are made with respect to choosing which algorithm is most appropriate for particular settings. An alternating algorithm based on the singular value decomposition achieved the best results in the majority of test cases involving process datasets.
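The idea of alternating between a low-rank fit and re-imputation can be sketched in a few lines of Python. This is a generic iterative SVD imputation, written under the assumptions that missing entries are marked as NaN, that the rank is known, and that no mean-centering is applied; it is not necessarily the specific algorithm evaluated in the article, and the function name svd_impute is hypothetical.

```python
import numpy as np

def svd_impute(X, rank, n_iter=200):
    """Fill NaN entries by alternating a truncated SVD with re-imputation."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: replace each missing entry by its column mean.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed entries stay fixed; only the missing ones are updated.
        filled[missing] = low_rank[missing]
    return filled

# Toy example: an exactly rank-one table with two entries removed.
truth = np.outer([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 2.0, 3.0, 4.0])
X_obs = truth.copy()
X_obs[0, 1] = np.nan
X_obs[3, 2] = np.nan

completed = svd_impute(X_obs, rank=1)
```

On this noise-free toy table the iteration recovers the removed values. With real process data the rank has to be chosen, the columns are usually centered and scaled, and a convergence check would replace the fixed iteration count.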
Kelly, Michael D.
2016-01-01
This study compares School Leaders Licensure Assessment (SLLA) sub-scores with principal interns' self-assessment sub-scores (ISA) for a principal internship evaluation instrument in one educational leadership graduate program. The results of the study will be used to help establish the effectiveness of the current principal internship program,…
Principal component approach in variance component estimation for international sire evaluation
Directory of Open Access Journals (Sweden)
Jakobsen Jette
2011-05-01
Background: The dairy cattle breeding industry is a highly globalized business, which needs internationally comparable and reliable breeding values of sires. The international Bull Evaluation Service, Interbull, was established in 1983 to respond to this need. Currently, Interbull performs multiple-trait across country evaluations (MACE) for several traits and breeds in dairy cattle and provides international breeding values to its member countries. Estimating parameters for MACE is challenging since the structure of datasets and conventional use of multiple-trait models easily result in over-parameterized genetic covariance matrices. The number of parameters to be estimated can be reduced by taking into account only the leading principal components of the traits considered. For MACE, this is readily implemented in a random regression model. Methods: This article compares two principal component approaches to estimate variance components for MACE using real datasets. The methods tested were a REML approach that directly estimates the genetic principal components (direct PC) and the so-called bottom-up REML approach (bottom-up PC), in which traits are sequentially added to the analysis and the statistically significant genetic principal components are retained. Furthermore, this article evaluates the utility of the bottom-up PC approach to determine the appropriate rank of the (co)variance matrix. Results: Our study demonstrates the usefulness of both approaches and shows that they can be applied to large multi-country models considering all concerned countries simultaneously. These strategies can thus replace the current practice of estimating the covariance components required through a series of analyses involving selected subsets of traits. Our results support the importance of using the appropriate rank in the genetic (co)variance matrix. Using too low a rank resulted in biased parameter estimates, whereas too high a rank did not result in
Nesakumar, Noel; Baskar, Chanthini; Kesavan, Srinivasan; Rayappan, John Bosco Balaguru; Alwarappan, Subbiah
2018-05-22
The moisture content of beetroot varies during long-term cold storage. In this work, we propose a strategy to identify the moisture content and age of beetroot using principal component analysis coupled with Fourier transform infrared spectroscopy (FTIR). Frequent FTIR measurements were recorded directly from the beetroot sample surface over a period of 34 days for analysing its moisture content, employing attenuated total reflectance in the spectral ranges of 2614-4000 and 1465-1853 cm⁻¹ with a spectral resolution of 8 cm⁻¹. In order to estimate the transmittance peak height (T_p) and area under the transmittance curve [Formula: see text] over the spectral ranges of 2614-4000 and 1465-1853 cm⁻¹, a Gaussian curve fitting algorithm was applied to the FTIR data. Principal component and nonlinear regression analyses were utilized for FTIR data analysis. Score plots over the ranges of 2614-4000 and 1465-1853 cm⁻¹ allowed beetroot quality discrimination. Beetroot quality predictive models were developed by employing a biphasic dose response function. Validation experiment results confirmed that the accuracy of the beetroot quality predictive model reached 97.5%. This research work shows that FTIR spectroscopy in combination with principal component analysis and beetroot quality predictive models could serve as an effective tool for discriminating moisture content in fresh, half spoiled and completely spoiled beetroot samples and for providing status alerts.
Dong, Fengxia; Mitchell, Paul D; Colquhoun, Jed
2015-01-01
Measuring farm sustainability performance is a crucial component for improving agricultural sustainability. While extensive assessments and indicators exist that reflect the different facets of agricultural sustainability, because of the relatively large number of measures and interactions among them, a composite indicator that integrates and aggregates over all variables is particularly useful. This paper describes and empirically evaluates a method for constructing a composite sustainability indicator that individually scores and ranks farm sustainability performance. The method first uses non-negative polychoric principal component analysis to reduce the number of variables, to remove correlation among variables and to transform categorical variables to continuous variables. Next the method applies common-weight data envelope analysis to these principal components to individually score each farm. The method solves weights endogenously and allows identifying important practices in sustainability evaluation. An empirical application to Wisconsin cranberry farms finds heterogeneity in sustainability practice adoption, implying that some farms could adopt relevant practices to improve the overall sustainability performance of the industry. Copyright © 2014 Elsevier Ltd. All rights reserved.
Principal Component Analysis: Most Favourite Tool in Chemometrics
Indian Academy of Sciences (India)
GENERAL ARTICLE. Principal ... Chemometrics is a discipline that combines mathematics, statis- ... workers have used PCA for air quality monitoring [8]. ..... J S Verbeke, Handbook of Chemometrics and Qualimetrics, Elsevier, New York,.
Saccenti, E.; Camacho, J.
2015-01-01
Principal component analysis is one of the most commonly used multivariate tools to describe and summarize data. Determining the optimal number of components in a principal component model is a fundamental problem in many fields of application. In this paper we compare the performance of several
Mining gene expression data by interpreting principal components
Directory of Open Access Journals (Sweden)
Mortazavi Ali
2006-04-01
Background: There are many methods for analyzing microarray data that group together genes having similar patterns of expression over all conditions tested. However, in many instances the biologically important goal is to identify relatively small sets of genes that share coherent expression across only some conditions, rather than all or most conditions as required in traditional clustering; e.g. genes that are highly up-regulated and/or down-regulated similarly across only a subset of conditions. Equally important is the need to learn which conditions are the decisive ones in forming such gene sets of interest, and how they relate to diverse conditional covariates, such as disease diagnosis or prognosis. Results: We present a method for automatically identifying such candidate sets of biologically relevant genes using a combination of principal components analysis and information theoretic metrics. To enable easy use of our methods, we have developed a data analysis package that facilitates visualization and subsequent data mining of the independent sources of significant variation present in gene microarray expression datasets (or in any other similarly structured high-dimensional dataset). We applied these tools to two public datasets, and highlight sets of genes most affected by specific subsets of conditions (e.g. tissues, treatments, samples, etc.). Statistically significant associations for highlighted gene sets were shown via global analysis for Gene Ontology term enrichment. Together with covariate associations, the tool provides a basis for building testable hypotheses about the biological or experimental causes of observed variation. Conclusion: We provide an unsupervised data mining technique for diverse microarray expression datasets that is distinct from major methods now in routine use. In test uses, this method, based on publicly available gene annotations, appears to identify numerous sets of biologically relevant genes. It
He, Shiyuan; Wang, Lifan; Huang, Jianhua Z.
2018-04-01
With growing data from ongoing and future supernova surveys, it is possible to empirically quantify the shapes of SNIa light curves in more detail, and to quantitatively relate the shape parameters with the intrinsic properties of SNIa. Building such relationships is critical in controlling systematic errors associated with supernova cosmology. Based on a collection of well-observed SNIa samples accumulated in the past years, we construct an empirical SNIa light curve model using a statistical method called the functional principal component analysis (FPCA) for sparse and irregularly sampled functional data. Using this method, the entire light curve of an SNIa is represented by a linear combination of principal component functions, and the SNIa is represented by a few numbers called “principal component scores.” These scores are used to establish relations between light curve shapes and physical quantities such as intrinsic color, interstellar dust reddening, spectral line strength, and spectral classes. These relations allow for descriptions of some critical physical quantities based purely on light curve shape parameters. Our study shows that some important spectral feature information is being encoded in the broad band light curves; for instance, we find that the light curve shapes are correlated with the velocity and velocity gradient of the Si II λ6355 line. This is important for supernova surveys (e.g., LSST and WFIRST). Moreover, the FPCA light curve model is used to construct the entire light curve shape, which in turn is used in a functional linear form to adjust intrinsic luminosity when fitting distance models.
Principal component analysis of dynamic fluorescence images for diagnosis of diabetic vasculopathy
Seo, Jihye; An, Yuri; Lee, Jungsul; Ku, Taeyun; Kang, Yujung; Ahn, Chulwoo; Choi, Chulhee
2016-04-01
Indocyanine green (ICG) fluorescence imaging has been clinically used for noninvasive visualizations of vascular structures. We have previously developed a diagnostic system based on dynamic ICG fluorescence imaging for sensitive detection of vascular disorders. However, because high-dimensional raw data were used, the analysis of the ICG dynamics proved difficult. We used principal component analysis (PCA) in this study to extract important elements without significant loss of information. We examined ICG spatiotemporal profiles and identified critical features related to vascular disorders. PCA time courses of the first three components showed a distinct pattern in diabetic patients. Among the major components, the second principal component (PC2) represented arterial-like features. The explained variance of PC2 in diabetic patients was significantly lower than in normal controls. To visualize the spatial pattern of PCs, pixels were mapped with red, green, and blue channels. The PC2 score showed an inverse pattern between normal controls and diabetic patients. We propose that PC2 can be used as a representative bioimaging marker for the screening of vascular diseases. It may also be useful in simple extractions of arterial-like features.
Energy Technology Data Exchange (ETDEWEB)
Christensen, J.H. [Royal Veterinary and Agricultural Univ., Thorvaldsensvej (Denmark). Dept. of Natural Sciences]; Hansen, A.B. [National Environmental Research Inst., Roskilde (Denmark). Dept. of Environmental Chemistry and Microbiology]; Andersen, O. [Roskilde Univ., Roskilde (Denmark). Dept. of Life Sciences and Chemistry]
2005-07-01
Biomarkers such as steranes and terpanes are abundant in crude oils, particularly in heavy distillate petroleum products. They are useful for matching highly weathered oil samples when other groups of petroleum hydrocarbons fail to distinguish oil samples. In this study, time warping and principal component analysis (PCA) were applied for oil hydrocarbon fingerprinting based on relative amounts of terpane and sterane isomers analyzed by gas chromatography and mass spectrometry. The 4 principal components were boiling point range, clay content, marine or organic terrestrial matter, and maturity based on differences in the terpane and sterane isomer patterns. This study is an extension of a previous fingerprinting study for identifying the sources of oil spill samples based only on the profiles of sterane isomers. Spill samples from the Baltic Carrier oil spill were correctly identified by inspection of score plots. The interpretation of the loading and score plots offered further chemical information about correlations between changes in the amounts of sterane and terpane isomers. It was concluded that this method is an objective procedure for analyzing chromatograms with more comprehensive data usage compared to other fingerprinting methods. 20 refs., 4 figs.
International Nuclear Information System (INIS)
Christensen, J.H.; Hansen, A.B.; Andersen, O.
2005-01-01
Biomarkers such as steranes and terpanes are abundant in crude oils, particularly in heavy distillate petroleum products. They are useful for matching highly weathered oil samples when other groups of petroleum hydrocarbons fail to distinguish oil samples. In this study, time warping and principal component analysis (PCA) were applied for oil hydrocarbon fingerprinting based on relative amounts of terpane and sterane isomers analyzed by gas chromatography and mass spectrometry. The 4 principal components were boiling point range, clay content, marine or organic terrestrial matter, and maturity based on differences in the terpane and sterane isomer patterns. This study is an extension of a previous fingerprinting study for identifying the sources of oil spill samples based only on the profiles of sterane isomers. Spill samples from the Baltic Carrier oil spill were correctly identified by inspection of score plots. The interpretation of the loading and score plots offered further chemical information about correlations between changes in the amounts of sterane and terpane isomers. It was concluded that this method is an objective procedure for analyzing chromatograms with more comprehensive data usage compared to other fingerprinting methods. 20 refs., 4 figs
Roopwani, Rahul; Buckner, Ira S
2011-10-14
Principal component analysis (PCA) was applied to pharmaceutical powder compaction. A solid fraction parameter (SF(c/d)) and a mechanical work parameter (W(c/d)) representing irreversible compression behavior were determined as functions of applied load. Multivariate analysis of the compression data was carried out using PCA. The first principal component (PC1) showed loadings for the solid fraction and work values that agreed with changes in the relative significance of plastic deformation to consolidation at different pressures. The PC1 scores showed the same rank order as the relative plasticity ranking derived from the literature for common pharmaceutical materials. The utility of PC1 in understanding deformation was extended to binary mixtures using a subset of the original materials. Combinations of brittle and plastic materials were characterized using the PCA method. The relationships between PC1 scores and the weight fractions of the mixtures were typically linear showing ideal mixing in their deformation behaviors. The mixture consisting of two plastic materials was the only combination to show a consistent positive deviation from ideality. The application of PCA to solid fraction and mechanical work data appears to be an effective means of predicting deformation behavior during compaction of simple powder mixtures. Copyright © 2011 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Jung, Young Mee
2003-01-01
Principal component analysis based two-dimensional (PCA-2D) correlation analysis is applied to FTIR spectra of a polystyrene/methyl ethyl ketone/toluene solution mixture during solvent evaporation. A substantial amount of artificial noise was added to the experimental data to demonstrate the practical noise-suppressing benefit of the PCA-2D technique. 2D correlation analysis of the data matrix reconstructed from PCA loading vectors and scores successfully extracted only the most important features of synchronicity and asynchronicity without interference from noise or insignificant minor components. 2D correlation spectra constructed with only one principal component yield a strictly synchronous response with no discernible asynchronous features, while those involving two or more principal components generate meaningful asynchronous 2D correlation spectra. Deliberate manipulation of the rank of the reconstructed data matrix, by choosing the appropriate number and type of PCs, yields potentially more refined 2D correlation spectra.
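The reconstruction step described in this abstract, rebuilding a data matrix from a limited number of PCA scores and loading vectors, can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the synthetic two-band "spectra", the noise level, the mean-centering step and the function name `pca_reconstruct` are all assumptions made for illustration.

```python
import numpy as np

def pca_reconstruct(data, n_components):
    """Rebuild `data` (rows = spectra, columns = channels) from its leading PCs."""
    mean = data.mean(axis=0)
    centered = data - mean  # mean-centering is an assumption of this sketch
    # SVD of the centered matrix: rows of vt are loading vectors, u * s are scores
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components]
    return scores @ loadings + mean

rng = np.random.default_rng(0)
# Hypothetical synthetic data: two bands whose intensities change over 30 time steps
x = np.linspace(0.0, 1.0, 200)
band_a = np.exp(-((x - 0.4) / 0.05) ** 2)
band_b = np.exp(-((x - 0.6) / 0.05) ** 2)
t = np.linspace(0.0, 1.0, 30)[:, None]
clean = (1.0 - t) * band_a + 4.0 * t * (1.0 - t) * band_b
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

# Keep only two components; the discarded components carry mostly noise
denoised = pca_reconstruct(noisy, n_components=2)
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before: {err_noisy:.2f}, after rank-2 reconstruction: {err_denoised:.2f}")
```

Choosing `n_components` plays the role of the "rank of the reconstructed data matrix" mentioned above; the reconstructed matrix would then be passed to whatever 2D correlation routine is in use.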
Corriveau, H; Arsenault, A B; Dutil, E; Lepage, Y
1992-01-01
An evaluation based on the Bobath approach to treatment has previously been developed and partially validated. The purpose of the present study was to verify the content validity of this evaluation with the use of a statistical approach known as principal components analysis. Thirty-eight hemiplegic subjects participated in the study. Analysis of the scores on each of six parameters (sensorium, active movements, muscle tone, reflex activity, postural reactions, and pain) was evaluated on three occasions across a 2-month period. Each time this produced three factors that contained 70% of the variation in the data set. The first component mainly reflected variations in mobility, the second mainly variations in muscle tone, and the third mainly variations in sensorium and pain. The results of such exploratory analysis highlight the fact that some of the parameters are not only important but also interrelated. These results seem to partially support the conceptual framework substantiating the Bobath approach to treatment.
Use of Principal Components Analysis to Explain Controls on Nutrient Fluxes to the Chesapeake Bay
Rice, K. C.; Mills, A. L.
2017-12-01
The Chesapeake Bay watershed, on the east coast of the United States, encompasses about 166,000 square kilometers (km²) of diverse land use, which includes a mixture of forested, agricultural, and developed land. The watershed is now managed under a Total Maximum Daily Load (TMDL), which requires implementation of management actions by 2025 that are sufficient to reduce nitrogen, phosphorus, and suspended-sediment fluxes to the Chesapeake Bay and restore the bay's water quality. We analyzed nutrient and sediment data along with land-use and climatic variables in nine subwatersheds to better understand the drivers of flux within the watershed and to provide relevant management implications. The nine subwatersheds range in area from 300 to 30,000 km², and the analysis period was 1985-2014. The 31 variables specific to each subwatershed were highly and statistically significantly correlated, so Principal Components Analysis was used to reduce the dimensionality of the dataset. The analysis revealed that about 80% of the variability in the whole dataset can be explained by discharge, flux, and concentration of nutrients and sediment. The first two principal components (PCs) explained about 68% of the total variance. PC1 loaded strongly on discharge and flux, and PC2 loaded on concentration. The PC scores of both PC1 and PC2 varied by season. Subsequent analysis of PC1 scores versus PC2 scores, broken out by subwatershed, revealed management implications. Some of the largest subwatersheds are largely driven by discharge, and consequently by large fluxes. In contrast, some of the smaller subwatersheds are more variable in nutrient concentrations than in discharge and flux. Our results suggest that, given no change in discharge, a reduction in nutrient flux to the streams in the smaller watersheds could result in a proportionately larger decrease in fluxes of nutrients down the river to the bay than in the larger watersheds.
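Statements such as "the first two PCs explained about 68% of the total variance" come from the eigenvalue spectrum of the standardized data. A minimal sketch of that calculation follows, assuming NumPy and using randomly generated stand-in data; none of the numbers or names below come from the study itself.

```python
import numpy as np

def explained_variance_ratio(data):
    """Fraction of total variance carried by each PC of the standardized columns."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Singular values of the standardized matrix give the PC variances
    s = np.linalg.svd(z, compute_uv=False)
    variances = s**2 / (len(z) - 1)
    return variances / variances.sum()

rng = np.random.default_rng(1)
# Hypothetical stand-in data: 6 correlated variables driven by 2 hidden factors
factors = rng.standard_normal((500, 2))
mixing = rng.standard_normal((2, 6))
observed = factors @ mixing + 0.3 * rng.standard_normal((500, 6))

ratio = explained_variance_ratio(observed)
cumulative = np.cumsum(ratio)
print("share per PC:", np.round(ratio, 3))
print("cumulative share of first two PCs:", round(float(cumulative[1]), 3))
```

Standardizing each column first corresponds to PCA on the correlation matrix, a common choice when variables have different units; whether the study did so is not stated in the abstract.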
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Identifying apple surface defects using principal components analysis and artificial neural networks
Artificial neural networks and principal components were used to detect surface defects on apples in near-infrared images. Neural networks were trained and tested on sets of principal components derived from columns of pixels from images of apples acquired at two wavelengths (740 nm and 950 nm). I...
Detecting Market Transitions and Energy Futures Risk Management Using Principal Components
Borovkova, S.A.
2006-01-01
An empirical approach to analysing the forward curve dynamics of energy futures is presented. For non-seasonal commodities-such as crude oil-the forward curve is well described by the first three principal components: the level, slope and curvature. A principal component indicator is described that
Principal Components Analysis of Job Burnout and Coping ...
African Journals Online (AJOL)
The key component structure of job burnout were feelings of disgust, insomnia, headaches, weight loss or gain feeling of omniscient, pain of unexplained origin, hopelessness, agitation and workaholics, while the factor structure of coping strategies were development of self realistic picture, retaining hope, asking for help ...
Local Prediction Models on Mid-Atlantic Ridge MORB by Principal Component Regression
Ling, X.; Snow, J. E.; Chin, W.
2017-12-01
The isotopic compositions of the daughter isotopes of long-lived radioactive systems (Sr, Nd, Hf and Pb) can be used to map the scale and history of mantle heterogeneities beneath mid-ocean ridges. Our goal is to relate the multidimensional structure in the existing isotopic dataset with an underlying physical reality of mantle sources. The numerical technique of Principal Component Analysis is useful to reduce the linear dependence of the data to a minimum set of orthogonal eigenvectors encapsulating the information contained (cf. Agranier et al., 2005). The dataset used for this study covers almost all the MORBs along the Mid-Atlantic Ridge (MAR), from 54°S to 77°N and 8.8°W to -46.7°W, including a replication of the published dataset of Agranier et al. (2005) plus 53 basalt samples dredged and analyzed since then (data from PetDB). The principal components PC1 and PC2 account for 61.56% and 29.21%, respectively, of the total variability of the isotope ratios. The samples with compositions similar to HIMU, EM and DM are identified to better understand the PCs. PC1 and PC2 account for HIMU and EM, whereas PC2 has limited control over the DM source. PC3 is more strongly controlled by the depleted mantle source than PC2. This means that all three principal components have a high degree of significance relevant to the established mantle sources. We also tested the relationship between mantle heterogeneity and sample locality. The K-means clustering algorithm is a type of unsupervised learning that finds groups in the data based on feature similarity. The PC factor scores of each sample are clustered into three groups. Clusters one and three alternate on the north and south MAR. Cluster two occurs from 45.18°N to 0.79°N and -27.9°W to -30.40°W, alternating with cluster one. The ridge has been preliminarily divided into 16 sections considering both the clusters and ridge segments. The principal component regression models the section based on 6 isotope ratios and PCs. The
Automatic ECG analysis using principal component analysis and wavelet transformation
Khawaja, Antoun
2007-01-01
The main objective of this book is to analyse and detect small changes in ECG waves and complexes that indicate cardiac diseases and disorders. Detecting predisposition to Torsade de Pointes (TDP) by analysing the beat-to-beat variability in T wave morphology is the main core of this work. The second main topic is detecting small changes in the QRS complex and predicting future QRS complexes of patients. The last main topic is clustering similar ECG components into different groups.
Directory of Open Access Journals (Sweden)
Gheorghe Gîlcă
2015-06-01
This article deals with a recognition system using an algorithm based on the Principal Component Analysis (PCA) technique. The recognition system consists only of a PC and an integrated video camera. The algorithm is developed in the MATLAB language and calculates the eigenfaces considered as features of the face. The PCA technique is based on the matching between the facial test image and the training prototype vectors. The matching score between the facial test image and the training prototype vectors is calculated between their coefficient vectors. If the matching score is high, we have the best recognition. The results of the algorithm based on the PCA technique are very good, even if the person looks at the video camera from one side.
Compressive Online Robust Principal Component Analysis with Multiple Prior Information
DEFF Research Database (Denmark)
Van Luong, Huynh; Deligiannis, Nikos; Seiler, Jürgen
-rank components. Unlike conventional batch RPCA, which processes all the data directly, our method considers a small set of measurements taken per data vector (frame). Moreover, our method incorporates multiple prior information signals, namely previous reconstructed frames, to improve the separation...... and thereafter, update the prior information for the next frame. Using experiments on synthetic data, we evaluate the separation performance of the proposed algorithm. In addition, we apply the proposed algorithm to online video foreground and background separation from compressive measurements. The results show...
Robust LOD scores for variance component-based linkage analysis.
Blangero, J; Williams, J T; Almasy, L
2000-01-01
The variance component method is now widely used for linkage analysis of quantitative traits. Although this approach offers many advantages, the importance of the underlying assumption of multivariate normality of the trait distribution within pedigrees has not been studied extensively. Simulation studies have shown that traits with leptokurtic distributions yield linkage test statistics that exhibit excessive Type I error when analyzed naively. We derive analytical formulae relating the deviation from the expected asymptotic distribution of the lod score to the kurtosis and total heritability of the quantitative trait. A simple correction constant yields a robust lod score for any deviation from normality and for any pedigree structure, and effectively eliminates the problem of inflated Type I error due to misspecification of the underlying probability model in variance component-based linkage analysis.
Using principal component analysis for selecting network behavioral anomaly metrics
Gregorio-de Souza, Ian; Berk, Vincent; Barsamian, Alex
2010-04-01
This work addresses new approaches to behavioral analysis of networks and hosts for the purposes of security monitoring and anomaly detection. Most commonly used approaches simply implement anomaly detectors for one, or a few, simple metrics and those metrics can exhibit unacceptable false alarm rates. For instance, the anomaly score of network communication is defined as the reciprocal of the likelihood that a given host uses a particular protocol (or destination); this definition may result in an unrealistically high threshold for alerting to avoid being flooded by false positives. We demonstrate that selecting and adapting the metrics and thresholds, on a host-by-host or protocol-by-protocol basis, can be done by established multivariate analyses such as PCA. We will show how to determine one or more metrics, for each network host, that record the highest available amount of information regarding the baseline behavior and show relevant deviations reliably. We describe the methodology used to pick from a large selection of available metrics, and illustrate a method for comparing the resulting classifiers. Using our approach we are able to reduce the resources required to properly identify misbehaving hosts, protocols, or networks, by dedicating system resources to only those metrics that actually matter in detecting network deviations.
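The simple metric criticized in this abstract, an anomaly score equal to the reciprocal of the likelihood that a host uses a given protocol, can be sketched in a few lines of Python. Estimating the likelihood from empirical frequencies in a host's history is an assumption of this sketch, as are the function name `anomaly_scores` and the made-up traffic history.

```python
from collections import Counter

def anomaly_scores(observed_protocols):
    """Score each protocol as 1 / (empirical likelihood of seeing it on this host)."""
    counts = Counter(observed_protocols)
    total = sum(counts.values())
    # likelihood = n / total, so the reciprocal is total / n
    return {protocol: total / n for protocol, n in counts.items()}

# Hypothetical connection history for one host
history = ["https"] * 90 + ["dns"] * 9 + ["telnet"]
scores = anomaly_scores(history)
for protocol, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{protocol}: {score:.2f}")
```

Rare protocols receive very large scores under this definition, which illustrates why a single fixed alerting threshold tends to be either too noisy or too high.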
Radioactive background in principal components of the Jihlava River ecosystem
International Nuclear Information System (INIS)
Stanek, Z.; Penaz, M.; Trnkova, J.; Wohlgemuth, E.
1980-01-01
From 1976 through 1978, the radioactive background was investigated in the various components of the Jihlava River ecosystem. The investigations involved total β-activity, ⁴⁰K, residual β-activity, natural U, ²²⁶Ra and, in some of the samples, also ²¹⁰Pb, ⁹⁰Sr and ¹³⁷Cs. The analyses included water, bottom sediments, samples of aquatic macrophytes (Batrachium fluitans), samples of aquatic invertebrates (Herpobdella octoculata, Anodonta cygnea, Asellus aquaticus, larval Ephemeroptera, larval Trichoptera, exuviae of pupae of Chironomidae) and samples of the tissues of 8 species of fishes (Salmo trutta m. fario, Cyprinus carpio, Rutilus rutilus, Leuciscus cephalus, Leuciscus leuciscus, Chondrostoma nasus, Gobio gobio, Barbus barbus). The activity of the radionuclides under study corresponded to the values reported for uncontaminated streams. (author)
Nanni, Arthur Schmidt; Roisenberg, Ari; de Hollanda, Maria Helena Bezerra Maia; Marimon, Maria Paula Casagrande; Viero, Antonio Pedro; Scheibe, Luiz Fernando
2013-01-01
Groundwater with anomalous fluoride content and water mixture patterns were studied in the fractured Serra Geral Aquifer System, a basaltic to rhyolitic geological unit, using a principal component analysis interpretation of groundwater chemical data from 309 deep wells distributed in the Rio Grande do Sul State, Southern Brazil. A four-component model that explains 81% of the total variance in the Principal Component Analysis is suggested. Six hydrochemical groups were identified. δ18O and δ...
Lawson, J. S.; Inglis, James
1984-01-01
A learning disability index (LDI) for the assessment of intellectual deficits on the Wechsler Intelligence Scale for Children-Revised (WISC-R) is described. The Factor II score coefficients derived from an unrotated principal components analysis of the WISC-R normative data, in combination with the individual's scaled scores, are used for this…
Revealing the equivalence of two clonal survival models by principal component analysis
International Nuclear Information System (INIS)
Lachet, Bernard; Dufour, Jacques
1976-01-01
The principal component analysis of 21 chlorella cell survival curves, fitted with one-hit and two-hit target models, led to quite similar projections on the principal plane: the homologous parameters of these models are linearly correlated. This reveals the reason for the statistical equivalence of the two models at the present level of experimental inaccuracy.
Wavelet decomposition based principal component analysis for face recognition using MATLAB
Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish
2016-03-01
Algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, neural networks and genetic algorithms have been used for decades to realize face recognition systems in both static and real-time settings. This paper discusses an approach to face recognition based on wavelet decomposition combined with principal component analysis. Principal component analysis is chosen over other algorithms for its relative simplicity, efficiency, and robustness. The term face recognition stands for identifying a person from his facial gestures; it resembles factor analysis in some sense, i.e. extraction of the principal components of an image. Principal component analysis is subject to some drawbacks, mainly poor discriminatory power and, in particular, the large computational load of finding eigenvectors. These drawbacks can be greatly reduced by combining wavelet transform decomposition for feature extraction with principal component analysis for pattern representation and classification, analyzing the facial gestures in the space and time domains, where frequency and time are used interchangeably. The experimental results indicate that this face recognition method significantly improves the recognition rate and also has better computational efficiency.
Directory of Open Access Journals (Sweden)
Peter Haščík
2017-01-01
The objective of the present study was to examine the effect of different dietary supplements (bee pollen, propolis, and probiotic) on the sensory quality of chicken breast muscle. The experiment was performed with 180 one-day-old Ross 308 broiler chicks of mixed sex. The dietary treatments were as follows: 1. basal diet with no supplementation as control (C); 2. basal diet plus 400 mg bee pollen extract per 1 kg of feed mixture (E1); 3. basal diet plus 400 mg propolis extract per 1 kg of feed mixture (E2); 4. basal diet plus 3.3 g probiotic preparation based on Lactobacillus fermentum added to drinking water (E3). Sensory properties of chicken breast muscle were assessed by a five-member panel that rated the meat for aroma, taste, juiciness, tenderness and overall acceptability. The ANOVA results for each attribute showed that at least one group mean score differed significantly (p ≤ 0.05). Subsequent Tukey's HSD revealed that only the C group had a significantly higher mean score (p ≤ 0.05) for each attribute compared with the E2 group. As regards the E1 and E3 groups, there were no significant differences (p > 0.05) in aroma, taste and tenderness when compared to the C group, with the significantly lowest juiciness value (p ≤ 0.05) found in the E3 group and significantly lower values of overall acceptability in both groups (p ≤ 0.05). In addition, it is noteworthy that the control group received the highest ranking scores for each sensory attribute, i.e. the supplements did not positively influence the sensory quality of chicken breast meat. Principal component analysis (PCA) of the sensory data showed that the first 3 principal components (PCs) explained 69.82% of the total variation in 5 variables. Visualisation of the extracted PCs showed that the groups were very well represented, with the E2 group clearly distinguished from the others.
International Nuclear Information System (INIS)
Zeng, J.; Li, G.; Sun, J.
2013-01-01
Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties were significantly different among twenty-six corn varieties. Principal component analysis related the quality of corn flour to five principal components, among which starch pasting properties made an important contribution, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)
DEFF Research Database (Denmark)
Malmquist, Linus M.V.; Olsen, Rasmus R.; Hansen, Asger B.
2007-01-01
weathering state and to distinguish between various weathering processes is investigated and discussed. The method is based on comprehensive and objective chromatographic data processing followed by principal component analysis (PCA) of concatenated sections of gas chromatography–mass spectrometry...
Tripathy, Manoj
2012-01-01
This paper describes a new approach for power transformer differential protection which is based on the wave-shape recognition technique. An algorithm based on neural network principal component analysis (NNPCA) with back-propagation learning is proposed for digital differential protection of power transformer. The principal component analysis is used to preprocess the data from power system in order to eliminate redundant information and enhance hidden pattern of differential current to disc...
Geroukis, Asterios; Brorson, Erik
2014-01-01
In this study, we compare two statistical techniques, logistic regression and discriminant analysis, to see how well they classify companies based on clusters – made from the solvency ratio – using principal components as independent variables. The principal components are made with different financial ratios. We use cluster analysis to find groups with low, medium and high solvency ratios among 1200 different companies found on the NASDAQ stock market and use this as an a priori definition of ...
Directory of Open Access Journals (Sweden)
Glogovac Svetlana
2012-01-01
This study investigates the variability of tomato genotypes based on morphological and biochemical fruit traits. The experimental material is part of the tomato genetic collection of the Institute of Field and Vegetable Crops in Novi Sad, Serbia. Genotypes were analyzed for fruit mass, locule number, index of fruit shape, fruit colour, dry matter content, total sugars, total acidity, lycopene and vitamin C. Minimum, maximum and average values and the main indicators of variability (CV and σ) were calculated. Principal component analysis was performed to determine the structure of the sources of variability. Four principal components, which contribute 93.75% of the total variability, were selected for analysis. The first principal component is defined by vitamin C, locule number and index of fruit shape. The second component is determined by dry matter content and total acidity, the third by lycopene, fruit mass and fruit colour. Total sugars had the greatest part in the fourth component.
Liang L; Hayashi K; Bennett P; Johnson T. J; Aten J. D
2015-01-01
To understand the relationship between the structure of resource loss and depression after disaster exposure, the components of resource loss and the impact of these resource loss components on depression were examined among college students (N=654) at two universities who were affected by Hurricane Katrina. The components of resource loss were first analyzed by principal component analysis. Gender, social relationship loss, and financial loss were then examined with the regression model on depr...
Puri, Ritika; Khamrui, Kaushik; Khetra, Yogesh; Malhotra, Ravinder; Devraja, H C
2016-02-01
Promising development and expansion in the market of cham-cham, a traditional Indian dairy product is expected in the coming future with the organized production of this milk product by some large dairies. The objective of this study was to document the extent of variation in sensory properties of market samples of cham-cham collected from four different locations known for their excellence in cham-cham production and to find out the attributes that govern much of variation in sensory scores of this product using quantitative descriptive analysis (QDA) and principal component analysis (PCA). QDA revealed significant (p sensory attributes of cham-cham among the market samples. PCA identified four significant principal components that accounted for 72.4 % of the variation in the sensory data. Factor scores of each of the four principal components which primarily correspond to sweetness/shape/dryness of interior, surface appearance/surface dryness, rancid and firmness attributes specify the location of each market sample along each of the axes in 3-D graphs. These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring attributes of cham-cham that contribute most to its sensory acceptability.
Principal component structure and sport-specific differences in the running one-leg vertical jump.
Laffaye, G; Bardy, B G; Durey, A
2007-05-01
The aim of this study is to identify the kinetic principal components involved in one-leg running vertical jumps, as well as the potential differences between specialists from different sports. The sample was composed of 25 regional skilled athletes who play different jumping sports (volleyball players, handball players, basketball players, high jumpers and novices), who performed a running one-leg jump. A principal component analysis was performed on the data obtained from the 200 tested jumps in order to identify the principal components summarizing the six variables extracted from the force-time curve. Two principal components including six variables accounted for 78% of the variance in jump height. Running one-leg vertical jump performance was predicted by a temporal component (which brings together impulse time, eccentric time and vertical displacement of the center of mass) and a force component (which brings together relative peak of force and power, and rate of force development). A comparison made among athletes revealed a temporal-prevailing profile for volleyball players, and a force-dominant profile for Fosbury high jumpers. Novices showed an ineffective utilization of the force component, while handball and basketball players showed heterogeneous and neutral component profiles. Participants will use a jumping strategy in which variables related to either the magnitude or timing of force production will be closely coupled; athletes from different sporting backgrounds will use a jumping strategy that reflects the inherent demands of their chosen sport.
A Note on McDonald's Generalization of Principal Components Analysis
Shine, Lester C., II
1972-01-01
It is shown that McDonald's generalization of Classical Principal Components Analysis to groups of variables maximally channels the total variance of the original variables through the groups of variables acting as groups. An equation is obtained for determining the vectors of correlations of the L2 components with the original variables.…
Combined principal component preprocessing and n-tuple neural networks for improved classification
DEFF Research Database (Denmark)
Høskuldsson, Agnar; Linneberg, Christian
2000-01-01
We present a combined principal component analysis/neural network scheme for classification. The data used to illustrate the method consist of spectral fluorescence recordings from seven different production facilities, and the task is to relate an unknown sample to one of these seven factories....... The data are first preprocessed by performing an individual principal component analysis on each of the seven groups of data. The components found are then used for classifying the data, but instead of making a single multiclass classifier, we follow the ideas of turning a multiclass problem into a number...... of two-class problems. For each possible pair of classes we further apply a transformation to the calculated principal components in order to increase the separation between the classes. Finally we apply the so-called n-tuple neural network to the transformed data in order to give the classification...
PRINCIPAL COMPONENT ANALYSIS OF FACTORS DETERMINING PHOSPHATE ROCK DISSOLUTION ON ACID SOILS
Directory of Open Access Journals (Sweden)
Yusdar Hilman
2016-10-01
Full Text Available Many of the agricultural soils in Indonesia are acidic and low in both total and available phosphorus, which severely limits their potential for crop production. These problems can be corrected by application of chemical fertilizers. However, these fertilizers are expensive, and cheaper alternatives such as phosphate rock (PR) have been considered. Several soil factors may influence the dissolution of PR in soils, including both chemical and physical properties. The study aimed to identify PR dissolution factors and evaluate their relative magnitude. The experiment was conducted in the Soil Chemical Laboratory, Universiti Putra Malaysia and the Indonesian Center for Agricultural Land Resources Research and Development from January to April 2002. Principal component analysis (PCA) was used to characterize acid soils in an incubation system into a number of factors that may affect PR dissolution. Three major factors selected were soil texture, soil acidity, and fertilization. Using the scores of individual factors as independent variables, stepwise regression analysis was performed to derive a PR dissolution function. The factors influencing PR dissolution in order of importance were soil texture, soil acidity, then fertilization. Soil texture factors including clay content and organic C, and soil acidity factors such as P retention capacity, interacted positively with P dissolution and promoted PR dissolution effectively. Soil texture factors such as sand and silt content, and soil acidity factors such as pH and exchangeable Ca, decreased PR dissolution.
Identification of Tibicen cicada species by a Principal Components Analysis of their songs
Directory of Open Access Journals (Sweden)
Eiji Ohya
2004-06-01
Full Text Available Specific identification of three Tibicen cicadas, T. japonicus, T. flammatus and T. bihamatus, by their chirping sounds was carried out using Principal Components Analysis (PCA). High quality recordings of each species were used as the standards. The peak and mean frequencies and the pulse rate were used as the variables. Out of 12 samples recorded in the field, one fell in the vicinity of T. japonicus and all others were positioned near T. bihamatus. A cluster analysis of the PCA scores then clearly separated each species and allocated the samples in the same way.
The use of principal component, discriminate and rough sets analysis methods of radiological data
International Nuclear Information System (INIS)
Seddeek, M.K.; Kozae, A.M.; Sharshar, T.; Badran, H.M.
2006-01-01
In this work, computational methods of finding clusters of multivariate data points were explored using principal component analysis (PCA), discriminate analysis (DA) and rough set analysis (RSA) methods. The variables were the concentrations of four natural isotopes and the texture characteristics of 100 sand samples from the coast of North Sinai, Egypt. Beach and dune sands are the two types of samples included. These methods were used to reduce the dimensionality of multivariate data and as classification and clustering methods. The results showed that the classification of sands in the environment of North Sinai is dependent upon the radioactivity contents of the naturally occurring radioactive materials and not upon the characteristics of the sand. The application of DA enables the creation of a classification rule for sand type, and it revealed that samples with highly negative values of the first score have the highest contamination of black sand. PCA revealed that radioactivity concentrations alone can be considered to predict the classification of other samples. The results of RSA showed that only one of the concentrations of 238U, 226Ra and 232Th, with 40K content, can characterize the clusters together with characteristics of the sand. Both PCA and RSA result in the following conclusion: 238U, 226Ra and 232Th behave similarly. RSA revealed that one or two of them may not be considered without affecting the body of knowledge
Directory of Open Access Journals (Sweden)
Dongwoo Jang
2018-03-01
Full Text Available Leaks in a water distribution network (WDS) constitute losses of water supply caused by pipeline failure, operational loss, and physical factors. This has raised the need for studies on the factors affecting the leakage ratio and estimation of leakage volume in a water supply system. In this study, principal component analysis (PCA) and an artificial neural network (ANN) were used to estimate the volume of water leakage in a WDS. For the study, six main effective parameters were selected and standardized data obtained through the Z-score method. The PCA-ANN model was devised and the leakage ratio was estimated. An accuracy assessment was performed to compare the measured leakage ratio to that of the simulated model. The results showed that the PCA-ANN method was more accurate for estimating the leakage ratio than a single ANN simulation. In addition, the estimation results differed according to the number of neurons in the ANN model's hidden layers. In this study, an ANN with multiple hidden layers was found to be the best method for estimating the leakage ratio with 12–12 neurons. This suggested approaches to improve the accuracy of leakage ratio estimation, as well as a scientific approach toward the sustainable management of water distribution systems.
Manojlovic, D; Lenhardt, L; Milićević, B; Antonov, M; Miletic, V; Dramićanin, M D
2015-10-09
Colour changes in Gradia Direct™ composite after immersion in tea, coffee, red wine, Coca-Cola, Colgate mouthwash, and distilled water were evaluated using principal component analysis (PCA) and the CIELAB colour coordinates. The reflection spectra of the composites were used as input data for the PCA. The output data (scores and loadings) provided information about the magnitude and origin of the surface reflection changes after exposure to the staining solutions. The reflection spectra of the stained samples generally exhibited lower reflection in the blue spectral range, which was manifested in the lower content of the blue shade for the samples. Both analyses demonstrated the high staining abilities of tea, coffee, and red wine, which produced total colour changes of 4.31, 6.61, and 6.22, respectively, according to the CIELAB analysis. PCA revealed subtle changes in the reflection spectra of composites immersed in Coca-Cola, demonstrating Coca-Cola's ability to stain the composite to a small degree.
Quantifying Individual Brain Connectivity with Functional Principal Component Analysis for Networks.
Petersen, Alexander; Zhao, Jianyang; Carmichael, Owen; Müller, Hans-Georg
2016-09-01
In typical functional connectivity studies, connections between voxels or regions in the brain are represented as edges in a network. Networks for different subjects are constructed at a given graph density and are summarized by some network measure such as path length. Examining these summary measures for many density values yields samples of connectivity curves, one for each individual. This has led to the adoption of basic tools of functional data analysis, most commonly to compare control and disease groups through the average curves in each group. Such group differences, however, neglect the variability in the sample of connectivity curves. In this article, the use of functional principal component analysis (FPCA) is demonstrated to enrich functional connectivity studies by providing increased power and flexibility for statistical inference. Specifically, individual connectivity curves are related to individual characteristics such as age and measures of cognitive function, thus providing a tool to relate brain connectivity with these variables at the individual level. This individual level analysis opens a new perspective that goes beyond previous group level comparisons. Using a large data set of resting-state functional magnetic resonance imaging scans, relationships between connectivity and two measures of cognitive function (episodic memory and executive function) were investigated. The group-based approach was implemented by dichotomizing the continuous cognitive variable and testing for group differences, resulting in no statistically significant findings. To demonstrate the new approach, FPCA was implemented, followed by linear regression models with cognitive scores as responses, identifying significant associations of connectivity in the right middle temporal region with both cognitive scores.
On the structure of dynamic principal component analysis used in statistical process monitoring
DEFF Research Database (Denmark)
Vanhatalo, Erik; Kulahci, Murat; Bergquist, Bjarne
2017-01-01
When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time...... for determining the number of principal components to retain. The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using...... driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method...
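As an illustration of the lag structure that dynamic PCA relies on, the following is a minimal sketch, assuming the common construction in which each observation is augmented with its own lagged copies before ordinary PCA is applied. The function names and the simulated AR(1) series are hypothetical and are not taken from the record above, and the sketch does not implement the authors' lag-selection method.

```python
import numpy as np

def lagged_matrix(X, lags):
    """Stack X(t), X(t-1), ..., X(t-lags) side by side, one row per time t."""
    n = X.shape[0]
    return np.hstack([X[lags - l : n - l] for l in range(lags + 1)])

def explained_variance_ratio(Z):
    """Share of variance per principal component of the column-standardized Z."""
    R = np.corrcoef(Z, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest eigenvalue first
    return eigvals / eigvals.sum()

# Hypothetical serially correlated process: a two-variable AR(1) series.
rng = np.random.default_rng(1)
X = np.zeros((500, 2))
for t in range(1, 500):
    X[t] = 0.8 * X[t - 1] + rng.normal(size=2)

Z = lagged_matrix(X, lags=2)             # 498 rows, 2 * (2 + 1) = 6 columns
ratios = explained_variance_ratio(Z)
cumulative = np.cumsum(ratios)           # cumulative explained variance
```

The cumulative explained variance computed here is one of the two quantities the abstract mentions for choosing the number of retained components; the SPE-based inspection is not shown.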
Ghosh, Debasree; Chattopadhyay, Parimal
2012-06-01
The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes, especially color and appearance, body texture, flavor, overall acceptability and acidity, of fermented food products such as cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified the six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R² = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.
Directory of Open Access Journals (Sweden)
Paul Robert Martin Werfette
2010-06-01
Full Text Available Analysis of the quantitative structure-activity relationship (QSAR) for a series of antimalarial artemisinin derivatives was carried out using principal component regression. The descriptors for the QSAR study represented the electronic structure, i.e. atomic net charges of the artemisinin skeleton calculated by the AM1 semi-empirical method. The antimalarial activity of the compounds was expressed as log 1/IC50, taken from experimental data. The main purpose of the principal component analysis approach is to transform a large data set of atomic net charges into a simpler data set of so-called latent variables. The best QSAR equation for log 1/IC50 can be obtained by regression as a linear function of several latent variables, i.e. x1, x2, x3, x4 and x5. The best QSAR model is expressed in the following equation: … Keywords: QSAR, antimalarial, artemisinin, principal component regression
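Principal component regression, as named in the record above, can be sketched as follows. This is a minimal illustration on synthetic data, assuming centered descriptors and ordinary least squares on the first few component scores; the function names, the data and the choice of five components (mirroring the five latent variables x1 to x5) are assumptions, and the sketch does not reproduce the study's coefficients.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal component scores of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                 # loading vectors, one per column
    T = Xc @ V                              # latent variables (scores)
    gamma, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    beta = V @ gamma                        # coefficients on the original descriptors
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def pcr_predict(X, beta, intercept):
    return X @ beta + intercept

# Synthetic stand-in for descriptors (columns) and activities (y).
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=20)

beta, intercept = pcr_fit(X, y, n_components=5)
y_hat = pcr_predict(X, beta, intercept)
```

When all components are retained, this procedure coincides with ordinary least squares on the original descriptors; using fewer components trades some fit for stability when descriptors are strongly correlated.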
Directory of Open Access Journals (Sweden)
Stefania Salvatore
2016-07-01
Full Text Available Abstract. Background: Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA), which is more flexible temporally. Methods: We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA with both Fourier and B-spline basis functions and three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. Results: The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6% of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. Conclusion: FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and common-optimal smoothing parameter as the most accurate approach when analysing WBE data.
Directory of Open Access Journals (Sweden)
Khuat Thanh Tung
2016-11-01
Full Text Available Optical Character Recognition plays an important role in data storage and data mining as the number of documents stored as images increases. Effective ways are needed to convert images of typewritten or printed text into machine-encoded text in order to support information handling. In this paper, therefore, techniques used to convert images into editable text, such as principal component analysis, multilayer perceptron networks, self-organizing maps, and an improved multilayer neural network using principal component analysis, are tested experimentally. The obtained results indicated the effectiveness and feasibility of the proposed methods.
Optimal pattern synthesis for speech recognition based on principal component analysis
Korsun, O. N.; Poliyev, A. V.
2018-02-01
An algorithm for building an optimal pattern for automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern is formed by decomposing an initial pattern into principal components, which makes it possible to reduce the dimension of the multi-parameter optimization problem. At the next step the training samples are introduced and the optimal estimates for the principal component decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider the experimental results, which show the improvement in speech recognition introduced by the proposed optimization algorithm.
Directory of Open Access Journals (Sweden)
Suryakant B. Chandgude
2015-09-01
Full Text Available The optimum selection of process parameters plays an important role in improving the surface finish, minimizing tool wear, increasing material removal rate and reducing machining time of any machining process. In this paper, optimum parameters for machining AISI D2 hardened steel using a solid carbide TiAlN coated end mill have been investigated. For optimization of process parameters along with multiple quality characteristics, the principal components analysis method has been adopted in this work. The confirmation experiments revealed that the principal components analysis method is a useful tool for improving cutting performance.
McLeod, Lianne; Bharadwaj, Lalita; Epp, Tasha; Waldner, Cheryl L
2017-09-15
Groundwater drinking water supply surveillance data were accessed to summarize water quality delivered as public and private water supplies in southern Saskatchewan as part of an exposure assessment for epidemiologic analyses of associations between water quality and type 2 diabetes or cardiovascular disease. Arsenic in drinking water has been linked to a variety of chronic diseases and previous studies have identified multiple wells with arsenic above the drinking water standard of 0.01 mg/L; therefore, arsenic concentrations were of specific interest. Principal components analysis was applied to obtain principal component (PC) scores to summarize mixtures of correlated parameters identified as health standards and those identified as aesthetic objectives in the Saskatchewan Drinking Water Quality Standards and Objective. Ordinary, universal, and empirical Bayesian kriging were used to interpolate arsenic concentrations and PC scores in southern Saskatchewan, and the results were compared. Empirical Bayesian kriging performed best across all analyses, based on having the greatest number of variables for which the root mean square error was lowest. While all of the kriging methods appeared to underestimate high values of arsenic and PC scores, empirical Bayesian kriging was chosen to summarize large scale geographic trends in groundwater-sourced drinking water quality and assess exposure to mixtures of trace metals and ions.
Fault Diagnosis Method Based on Information Entropy and Relative Principal Component Analysis
Directory of Open Access Journals (Sweden)
Xiaoming Xu
2017-01-01
Full Text Available In traditional principal component analysis (PCA), because the dimensional influence of different variables in the system is neglected, the selected principal components (PCs) often fail to be representative. While the relative transformation PCA is able to solve the above problem, it is not easy to calculate the weight for each characteristic variable. To address this, this paper proposes a fault diagnosis method based on information entropy and Relative Principal Component Analysis. Firstly, the algorithm calculates the information entropy for each characteristic variable in the original dataset based on the information gain algorithm. Secondly, it standardizes every variable's dimension in the dataset. Then, according to the information entropy, it allocates the weight for each standardized characteristic variable. Finally, it utilizes the relative-principal-components model established for fault diagnosis. Furthermore, the simulation experiments based on the Tennessee Eastman process and Wine datasets demonstrate the feasibility and effectiveness of the new method.
Principal component analysis for neural electron/jet discrimination in highly segmented calorimeters
International Nuclear Information System (INIS)
Vassali, M.R.; Seixas, J.M.
2001-01-01
A neural electron/jet discriminator based on calorimetry is developed for the second-level trigger system of the ATLAS detector. As preprocessing of the calorimeter information, a principal component analysis is performed on each segment of the two sections (electromagnetic and hadronic) of the calorimeter system, in order to reduce significantly the dimension of the input data space and fully explore the detailed energy deposition profile, which is provided by the highly-segmented calorimeter system. It is shown that projecting calorimeter data onto 33 segmented principal components, the discrimination efficiency of the neural classifier reaches 98.9% for electrons (with only 1% of false alarm probability). Furthermore, restricting data projection onto only 9 components, an electron efficiency of 99.1% is achieved (with 3% of false alarm), which confirms that a fast triggering system may be designed using few components
International Nuclear Information System (INIS)
Sengupta, S.K.; Boyle, J.S.
1993-05-01
Variables describing atmospheric circulation and other climate parameters derived from various GCMs and obtained from observations can be represented on a spatio-temporal grid (lattice) structure. The primary objective of this paper is to explore existing as well as some new statistical methods to analyze such data structures for the purpose of model diagnostics and intercomparison from a statistical perspective. Among the several statistical methods considered here, a new method based on common principal components appears most promising for the purpose of intercomparison of spatio-temporal data structures arising in the task of model/model and model/data intercomparison. A complete strategy for such an intercomparison is outlined. The strategy includes two steps. First, the commonality of spatial structures in two (or more) fields is captured in the common principal vectors. Second, the corresponding principal components obtained as time series are then compared on the basis of similarities in their temporal evolution
Wojciechowski, Adam
2017-04-01
were distinguished: hypsometric component (PC1), deciduous forest habitats component (PC2), river valleys and alder habitats component (PC3), and lakes component (PC4). The distinguished factors characterise natural qualities of postglacial area and reflect well the role of the four most important groups of environment components in shaping ecodiversity of the area under study. The map of ecodiversity of Debnica Kaszubska commune was created on the basis of the first four principal component scores and then five classes of diversity were isolated: very low, low, average, high and very high. As a result of the assessment, five commune regions of very high ecodiversity were separated. These regions are also very attractive for tourists and valuable in terms of their rich nature which include protected areas such as Slupia Valley Landscape Park. The suggested method of ecodiversity assessment with the use of principal component analysis may constitute an alternative methodological proposition to other research methods used so far. Literature Jedicke E., 2001. Biodiversität, Geodiversität, Ökodiversität. Kriterien zur Analyse der Landschaftsstruktur - ein konzeptioneller Diskussionsbeitrag. Naturschutz und Landschaftsplanung, 33(2/3), 59-68.
Hongjuan Yu; Jinyun Guo; Jiulong Li; Dapeng Mu; Qiaoli Kong
2015-01-01
Zero drift and solid Earth tide corrections to static relative gravimetric data cannot be ignored. In this paper, a new principal component analysis (PCA) algorithm is presented to extract the zero drift and the solid Earth tide, as signals, from static relative gravimetric data assuming that the components contained in the relative gravimetric data are uncorrelated. Static relative gravity observations from Aug. 15 to Aug. 23, 2014 are used as statistical variables to separate the signal and...
ten Berge, Jos M.F.; Kiers, Henk A.L.
When r Principal Components are available for k variables, the correlation matrix is approximated in the least squares sense by the loading matrix times its transpose. The approximation is generally not perfect unless r = k. In the present paper it is shown that, when r is at or above the Ledermann
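The least-squares approximation described above can be checked numerically. The following is a minimal sketch on synthetic data, assuming loadings defined as eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues; the function name and the data are hypothetical.

```python
import numpy as np

def pca_loadings(R, r):
    """Return the k x r loading matrix of the first r principal components of R."""
    eigvals, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:r]            # indices of the r largest
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

# Synthetic data with k = 5 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
R = np.corrcoef(X, rowvar=False)

errors = []
for r in range(1, 6):
    A = pca_loadings(R, r)
    errors.append(np.linalg.norm(R - A @ A.T))       # Frobenius norm of the residual
```

The residual norm shrinks as r grows and vanishes (up to rounding) only at r = k, which matches the statement that the approximation is generally not perfect unless r = k.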
Energy Technology Data Exchange (ETDEWEB)
Kang, Ho Yang [Korea Research Institute of Standards and Science, Daejeon (Korea, Republic of); Kim, Ki Bok [Chungnam National University, Daejeon (Korea, Republic of)
2003-06-15
In this study, acoustic emission (AE) signals due to surface cracking and moisture movement in the flat-sawn boards of oak (Quercus variabilis) during drying under ambient conditions were analyzed and classified using principal component analysis. The AE signals corresponding to surface cracking were higher in peak amplitude and peak frequency, and shorter in rise time, than those corresponding to moisture movement. To reduce the multicollinearity among AE features and to extract the significant AE parameters, correlation analysis was performed. Over 99% of the variance of AE parameters could be accounted for by the first to the fourth principal components. The classification feasibility and success rate were investigated in terms of two statistical classifiers having six independent variables (AE parameters) and six principal components. As a result, the statistical classifier having AE parameters showed a success rate of 70.0%. The statistical classifier having principal components showed a success rate of 87.5%, which was considerably higher than that of the statistical classifier having AE parameters
International Nuclear Information System (INIS)
Kang, Ho Yang; Kim, Ki Bok
2003-01-01
In this study, acoustic emission (AE) signals due to surface cracking and moisture movement in the flat-sawn boards of oak (Quercus variabilis) during drying under ambient conditions were analyzed and classified using principal component analysis. The AE signals corresponding to surface cracking were higher in peak amplitude and peak frequency, and shorter in rise time, than those corresponding to moisture movement. To reduce the multicollinearity among AE features and to extract the significant AE parameters, correlation analysis was performed. Over 99% of the variance of AE parameters could be accounted for by the first to the fourth principal components. The classification feasibility and success rate were investigated in terms of two statistical classifiers having six independent variables (AE parameters) and six principal components. As a result, the statistical classifier having AE parameters showed a success rate of 70.0%. The statistical classifier having principal components showed a success rate of 87.5%, which was considerably higher than that of the statistical classifier having AE parameters
On combining principal components with Fisher's linear discriminants for supervised learning
Pechenizkiy, M.; Tsymbal, A.; Puuronen, S.
2006-01-01
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic increase of computational complexity and classification error in high dimensions. In this paper, principal component analysis (PCA), parametric feature extraction (FE) based on Fisher’s linear
Gao, Yang; Chen, Maomao; Wu, Junyu; Zhou, Yuan; Cai, Chuangjian; Wang, Daliang; Luo, Jianwen
2017-09-01
Fluorescence molecular imaging has been used to target tumors in mice with xenograft tumors. However, tumor imaging is largely distorted by the aggregation of fluorescent probes in the liver. A principal component analysis (PCA)-based strategy was applied on the in vivo dynamic fluorescence imaging results of three mice with xenograft tumors to facilitate tumor imaging, with the help of a tumor-specific fluorescent probe. Tumor-relevant features were extracted from the original images by PCA and represented by the principal component (PC) maps. The second principal component (PC2) map represented the tumor-related features, and the first principal component (PC1) map retained the original pharmacokinetic profiles, especially of the liver. The distribution patterns of the PC2 map of the tumor-bearing mice were in good agreement with the actual tumor location. The tumor-to-liver ratio and contrast-to-noise ratio were significantly higher on the PC2 map than on the original images, thus distinguishing the tumor from its nearby fluorescence noise of liver. The results suggest that the PC2 map could serve as a bioimaging marker to facilitate in vivo tumor localization, and dynamic fluorescence molecular imaging with PCA could be a valuable tool for future studies of in vivo tumor metabolism and progression.
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
Efficient real time OD matrix estimation based on principal component analysis
Djukic, T.; Flötteröd, G.; Van Lint, H.; Hoogendoorn, S.P.
2012-01-01
In this paper we explore the idea of dimensionality reduction and approximation of OD demand based on principal component analysis (PCA). First, we show how we can apply PCA to linearly transform the high dimensional OD matrices into the lower dimensional space without significant loss of accuracy.
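The linear transformation into a lower-dimensional space and back can be sketched as below. This is a generic PCA reduction on a synthetic matrix, assuming rows are observations (for example, days) and columns are the high-dimensional quantities (for example, OD pairs); the function names, sizes and data are hypothetical and the sketch is not the authors' estimation procedure.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the first n_components principal directions."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]                 # principal directions as rows
    scores = (X - mean) @ components.T             # low-dimensional coordinates
    return scores, components, mean

def pca_reconstruct(scores, components, mean):
    """Map low-dimensional scores back to the original space."""
    return scores @ components + mean

# Synthetic data: 60 observations of 40 variables driven by 3 underlying patterns.
rng = np.random.default_rng(4)
X = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 40))
X += 0.01 * rng.normal(size=X.shape)               # small measurement noise

scores, components, mean = pca_reduce(X, n_components=3)
X_hat = pca_reconstruct(scores, components, mean)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Here 40 columns are summarized by 3 scores per observation, and the relative reconstruction error stays near the noise level because the synthetic data were built from three patterns.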
Application of principal component analysis to time series of daily air pollution and mortality
Quant C; Fischer P; Buringh E; Ameling C; Houthuijs D; Cassee F; MGO
2004-01-01
We investigated whether cause-specific daily mortality can be attributed to specific sources of air pollution. To construct indicators of source-specific air pollution, we applied a principal component analysis (PCA) on routinely collected air pollution data in the Netherlands during the period
Evaluation of skin melanoma in spectral range 450-950 nm using principal component analysis
Jakovels, D.; Lihacova, I.; Kuzmina, I.; Spigulis, J.
2013-06-01
Diagnostic potential of principal component analysis (PCA) of multi-spectral imaging data in the wavelength range 450- 950 nm for distant skin melanoma recognition is discussed. Processing of the measured clinical data by means of PCA resulted in clear separation between malignant melanomas and pigmented nevi.
Fall detection in walking robots by multi-way principal component analysis
Karssen, J.G.; Wisse, M.
2008-01-01
Large disturbances can cause a biped to fall. If an upcoming fall can be detected, damage can be minimized or the fall can be prevented. We introduce the multi-way principal component analysis (MPCA) method for the detection of upcoming falls. We study the detection capability of the MPCA method in
Directory of Open Access Journals (Sweden)
Mohebodini Mehdi
2017-08-01
Full Text Available Landraces of spinach in Iran have not been sufficiently characterised for their morpho-agronomic traits. Such characterisation would be helpful in the development of new genetically improved cultivars. In this study 54 spinach accessions collected from the major spinach growing areas of Iran were evaluated to determine the phenotypic diversity profile of spinach genotypes on the basis of 10 quantitative and 9 qualitative morpho-agronomic traits. High coefficients of variation were recorded in some quantitative traits (dry yield and leaf area) and all of the qualitative traits. Using principal component analysis, the first four principal components with eigenvalues greater than 1 contributed 87% of the variability among accessions for quantitative traits, whereas the first four principal components with eigenvalues greater than 0.8 contributed 79% of the variability among accessions for qualitative traits. The most important relations observed on the first two principal components were a strong positive association between leaf width and petiole length; between leaf length and leaf numbers at flowering; and among fresh yield, dry yield and petiole diameter; and a near-zero correlation of days to flowering with leaf width and petiole length. Prickly seeds, high percentage of female plants, smooth leaf texture, high numbers of leaves at flowering, grey-green leaves, erect petiole attitude and long petiole length are important characters for spinach breeding programmes.
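Retaining components by an eigenvalue threshold, as in the record above, can be sketched as follows. This is a minimal illustration on synthetic data, assuming the eigenvalues are those of the correlation matrix of the traits; the function name and the data are hypothetical, and the numbers it yields are unrelated to the 87% and 79% reported in the study.

```python
import numpy as np

def retained_components(X, threshold=1.0):
    """Count components whose eigenvalue exceeds threshold and their variance share."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest eigenvalue first
    keep = eigvals > threshold
    share = eigvals[keep].sum() / eigvals.sum()      # fraction of total variance kept
    return int(keep.sum()), float(share)

# Synthetic stand-in: 54 rows and 10 traits forming two correlated groups.
rng = np.random.default_rng(3)
latent = rng.normal(size=(54, 2))
X = np.hstack([
    latent[:, [0]] + 0.3 * rng.normal(size=(54, 5)),
    latent[:, [1]] + 0.3 * rng.normal(size=(54, 5)),
])

n_keep, share = retained_components(X, threshold=1.0)
```

Lowering the threshold (the study uses 0.8 for the qualitative traits) can only keep the same number of components or more.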
Principal Component Surface (2011) for St. Thomas East End Reserve, St. Thomas
National Oceanic and Atmospheric Administration, Department of Commerce — This image represents a 0.3x0.3 meter principal component analysis (PCA) surface for areas of the St. Thomas East End Reserve (STEER) in the U.S. Virgin Islands (USVI)....
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
Impact of Autocorrelation on Principal Components and Their Use in Statistical Process Control
DEFF Research Database (Denmark)
Vanhatalo, Erik; Kulahci, Murat
2015-01-01
A basic assumption when using principal component analysis (PCA) for inferential purposes, such as in statistical process control (SPC), is that the data are independent in time. In many industrial processes, frequent sampling and process dynamics make this assumption unrealistic rendering sampled...
k-t PCA: temporally constrained k-t BLAST reconstruction using principal component analysis
DEFF Research Database (Denmark)
Pedersen, Henrik; Kozerke, Sebastian; Ringgaard, Steffen
2009-01-01
in applications exhibiting a broad range of temporal frequencies such as free-breathing myocardial perfusion imaging. We show that temporal basis functions calculated by subjecting the training data to principal component analysis (PCA) can be used to constrain the reconstruction such that the temporal resolution...... is improved. The presented method is called k-t PCA....
DEFF Research Database (Denmark)
Tian, Fang; Rades, Thomas; Sandler, Niklas
2008-01-01
The purpose of this research is to gain a greater insight into the hydrate formation processes of different carbamazepine (CBZ) anhydrate forms in aqueous suspension, where principal component analysis (PCA) was applied for data analysis. The capability of PCA to visualize and to reveal simplified...
Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.
2017-07-01
A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Ghodsi, Ali; Clemmensen, Line H.
2017-01-01
Principal component analysis (PCA) is one of the main unsupervised pre-processing methods for dimension reduction. When the training labels are available, it is worth using a supervised PCA strategy. In cases that both dimension reduction and variable selection are required, sparse PCA (SPCA...
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all living things. With the development of industrialization, water pollution is becoming more and more frequent, directly affecting human survival and development. Water quality detection is one of the necessary measures for protecting water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which partial least squares regression (PLSR) is becoming the predominant technique; however, in some special cases, PLSR produces considerable errors. To solve this problem, the traditional principal component regression (PCR) method was improved in this paper by using the principle of PLSR. The experimental results show that, for some special experimental data sets, the improved PCR method performs better than PLSR. PCR and PLSR are the focus of this paper. Firstly, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal components, which carry most of the original data information, are extracted by using the principle of PLSR. Secondly, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is calculated by PLSR and by improved PCR, and the two results are analyzed and compared: improved PCR and PLSR give similar results for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in UV spectral analysis of water, but for data near the detection limit, improved PCR gives better results than PLSR.
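The two-step workflow described in the abstract above (PCA for dimension reduction, then linear regression on the component scores) corresponds to ordinary principal component regression. The sketch below shows that plain PCR only; it does not reproduce the paper's PLSR-guided improvement, and it uses NumPy rather than the MATLAB and SPSS tools named in the abstract. The function names `pcr_fit` and `pcr_predict` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Plain principal component regression.

    Projects centered X onto its first `n_components` principal directions,
    regresses centered y on those scores, and maps the result back to
    coefficients in the original variable space.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                    # shape (n_features, k)
    scores = Xc @ V                            # shape (n_samples, k)
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                           # back to original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept


def pcr_predict(X, beta, intercept):
    return X @ beta + intercept


# Synthetic, noise-free data (illustrative only, not spectral measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
true_beta = np.array([1.0, -2.0, 0.5, 0.0, 0.0, 3.0])
y = X @ true_beta + 4.0

beta, intercept = pcr_fit(X, y, n_components=6)
y_hat = pcr_predict(X, beta, intercept)
```

With all six components retained, PCR coincides with ordinary least squares, so the synthetic coefficients are recovered; using fewer components trades some fit for lower variance, which is the usual motivation for PCR on collinear spectral data.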
Kim, Se-Kang; Davison, Mark L.
A study was conducted to examine how principal components analysis (PCA) and Profile Analysis via Multidimensional Scaling (PAMS) can be used to diagnose individuals' observed score profiles in terms of core profile patterns identified by each method. The standardization sample from the Wechsler Intelligence Scale for Children, Third Edition…
Shaffer, John R; Feingold, Eleanor; Wang, Xiaojing; Tcuenco, Karen T; Weeks, Daniel E; DeSensi, Rebecca S; Polk, Deborah E; Wendell, Steve; Weyant, Robert J; Crout, Richard; McNeil, Daniel W; Marazita, Mary L
2012-03-09
Dental caries is the result of a complex interplay among environmental, behavioral, and genetic factors, with distinct patterns of decay likely due to specific etiologies. Therefore, global measures of decay, such as the DMFS index, may not be optimal for identifying risk factors that manifest as specific decay patterns, especially if the risk factors such as genetic susceptibility loci have small individual effects. We used two methods to extract patterns of decay from surface-level caries data in order to generate novel phenotypes with which to explore the genetic regulation of caries. The 128 tooth surfaces of the permanent dentition were scored as carious or not by intra-oral examination for 1,068 participants aged 18 to 75 years from 664 biological families. Principal components analysis (PCA) and factor analysis (FA), two methods of identifying underlying patterns without a priori surface classifications, were applied to our data. The three strongest caries patterns identified by PCA recaptured variation represented by DMFS index (correlation, r = 0.97), pit and fissure surface caries (r = 0.95), and smooth surface caries (r = 0.89). However, together, these three patterns explained only 37% of the variability in the data, indicating that a priori caries measures are insufficient for fully quantifying caries variation. In comparison, the first pattern identified by FA was strongly correlated with pit and fissure surface caries (r = 0.81), but other identified patterns, including a second pattern representing caries of the maxillary incisors, were not representative of any previously defined caries indices. Some patterns identified by PCA and FA were heritable (h² = 30-65%, p = 0.043-0.006), whereas other patterns were not, indicating both genetic and non-genetic etiologies of individual decay patterns. This study demonstrates the use of decay patterns as novel phenotypes to assist in understanding the multifactorial nature of dental caries.
International Nuclear Information System (INIS)
Di Maria, Costanzo; Liu, Chengyu; Zheng, Dingchang; Murray, Alan; Langley, Philip
2014-01-01
This study presents a systematic comparison of different approaches to the automated selection of the principal components (PC) which optimise the detection of maternal and fetal heart beats from non-invasive maternal abdominal recordings. A public database of 75 4-channel non-invasive maternal abdominal recordings was used for training the algorithm. Four methods were developed and assessed to determine the optimal PC: (1) power spectral distribution, (2) root mean square, (3) sample entropy, and (4) QRS template. The sensitivity of the performance of the algorithm to large-amplitude noise removal (by wavelet de-noising) and maternal beat cancellation methods was also assessed. The accuracy of maternal and fetal beat detection was assessed against reference annotations and quantified using the detection accuracy score F1 [2*PPV*Se / (PPV + Se)], sensitivity (Se), and positive predictive value (PPV). The best performing implementation was assessed on a test dataset of 100 recordings and the agreement between the computed and the reference fetal heart rate (fHR) and fetal RR (fRR) time series quantified. The best performance for detecting maternal beats (F1 99.3%, Se 99.0%, PPV 99.7%) was obtained when using the QRS template method to select the optimal maternal PC and applying wavelet de-noising. The best performance for detecting fetal beats (F1 89.8%, Se 89.3%, PPV 90.5%) was obtained when the optimal fetal PC was selected using the sample entropy method and utilising a fixed-length time window for the cancellation of the maternal beats. The performance on the test dataset was 142.7 beats²/min² for fHR and 19.9 ms for fRR, ranking respectively 14 and 17 (out of 29) when compared to the other algorithms presented at the Physionet Challenge 2013. (paper)
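The detection accuracy score quoted in the abstract above, F1 = 2*PPV*Se / (PPV + Se), is the harmonic mean of sensitivity and positive predictive value. A minimal sketch follows; the function name `f1_score` is a hypothetical choice for illustration, and the inputs are the rounded maternal-beat figures from the abstract, so the result agrees with the reported F1 only to within rounding.

```python
def f1_score(se: float, ppv: float) -> float:
    """Harmonic mean of sensitivity (Se) and positive predictive value (PPV).

    Both inputs are fractions in [0, 1]; returns 0.0 when both are zero.
    """
    if se + ppv == 0:
        return 0.0
    return 2 * ppv * se / (ppv + se)


# Rounded maternal-beat values reported in the abstract: Se 99.0%, PPV 99.7%.
maternal_f1 = f1_score(se=0.990, ppv=0.997)
print(f"{maternal_f1:.3f}")  # prints 0.993
```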
Principal Component and Cluster Analysis as a Tool in the Assessment of Tomato Hybrids and Cultivars
Directory of Open Access Journals (Sweden)
G. Evgenidis
2011-01-01
Determination of germplasm diversity and genetic relationships among breeding materials is an invaluable aid in crop improvement strategies. This study assessed the breeding value of tomato source material. Two commercial hybrids along with an experimental hybrid and four cultivars were assessed with cluster and principal component analyses based on morphophysiological data, yield and quality, stability of performance, heterosis, and combining abilities. The assessment of commercial hybrids revealed a related origin and subsequently does not support the identification of promising offspring in their crossing. The assessment of the cultivars discriminated them according to origin and evolutionary and selection effects. On Principal Component 1, the largest group with positive loadings included yield components, heterosis, and general and specific combining ability, whereas the largest negative loadings were obtained by qualitative and descriptive traits. Principal Component 2 revealed two smaller groups, a positive one with phenotypic traits and a negative one with tolerance to inbreeding. Stability of performance was loaded positively and/or negatively. In conclusion, combining ability, yield components, and heterosis provided a mechanism for ensuring continued improvement in plant selection programs.
International Nuclear Information System (INIS)
Zarzo, Manuel; Marti, Pau
2011-01-01
Research highlights: → Principal components analysis was applied to R_s data recorded at 30 stations. → Four principal components explain 97% of the data variability. → The latent variables can be fitted according to latitude, longitude and altitude. → The PCA approach is more effective for gap infilling than conventional approaches. → The proposed method allows daily R_s estimations at locations in the area of study. Abstract: Measurements of global terrestrial solar radiation (R_s) are commonly recorded in meteorological stations. Daily variability of R_s has to be taken into account for the design of photovoltaic systems and energy efficient buildings. Principal components analysis (PCA) was applied to R_s data recorded at 30 stations on the Mediterranean coast of Spain. Due to equipment failures and site operation problems, time series of R_s often present data gaps or discontinuities. The PCA approach copes with this problem and allows estimation of present and past values by taking advantage of R_s records from nearby stations. The gap infilling performance of this methodology is compared with neural networks and alternative conventional approaches. Four principal components explain 66% of the data variability with respect to the average trajectory (97% if non-centered values are considered). A new method based on principal components regression was also developed for R_s estimation if previous measurements are not available. By means of multiple linear regression, it was found that the latent variables associated with the four relevant principal components can be fitted according to the latitude, longitude and altitude of the station where the data were recorded. Additional geographical or climatic variables did not increase the predictive goodness-of-fit. The resulting models allow the estimation of daily R_s values at any location in the area under study and present higher accuracy than artificial neural networks and some conventional approaches
Panazzolo, Diogo G; Sicuro, Fernando L; Clapauch, Ruth; Maranhão, Priscila A; Bouskela, Eliete; Kraemer-Aguiar, Luiz G
2012-11-13
We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Data from 189 female subjects (34.0 ± 15.5 years, 30.5 ± 7.1 kg/m2), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCV(max)), and the time taken to reach RBCV(max) (TRBCV(max)). A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCV(max) varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCV(max), but in the opposite way. Principal component 3 was associated only with microvascular
Directory of Open Access Journals (Sweden)
Panazzolo Diogo G
2012-11-01
Background: We aimed to evaluate the multivariate association between functional microvascular variables and clinical-laboratorial-anthropometrical measurements. Methods: Data from 189 female subjects (34.0±15.5 years, 30.5±7.1 kg/m²), who were non-smokers, non-regular drug users, without a history of diabetes and/or hypertension, were analyzed by principal component analysis (PCA). PCA is a classical multivariate exploratory tool because it highlights common variation between variables allowing inferences about possible biological meaning of associations between them, without pre-establishing cause-effect relationships. In total, 15 variables were used for PCA: body mass index (BMI), waist circumference, systolic and diastolic blood pressure (BP), fasting plasma glucose, levels of total cholesterol, high-density lipoprotein cholesterol (HDL-c), low-density lipoprotein cholesterol (LDL-c), triglycerides (TG), insulin, C-reactive protein (CRP), and functional microvascular variables measured by nailfold videocapillaroscopy. Nailfold videocapillaroscopy was used for direct visualization of nutritive capillaries, assessing functional capillary density, red blood cell velocity (RBCV) at rest and peak after 1 min of arterial occlusion (RBCVmax), and the time taken to reach RBCVmax (TRBCVmax). Results: A total of 35% of subjects had metabolic syndrome, 77% were overweight/obese, and 9.5% had impaired fasting glucose. PCA was able to recognize that functional microvascular variables and clinical-laboratorial-anthropometrical measurements had a similar variation. The first five principal components explained most of the intrinsic variation of the data. For example, principal component 1 was associated with BMI, waist circumference, systolic BP, diastolic BP, insulin, TG, CRP, and TRBCVmax varying in the same way. Principal component 1 also showed a strong association among HDL-c, RBCV, and RBCVmax, but in the opposite way. Principal component 3 was
Principal component reconstruction (PCR) for cine CBCT with motion learning from 2D fluoroscopy.
Gao, Hao; Zhang, Yawei; Ren, Lei; Yin, Fang-Fang
2018-01-01
This work aims to generate cine CT images (i.e., 4D images with high-temporal resolution) based on a novel principal component reconstruction (PCR) technique with motion learning from 2D fluoroscopic training images. In the proposed PCR method, the matrix factorization is utilized as an explicit low-rank regularization of 4D images that are represented as a product of spatial principal components and temporal motion coefficients. The key hypothesis of PCR is that temporal coefficients from 4D images can be reasonably approximated by temporal coefficients learned from 2D fluoroscopic training projections. For this purpose, we can acquire fluoroscopic training projections for a few breathing periods at fixed gantry angles that are free from geometric distortion due to gantry rotation, that is, fluoroscopy-based motion learning. Such training projections can provide an effective characterization of the breathing motion. The temporal coefficients can be extracted from these training projections and used as priors for PCR, even though principal components from training projections are certainly not the same for these 4D images to be reconstructed. For this purpose, training data are synchronized with reconstruction data using identical real-time breathing position intervals for projection binning. In terms of image reconstruction, with a priori temporal coefficients, the data fidelity for PCR changes from nonlinear to linear, and consequently, the PCR method is robust and can be solved efficiently. PCR is formulated as a convex optimization problem with the sum of linear data fidelity with respect to spatial principal components and spatiotemporal total variation regularization imposed on 4D image phases. The solution algorithm of PCR is developed based on alternating direction method of multipliers. The implementation is fully parallelized on GPU with NVIDIA CUDA toolbox and each reconstruction takes about a few minutes. The proposed PCR method is validated and
Bouhlel, Jihéne; Jouan-Rimbaud Bouveresse, Delphine; Abouelkaram, Said; Baéza, Elisabeth; Jondreville, Catherine; Travel, Angélique; Ratel, Jérémy; Engel, Erwan; Rutledge, Douglas N
2018-02-01
The aim of this work is to compare a novel exploratory chemometrics method, Common Components Analysis (CCA), with Principal Components Analysis (PCA) and Independent Components Analysis (ICA). CCA consists in adapting the multi-block statistical method known as Common Components and Specific Weights Analysis (CCSWA or ComDim) by applying it to a single data matrix, with one variable per block. As an application, the three methods were applied to SPME-GC-MS volatolomic signatures of livers in an attempt to reveal volatile organic compounds (VOCs) markers of chicken exposure to different types of micropollutants. An application of CCA to the initial SPME-GC-MS data revealed a drift in the sample Scores along CC2, as a function of injection order, probably resulting from time-related evolution in the instrument. This drift was eliminated by orthogonalization of the data set with respect to CC2, and the resulting data are used as the orthogonalized data input into each of the three methods. Since the first step in CCA is to norm-scale all the variables, preliminary data scaling has no effect on the results, so that CCA was applied only to orthogonalized SPME-GC-MS data, while, PCA and ICA were applied to the "orthogonalized", "orthogonalized and Pareto-scaled", and "orthogonalized and autoscaled" data. The comparison showed that PCA results were highly dependent on the scaling of variables, contrary to ICA where the data scaling did not have a strong influence. Nevertheless, for both PCA and ICA the clearest separations of exposed groups were obtained after autoscaling of variables. The main part of this work was to compare the CCA results using the orthogonalized data with those obtained with PCA and ICA applied to orthogonalized and autoscaled variables. The clearest separations of exposed chicken groups were obtained by CCA. CCA Loadings also clearly identified the variables contributing most to the Common Components giving separations. The PCA Loadings did not
Young, Cole; Reinkensmeyer, David J
2014-08-01
Athletes rely on subjective assessment of complex movements from coaches and judges to improve their motor skills. In some sports, such as diving, snowboard half pipe, gymnastics, and figure skating, subjective scoring forms the basis for competition. It is currently unclear whether this scoring process can be mathematically modeled; doing so could provide insight into what motor skill is. Principal components analysis has been proposed as a motion analysis method for identifying fundamental units of coordination. We used PCA to analyze movement quality of dives taken from USA Diving's 2009 World Team Selection Camp, first identifying eigenpostures associated with dives, and then using the eigenpostures and their temporal weighting coefficients, as well as elements commonly assumed to affect scoring - gross body path, splash area, and board tip motion - to identify eigendives. Within this eigendive space we predicted actual judges' scores using linear regression. This technique rated dives with accuracy comparable to the human judges. The temporal weighting of the eigenpostures, body center path, splash area, and board tip motion affected the score, but not the eigenpostures themselves. These results illustrate that (1) subjective scoring in a competitive diving event can be mathematically modeled; (2) the elements commonly assumed to affect dive scoring actually do affect scoring (3) skill in elite diving is more associated with the gross body path and the effect of the movement on the board and water than the units of coordination that PCA extracts, which might reflect the high level of technique these divers had achieved. We also illustrate how eigendives can be used to produce dive animations that an observer can distort continuously from poor to excellent, which is a novel approach to performance visualization. Copyright © 2014 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Suwicha Jirayucharoensak
2014-01-01
Automatic emotion recognition is one of the most challenging tasks. To detect emotion from nonstationary EEG signals, a sophisticated learning algorithm that can represent high-level abstraction is required. This study proposes the utilization of a deep learning network (DLN) to discover unknown feature correlation between input signals that is crucial for the learning task. The DLN is implemented with a stacked autoencoder (SAE) using a hierarchical feature learning approach. Input features of the network are power spectral densities of 32-channel EEG signals from 32 subjects. To alleviate the overfitting problem, principal component analysis (PCA) is applied to extract the most important components of the initial input features. Furthermore, covariate shift adaptation of the principal components is implemented to minimize the nonstationary effect of EEG signals. Experimental results show that the DLN is capable of classifying three different levels of valence and arousal with accuracy of 49.52% and 46.03%, respectively. Principal component based covariate shift adaptation enhances the respective classification accuracy by 5.55% and 6.53%. Moreover, DLN provides better performance compared to SVM and naive Bayes classifiers.
Directory of Open Access Journals (Sweden)
Badaruddoza
2015-09-01
The current study focused to determine significant cardiovascular risk factors through principal component factor analysis (PCFA) among three generations on 1827 individuals in three generations including 911 males (378 from offspring, 439 from parental and 94 from grand-parental generations) and 916 females (261 from offspring, 515 from parental and 140 from grandparental generations). The study performed PCFA with orthogonal rotation to reduce 12 inter-correlated variables into groups of independent factors. The factors have been identified as 2 for male grandparents, 3 for male offspring, female parents and female grandparents each, 4 for male parents and 5 for female offspring. This data reduction method identified these factors that explained 72%, 84%, 79%, 69%, 70% and 73% for male and female offspring, male and female parents and male and female grandparents respectively, of the variations in original quantitative traits. The factor 1 accounting for the largest portion of variations was strongly loaded with factors related to obesity (body mass index (BMI), waist circumference (WC), waist to hip ratio (WHR), and thickness of skinfolds) among all generations with both sexes, which has been known to be an independent predictor for cardiovascular morbidity and mortality. The second largest components, factor 2 and factor 3 for almost all generations reflected traits of blood pressure phenotypes loaded, however, in male offspring generation it was observed that factor 2 was loaded with blood pressure phenotypes as well as obesity. This study not only confirmed but also extended prior work by developing a cumulative risk scale from factor scores. Till today, such a cumulative and extensive scale has not been used in any Indian studies with individuals of three generations. These findings and study highlight the importance of global approach for assessing the risk and need for studies that elucidate how these different cardiovascular risk factors
Badaruddoza; Kumar, Raman; Kaur, Manpreet
2015-09-01
The current study focused to determine significant cardiovascular risk factors through principal component factor analysis (PCFA) among three generations on 1827 individuals in three generations including 911 males (378 from offspring, 439 from parental and 94 from grand-parental generations) and 916 females (261 from offspring, 515 from parental and 140 from grandparental generations). The study performed PCFA with orthogonal rotation to reduce 12 inter-correlated variables into groups of independent factors. The factors have been identified as 2 for male grandparents, 3 for male offspring, female parents and female grandparents each, 4 for male parents and 5 for female offspring. This data reduction method identified these factors that explained 72%, 84%, 79%, 69%, 70% and 73% for male and female offspring, male and female parents and male and female grandparents respectively, of the variations in original quantitative traits. The factor 1 accounting for the largest portion of variations was strongly loaded with factors related to obesity (body mass index (BMI), waist circumference (WC), waist to hip ratio (WHR), and thickness of skinfolds) among all generations with both sexes, which has been known to be an independent predictor for cardiovascular morbidity and mortality. The second largest components, factor 2 and factor 3 for almost all generations reflected traits of blood pressure phenotypes loaded, however, in male offspring generation it was observed that factor 2 was loaded with blood pressure phenotypes as well as obesity. This study not only confirmed but also extended prior work by developing a cumulative risk scale from factor scores. Till today, such a cumulative and extensive scale has not been used in any Indian studies with individuals of three generations. These findings and study highlight the importance of global approach for assessing the risk and need for studies that elucidate how these different cardiovascular risk factors interact with
Principal component analysis of the nonlinear coupling of harmonic modes in heavy-ion collisions
Bożek, Piotr
2018-03-01
The principal component analysis of flow correlations in heavy-ion collisions is studied. The correlation matrix of harmonic flow is generalized to correlations involving several different flow vectors. The method can be applied to study the nonlinear coupling between different harmonic modes in a double differential way in transverse momentum or pseudorapidity. The procedure is illustrated with results from the hydrodynamic model applied to Pb + Pb collisions at $\sqrt{s_{NN}} = 2760$ GeV. Three examples of generalized correlation matrices in transverse momentum are constructed, corresponding to the coupling of $v_2^2$ and $v_4$, of $v_2 v_3$ and $v_5$, or of $v_2^3$, $v_3^3$, and $v_6$. The principal component decomposition is applied to the correlation matrices and the dominant modes are calculated.
Directory of Open Access Journals (Sweden)
Haorui Liu
2016-01-01
In car control systems, it is hard to measure some key vehicle states directly and accurately while driving on the road, and the cost of measurement is high as well. To address these problems, a vehicle state estimation method based on kernel principal component analysis and an improved Elman neural network is proposed. Combined with a nonlinear vehicle model with three degrees of freedom (3 DOF: longitudinal, lateral, and yaw motion), this paper applies the method to soft sensing of the vehicle states. The simulation results of the double lane change test, run as a Matlab/SIMULINK cosimulation, show the KPCA-IENN algorithm (kernel principal component analysis and improved Elman neural network) to be quick and precise when tracking the vehicle states within the nonlinear area. This algorithm can meet the software performance requirements of vehicle state estimation in precision, tracking speed, noise suppression, and other aspects.
Directory of Open Access Journals (Sweden)
S. Roy
2013-12-01
The present investigation is an experimental approach to deposit an electroless Ni-P-W coating on a mild steel substrate and to find the optimum combination of tribological parameters on the basis of minimum friction and wear, using weighted principal component analysis (WPCA). In this study three main tribological parameters are chosen, viz. load (A), speed (B) and time (C). The responses are coefficient of friction and wear depth. The WPCA method is adopted to convert the multiple responses into a single performance index called the multiple performance index (MPI), and a Taguchi L27 orthogonal array is used to design the experiment and to find the optimum combination of tribological parameters for minimum coefficient of friction and wear depth. ANOVA is performed to find the significance of each tribological process parameter and of their interactions. EDX analysis, SEM and XRD are performed to study the composition and structural aspects.
Cloud Masking for Remotely Sensed Data Using Spectral and Principal Components Analysis
Directory of Open Access Journals (Sweden)
A. Ahmad
2012-06-01
Two methods of cloud masking tuned to tropical conditions have been developed, based on spectral analysis and Principal Components Analysis (PCA) of Moderate Resolution Imaging Spectroradiometer (MODIS) data. In the spectral approach, thresholds were applied to four reflective bands (1, 2, 3, and 4), three thermal bands (29, 31 and 32), the band 2/band 1 ratio, and the difference between bands 29 and 31 in order to detect clouds. The PCA approach applied a threshold to the first principal component derived from the seven quantities used for spectral analysis. Cloud detections were compared with the standard MODIS cloud mask, and their accuracy was assessed using reference images and geographical information on the study area.
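The PCA step described above can be illustrated with a minimal sketch: project per-pixel quantities onto the first principal component and apply a threshold to the scores. The function names, the synthetic data, the threshold of 0.0 and the sign convention are all assumptions made for illustration; the abstract does not state how the component was oriented or which threshold was used.

```python
import numpy as np

def first_pc_scores(features):
    """Project rows of `features` (n_pixels, n_features) onto the first principal component."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    # The sign of a principal axis is arbitrary; orient it so that larger
    # feature values give larger scores (an assumption of this sketch).
    if axis.sum() < 0:
        axis = -axis
    return centered @ axis

def pc1_cloud_mask(features, threshold):
    """Flag pixels whose first-PC score exceeds `threshold` (hypothetical rule)."""
    return first_pc_scores(features) > threshold

# Synthetic stand-in for seven per-pixel quantities: dark "clear" and bright "cloudy" pixels.
rng = np.random.default_rng(0)
clear = 0.1 + 0.02 * rng.standard_normal((50, 7))
cloudy = 0.8 + 0.02 * rng.standard_normal((50, 7))
pixels = np.vstack([clear, cloudy])
mask = pc1_cloud_mask(pixels, threshold=0.0)
```

Because the scores are centered, a threshold of zero separates the two synthetic groups here; real imagery would need a threshold tuned against reference data.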
Directory of Open Access Journals (Sweden)
Pengyu Gao
2016-03-01
It is difficult to forecast well productivity because of the complexity of vertical and horizontal developments in fluvial facies reservoirs. This paper proposes a method based on Principal Component Analysis and Artificial Neural Network to predict the well productivity of fluvial facies reservoirs. The method summarizes the statistical reservoir factors and engineering factors that affect well productivity, extracts information by applying principal component analysis, and uses the ability of the neural network to approximate arbitrary functions to realize an accurate and efficient prediction of fluvial facies reservoir well productivity. This method provides an effective way of forecasting the productivity of fluvial facies reservoirs, which is affected by multiple factors and complex mechanisms. The study result shows that this is a practical, effective, accurate and indirect productivity forecast method that is suitable for field application.
A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions: First, we propose a computationally less intensive approximate leave-one-out estimator, secondly, we show that variance inflation is also present in kernel principal component analysis (kPCA) and we provide a non-parametric renormalization scheme which can quite efficiently restore generalizability in kPCA. As for PCA our analysis also suggests a simplified approximate expression. © 2011 Trine J. Abrahamsen and Lars K. Hansen.
Variability search in M 31 using principal component analysis and the Hubble Source Catalogue
Moretti, M. I.; Hatzidimitriou, D.; Karampelas, A.; Sokolovsky, K. V.; Bonanos, A. Z.; Gavras, P.; Yang, M.
2018-06-01
Principal component analysis (PCA) is being extensively used in Astronomy but not yet exhaustively exploited for variability search. The aim of this work is to investigate the effectiveness of using the PCA as a method to search for variable stars in large photometric data sets. We apply PCA to variability indices computed for light curves of 18 152 stars in three fields in M 31 extracted from the Hubble Source Catalogue. The projection of the data into the principal components is used as a stellar variability detection and classification tool, capable of distinguishing between RR Lyrae stars, long-period variables (LPVs) and non-variables. This projection recovered more than 90 per cent of the known variables and revealed 38 previously unknown variable stars (about 30 per cent more), all LPVs except for one object of uncertain variability type. We conclude that this methodology can indeed successfully identify candidate variable stars.
Evaluation of functional scintigraphy of gastric emptying by the principal component method
Energy Technology Data Exchange (ETDEWEB)
Haeussler, M.; Eilles, C.; Reiners, C.; Moll, E.; Boerner, W.
1980-10-01
Gastric emptying of a standard semifluid test meal, labeled with 99mTc-DTPA, was studied by functional scintigraphy in 88 subjects (normals, patients with duodenal and gastric ulcer before and after selective proximal vagotomy with and without pyloroplasty). Gastric emptying curves were analysed by the method of principal components. Patients after selective proximal vagotomy with pyloroplasty showed a rapid initial emptying, whereas this was a rare finding in patients after selective proximal vagotomy without pyloroplasty. The method of principal components is well suited for mathematical analysis of gastric emptying; nevertheless the results are difficult to interpret. The method has advantages when looking at larger collectives and allows a separation into groups with different gastric emptying.
Principal component analysis of NEXAFS spectra for molybdenum speciation in hydrotreating catalysts
International Nuclear Information System (INIS)
Faro Junior, Arnaldo da C.; Rodrigues, Victor de O.; Eon, Jean-G.; Rocha, Angela S.
2010-01-01
Bulk and supported molybdenum based catalysts, modified by nickel, phosphorous or tungsten were studied by NEXAFS spectroscopy at the Mo L III and L II edges. The techniques of principal component analysis (PCA) together with a linear combination analysis (LCA) allowed the detection and quantification of molybdenum atoms in two different coordination states in the oxide form of the catalysts, namely tetrahedral and octahedral coordination. (author)
Duforet-Frebourg, Nicolas; Luu, Keurcien; Laval, Guillaume; Bazin, Eric; Blum, Michael G B
2016-04-01
To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis (PCA). We show that the common FST index of genetic differentiation between populations can be viewed as the proportion of variance explained by the principal components. Considering the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we consider the 1000 Genomes data (phase 1), comprising 850 individuals from Africa, Asia, and Europe. The number of genetic variants is of the order of 36 million, obtained with a low-coverage sequencing depth (3×). The correlations between genetic variation and each principal component provide well-known targets for positive selection (EDAR, SLC24A5, SLC45A2, DARC), and also new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and noncoding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). An additional analysis of European data shows that a genome scan based on PCA retrieves classical examples of local adaptation even when there are no well-defined populations. PCA-based statistics, implemented in the PCAdapt R package and the PCAdapt fast open-source software, retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing projects, especially when defining populations is difficult. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
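The idea of correlating each genetic variant with each principal component can be sketched as follows. This is not the PCAdapt implementation; the function name, the choice to center without scaling, and the simulated two-population genotypes are assumptions made only to show the computation.

```python
import numpy as np

def variant_pc_correlations(genotypes, n_components):
    """Pearson correlation between each variant (column) and each of the first PCs."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # PC scores of the individuals; columns have zero mean because the data are centered.
    scores = u[:, :n_components] * s[:n_components]
    variant_norm = np.linalg.norm(centered, axis=0)
    score_norm = np.linalg.norm(scores, axis=0)
    # Guard against monomorphic variants (zero variance): their correlation becomes 0.
    variant_norm = np.where(variant_norm == 0, np.inf, variant_norm)
    return (centered.T @ scores) / (variant_norm[:, None] * score_norm[None, :])

# Simulated genotypes (0, 1 or 2 copies of an allele) for two populations.
# The first 20 variants differ strongly in allele frequency; the other 30 do not.
rng = np.random.default_rng(1)
freq_a = np.r_[np.full(20, 0.1), np.full(30, 0.5)]
freq_b = np.r_[np.full(20, 0.9), np.full(30, 0.5)]
pop_a = rng.binomial(2, freq_a, size=(50, 50))
pop_b = rng.binomial(2, freq_b, size=(50, 50))
genotypes = np.vstack([pop_a, pop_b]).astype(float)
r = variant_pc_correlations(genotypes, n_components=2)
```

In this simulation the differentiated variants show large squared correlations with the first component without any population labels being supplied, which is the property the abstract relies on.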
Influencing Factors of Catering and Food Service Industry Based on Principal Component Analysis
Zi Tang
2014-01-01
Scientific analysis of influencing factors is of great importance for the healthy development of catering and food service industry. This study attempts to present a set of critical indicators for evaluating the contribution of influencing factors to catering and food service industry in the particular context of Harbin City, Northeast China. Ten indicators that correlate closely with catering and food service industry were identified and performed by the principal component analysis method u...
Oluwasuji Dada, Joshua
2014-01-01
The purpose of this paper is to examine the intrinsic relationships among sets of quantity surveyors’ skill and competence variables with a view to reducing them into principal components. The research adopts a data reduction technique using factor analysis statistical technique. A structured questionnaire was administered among major stakeholders in the Nigerian construction industry. The respondents were asked to give rating, on a 5 point Likert scale, on skills and competencies re...
Jakovels, Dainis; Lihacova, Ilze; Kuzmina, Ilona; Spigulis, Janis
2013-11-01
Non-invasive and fast primary diagnostics of pigmented skin lesions is required due to frequent incidence of skin cancer - melanoma. Diagnostic potential of principal component analysis (PCA) for distant skin melanoma recognition is discussed. Processing of the measured clinical multi-spectral images (31 melanomas and 94 nonmalignant pigmented lesions) in the wavelength range of 450-950 nm by means of PCA resulted in 87 % sensitivity and 78 % specificity for separation between malignant melanomas and pigmented nevi.
Karpuzcu, M Ekrem; Fairbairn, David; Arnold, William A; Barber, Brian L; Kaufenberg, Elizabeth; Koskinen, William C; Novak, Paige J; Rice, Pamela J; Swackhamer, Deborah L
2014-01-01
Principal components analysis (PCA) was used to identify sources of emerging organic contaminants in the Zumbro River watershed in Southeastern Minnesota. Two main principal components (PCs) were identified, which together explained more than 50% of the variance in the data. Principal Component 1 (PC1) was attributed to urban wastewater-derived sources, including municipal wastewater and residential septic tank effluents, while Principal Component 2 (PC2) was attributed to agricultural sources. The variances of the concentrations of cotinine, DEET and the prescription drugs carbamazepine, erythromycin and sulfamethoxazole were best explained by PC1, while the variances of the concentrations of the agricultural pesticides atrazine, metolachlor and acetochlor were best explained by PC2. Mixed use compounds carbaryl, iprodione and daidzein did not specifically group with either PC1 or PC2. Furthermore, despite the fact that caffeine and acetaminophen have been historically associated with human use, they could not be attributed to a single dominant land use category (e.g., urban/residential or agricultural). Contributions from septic systems did not clarify the source for these two compounds, suggesting that additional sources, such as runoff from biosolid-amended soils, may exist. Based on these results, PCA may be a useful way to broadly categorize the sources of new and previously uncharacterized emerging contaminants or may help to clarify transport pathways in a given area. Acetaminophen and caffeine were not ideal markers for urban/residential contamination sources in the study area and may need to be reconsidered as such in other areas as well.
Modelling the Load Curve of Aggregate Electricity Consumption Using Principal Components
Matteo Manera; Angelo Marzullo
2003-01-01
Since oil is a non-renewable resource with a high environmental impact, and its most common use is to produce combustibles for electricity, reliable methods for modelling electricity consumption can contribute to a more rational employment of this hydrocarbon fuel. In this paper we apply the Principal Components (PC) method to modelling the load curves of Italy, France and Greece on hourly data of aggregate electricity consumption. The empirical results obtained with the PC approach are compa...
An application of principal component analysis to the clavicle and clavicle fixation devices.
LENUS (Irish Health Repository)
Daruwalla, Zubin J
2010-01-01
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories.
Bisele, M; Bencsik, M; Lewis, MGC; Barnett, CT
2017-01-01
Assessment methods in human locomotion often involve the description of normalised graphical profiles and/or the extraction of discrete variables. Whilst useful, these approaches may not represent the full complexity of gait data. Multivariate statistical methods, such as Principal Component Analysis (PCA) and Discriminant Function Analysis (DFA), have been adopted since they have the potential to overcome these data handling issues. The aim of the current study was to develop and optimise a ...
An application of principal component analysis to the clavicle and clavicle fixation devices
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-01-01
Abstract Background Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Materials and method...
Directory of Open Access Journals (Sweden)
Christian NZENGUE PEGNET
2011-07-01
The recent financial turmoil has clearly highlighted the potential role of financial factors in the amplification of macroeconomic developments and stressed the importance of analyzing the relationship between banks’ balance sheets and economic activity. This paper assesses the impact of the bank capital channel in the transmission of shocks in Europe on the basis of banks’ balance sheet data. The empirical analysis is carried out through a Principal Component Analysis and a Vector Error Correction Model.
THE STUDY OF THE CHARACTERIZATION INDICES OF FABRICS BY PRINCIPAL COMPONENT ANALYSIS METHOD
HRISTIAN Liliana; OSTAFE Maria Magdalena; BORDEIANU Demetra Lacramioara; APOSTOL Laura Liliana
2017-01-01
The paper set out to prioritize worsted fabric types for the manufacture of outerwear products by characterization indices of fabrics, using the mathematical model of Principal Component Analysis (PCA). There are a number of variables with a certain influence on the quality of fabrics, but some of these variables are more important than others, so it is useful to identify those variables for a better understanding of the factors which can lead to improving fabric quality. A s...
Mjørud, Marit; Kirkevold, Marit; Røsvik, Janne; Engedal, Knut
2014-01-01
To investigate which factors the Quality of Life in Late-Stage Dementia (QUALID) scale holds when used among people with dementia (pwd) in nursing homes and to find out how the symptom load varies across the different severity levels of dementia. We included 661 pwd [mean age ± SD, 85.3 ± 8.6 years; 71.4% women]. The QUALID and the Clinical Dementia Rating (CDR) scale were applied. A principal component analysis (PCA) with varimax rotation and Kaiser normalization was applied to test the factor structure. Nonparametric analyses were applied to examine differences of symptom load across the three CDR groups. The mean QUALID score was 21.5 (±7.1), and the CDR scores of the three groups were 1 in 22.5%, 2 in 33.6% and 3 in 43.9%. The results of the statistical measures employed were the following: Cronbach's α of QUALID, 0.74; Bartlett's test of sphericity, p Kaiser-Meyer-Olkin measure, 0.77. The PCA analysis resulted in three components accounting for 53% of the variance. The first component was 'tension' ('facial expression of discomfort', 'appears physically uncomfortable', 'verbalization suggests discomfort', 'being irritable and aggressive', 'appears calm', Cronbach's α = 0.69), the second was 'well-being' ('smiles', 'enjoys eating', 'enjoys touching/being touched', 'enjoys social interaction', Cronbach's α = 0.62) and the third was 'sadness' ('appears sad', 'cries', 'facial expression of discomfort', Cronbach's α = 0.65). The mean score on the components 'tension' and 'well-being' increased significantly with increasing severity levels of dementia. Three components of quality of life (qol) were identified. Qol decreased with increasing severity of dementia. © 2013 S. Karger AG, Basel.
Zhang, Qiong; Peng, Cong; Lu, Yiming; Wang, Hao; Zhu, Kaiguang
2018-04-01
A novel technique is developed to level airborne geophysical data using principal component analysis based on flight line difference. In the paper, flight line difference is introduced to enhance the features of levelling error for airborne electromagnetic (AEM) data and improve the correlation between pseudo tie lines. Thus we conduct levelling to the flight line difference data instead of to the original AEM data directly. Pseudo tie lines are selected distributively cross profile direction, avoiding the anomalous regions. Since the levelling errors of selective pseudo tie lines show high correlations, principal component analysis is applied to extract the local levelling errors by low-order principal components reconstruction. Furthermore, we can obtain the levelling errors of original AEM data through inverse difference after spatial interpolation. This levelling method does not need to fly tie lines and design the levelling fitting function. The effectiveness of this method is demonstrated by the levelling results of survey data, comparing with the results from tie-line levelling and flight-line correlation levelling.
Dynamic of consumer groups and response of commodity markets by principal component analysis
Nobi, Ashadun; Alam, Shafiqul; Lee, Jae Woo
2017-09-01
This study investigates financial states and group dynamics by applying principal component analysis to the cross-correlation coefficients of the daily returns of commodity futures. The eigenvalues of the cross-correlation matrix in the 6-month timeframe display similar values during 2010-2011, but decline following 2012. A sharp drop in eigenvalue implies a significant change of the market state. Three commodity sectors, energy, metals and agriculture, are projected into two-dimensional spaces consisting of two principal components (PC). We observe that they form three distinct clusters in relation to the various sectors. However, commodities with distinct features intermingled with one another and scattered during severe crises, such as the European sovereign debt crises. We observe a notable change in the position of the groups in the two-dimensional spaces during financial crises. By considering the first principal component (PC1) within the 6-month moving timeframe, we observe that commodities of the same group change states in a similar pattern, and the change of states of one group can be used as a warning for other groups.
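Tracking the largest eigenvalue of the cross-correlation matrix over a moving window can be sketched as below. The window and step lengths, the function name and the simulated returns are assumptions for illustration; the study's own data and 6-month window are not reproduced here.

```python
import numpy as np

def rolling_largest_eigenvalue(returns, window, step):
    """Largest eigenvalue of the cross-correlation matrix in each moving window."""
    n_obs = returns.shape[0]
    largest = []
    for start in range(0, n_obs - window + 1, step):
        # Columns are assets, rows are days, hence rowvar=False.
        corr = np.corrcoef(returns[start:start + window], rowvar=False)
        # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
        largest.append(np.linalg.eigvalsh(corr)[-1])
    return np.array(largest)

# Simulated daily returns for 10 assets: a strong common factor drives the
# first 250 days, after which the assets move independently.
rng = np.random.default_rng(4)
common = rng.standard_normal((250, 1))
coupled = common + 0.5 * rng.standard_normal((250, 10))
independent = rng.standard_normal((250, 10))
returns = np.vstack([coupled, independent])
eigs = rolling_largest_eigenvalue(returns, window=125, step=125)
```

The largest eigenvalue is high while the assets share a common driver and drops once they decouple, which mirrors the kind of shift the abstract reads as a change of market state.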
Assessing the effect of oil price on world food prices: Application of principal component analysis
International Nuclear Information System (INIS)
Esmaeili, Abdoulkarim; Shokoohi, Zainab
2011-01-01
The objective of this paper is to investigate the co-movement of food prices and the macroeconomic index, especially the oil price, by principal component analysis to further understand the influence of the macroeconomic index on food prices. We examined the food prices of seven major products: eggs, meat, milk, oilseeds, rice, sugar and wheat. The macroeconomic variables studied were crude oil prices, consumer price indexes, food production indexes and GDP around the world between 1961 and 2005. We use the Scree test and the proportion of variance method for determining the optimal number of common factors. The correlation coefficient between the extracted principal component and the macroeconomic index varies between 0.87 for the world GDP and 0.36 for the consumer price index. We find the food production index has the greatest influence on the macroeconomic index and that the oil price index has an influence on the food production index. Consequently, crude oil prices have an indirect effect on food prices. - Research Highlights: →We investigate the co-movement of food prices and the macroeconomic index. →The crude oil price has indirect effect on the world GDP via its impacts on food production index. →The food production index is the source of causation for CPI and GDP is affected by CPI. →The results confirm an indirect effect among oil price, food price principal component.
[Content of mineral elements of Gastrodia elata by principal components analysis].
Li, Jin-ling; Zhao, Zhi; Liu, Hong-chang; Luo, Chun-li; Huang, Ming-jin; Luo, Fu-lai; Wang, Hua-lei
2015-03-01
To study the content of mineral elements and the principal components in Gastrodia elata. Mineral elements were determined by ICP and the data were analyzed by SPSS. K had the highest content, with an average of 15.31 g·kg⁻¹. The average content of N was 8.99 g·kg⁻¹, second only to K. The coefficients of variation of K and N were small, while that of Mn was the largest at 51.39%. A highly significant positive correlation was found among N, P and K. Three principal components were selected by principal components analysis to evaluate the quality of G. elata. P, B, N, K, Cu, Mn, Fe and Mg were the characteristic elements of G. elata. The content of K and N was higher and relatively stable. The variation of Mn content was the largest. The quality of G. elata in Guizhou and Yunnan was better from the perspective of mineral elements.
Sánchez-Sánchez, M Luz; Belda-Lois, Juan-Manuel; Mena-Del Horno, Silvia; Viosca-Herrero, Enrique; Igual-Camacho, Celedonia; Gisbert-Morant, Beatriz
2018-05-05
A major goal in stroke rehabilitation is the establishment of more effective physical therapy techniques to recover postural stability. Functional Principal Component Analysis provides greater insight into recovery trends. However, when missing values exist, obtaining functional data presents some difficulties. The purpose of this study was to reveal an alternative technique for obtaining the Functional Principal Components without requiring the conversion to functional data beforehand and to investigate this methodology to determine the effect of specific physical therapy techniques in balance recovery trends in elderly subjects with hemiplegia post-stroke. A randomized controlled pilot trial was developed. Thirty inpatients post-stroke were included. Control and target groups were treated with the same conventional physical therapy protocol based on functional criteria, but specific techniques were added to the target group depending on the subjects' functional level. Postural stability during standing was quantified by posturography. The assessments were performed once a month from the moment the participants were able to stand up to six months post-stroke. The target group showed a significant improvement in postural control recovery trend six months after stroke that was not present in the control group. Some of the assessed parameters revealed significant differences between treatment groups (P Functional Principal Component Analysis to be performed when data is scarce. Moreover, it allowed the dynamics of recovery of two different treatment groups to be determined, showing that the techniques added in the target group increased postural stability compared to the base protocol. Copyright © 2018 Elsevier Ltd. All rights reserved.
Dovbeshko, G. I.; Repnytska, O. P.; Pererva, T.; Miruta, A.; Kosenkov, D.
2004-07-01
Conformation analysis of mutated DNA bacteriophages (PLys-23, P23-2, P47; the numbers were assigned by T. Pererva) induced by MS2 virus incorporated in E. coli AB 259 Hfr 3000 has been done. Surface enhanced infrared absorption (SEIRA) spectroscopy and principal component analysis have been applied to solve this problem. The nucleic acids isolated from the mutated phages had the form of double-stranded DNA with different modifications. The nucleic acid from phage P47 underwent the greatest structural rearrangement. The shape and position of the fine structure of the phosphate asymmetrical band at 1071 cm-1, as well as the stretching OH vibration at 3370-3390 cm-1, indicated the appearance of additional OH groups. The Z-form feature has been found in the base vibration region (1694 cm-1) and the sugar region (932 cm-1). A modification of the DNA structure by Z-fragments is proposed for the P47 phage. The P23-2 and PLys-23 phages also showed numerous minor structural changes. On the basis of SEIRA spectra we have determined the characteristic parameters of the marker bands of nucleic acid used for construction of principal components. The contribution of different spectral parameters of nucleic acids to the principal components has been estimated.
THE STUDY OF THE CHARACTERIZATION INDICES OF FABRICS BY PRINCIPAL COMPONENT ANALYSIS METHOD
Directory of Open Access Journals (Sweden)
HRISTIAN Liliana
2017-05-01
The paper set out to prioritize worsted fabric types for the manufacture of outerwear products by characterization indices of fabrics, using the mathematical model of Principal Component Analysis (PCA). There are a number of variables with a certain influence on the quality of fabrics, but some of these variables are more important than others, so it is useful to identify those variables for a better understanding of the factors which can lead to improving fabric quality. A solution to this problem can be the application of a method of factorial analysis, the so-called Principal Component Analysis, with the final goal of establishing and analyzing those variables which significantly influence the internal structure of combed wool fabrics according to armire type. Applying PCA yields a small number of linear combinations (principal components) from a set of variables describing the internal structure of the fabrics, which retain as much information as possible from the original variables. Data analysis is an important initial step in decision making, allowing identification of the causes that lead to a decision-making situation. It is thus the action of transforming the initial data in order to extract useful information and to facilitate reaching conclusions. The process of data analysis can be defined as a sequence of steps aimed at formulating hypotheses, collecting and validating primary information, constructing the mathematical model describing the phenomenon, and reaching conclusions about the behavior of this model.
Strale, Mathieu; Krysinska, Karolina; Overmeiren, Gaëtan Van; Andriessen, Karl
2017-06-01
This study investigated the geographic distribution of suicide and railway suicide in Belgium over 2008-2013 at the local (i.e., district or arrondissement) level. There were differences in the regional distribution of suicide and railway suicides in Belgium over the study period. Principal component analysis identified three groups of correlations among population variables and socio-economic indicators, such as population density, unemployment, and age group distribution, on two components that helped explain the variance of railway suicide at a local (arrondissement) level. This information is of particular importance to prevent suicides in high-risk areas on the Belgian railway network.
International Nuclear Information System (INIS)
Jesse, Stephen; Kalinin, Sergei V
2009-01-01
An approach for the analysis of multi-dimensional, spectroscopic-imaging data based on principal component analysis (PCA) is explored. PCA selects and ranks relevant response components based on variance within the data. It is shown that for examples with small relative variations between spectra, the first few PCA components closely coincide with results obtained using model fitting, and this is achieved at rates approximately four orders of magnitude faster. For cases with strong response variations, PCA allows an effective approach to rapidly process, de-noise, and compress data. The prospects for PCA combined with correlation function analysis of component maps as a universal tool for data analysis and representation in microscopy are discussed.
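De-noising and compression by PCA amount to keeping only the leading components and reconstructing the data from them. The sketch below assumes a data matrix with one spectrum per row and uses simulated rank-2 spectra; the function name and the choice of two components are illustrative assumptions, not details from the paper.

```python
import numpy as np

def pca_denoise(spectra, n_components):
    """Reconstruct `spectra` (n_pixels, n_channels) from its leading principal components."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    k = n_components
    # Keep k score columns (u * s) and k loading rows (vt); discard the rest as noise.
    return mean + (u[:, :k] * s[:k]) @ vt[:k]

# Simulated data: 200 spectra of 50 channels built from two underlying responses plus noise.
rng = np.random.default_rng(2)
clean = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 50))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = pca_denoise(noisy, n_components=2)
```

Storing the two score columns, the two loading rows and the mean instead of the full matrix is also what gives the compression: here roughly 550 numbers stand in for 10,000.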
A multi-dimensional functional principal components analysis of EEG data.
Hasenstab, Kyle; Scheffler, Aaron; Telesca, Donatello; Sugar, Catherine A; Jeste, Shafali; DiStefano, Charlotte; Şentürk, Damla
2017-09-01
The electroencephalography (EEG) data created in event-related potential (ERP) experiments have a complex high-dimensional structure. Each stimulus presentation, or trial, generates an ERP waveform which is an instance of functional data. The experiments are made up of sequences of multiple trials, resulting in longitudinal functional data and moreover, responses are recorded at multiple electrodes on the scalp, adding an electrode dimension. Traditional EEG analyses involve multiple simplifications of this structure to increase the signal-to-noise ratio, effectively collapsing the functional and longitudinal components by identifying key features of the ERPs and averaging them across trials. Motivated by an implicit learning paradigm used in autism research in which the functional, longitudinal, and electrode components all have critical interpretations, we propose a multidimensional functional principal components analysis (MD-FPCA) technique which does not collapse any of the dimensions of the ERP data. The proposed decomposition is based on separation of the total variation into subject and subunit level variation which are further decomposed in a two-stage functional principal components analysis. The proposed methodology is shown to be useful for modeling longitudinal trends in the ERP functions, leading to novel insights into the learning patterns of children with Autism Spectrum Disorder (ASD) and their typically developing peers as well as comparisons between the two groups. Finite sample properties of MD-FPCA are further studied via extensive simulations. © 2017, The International Biometric Society.
Stuckey, Bronwyn G A; Opie, Nicole; Cussons, Andrea J; Watts, Gerald F; Burke, Valerie
2014-08-01
Polycystic ovary syndrome (PCOS) is a prevalent condition with heterogeneity of clinical features and cardiovascular risk factors that implies multiple aetiological factors and possible outcomes. To reduce a set of correlated variables to a smaller number of uncorrelated and interpretable factors that may delineate subgroups within PCOS or suggest pathogenetic mechanisms. We used principal component analysis (PCA) to examine the endocrine and cardiometabolic variables associated with PCOS defined by the National Institutes of Health (NIH) criteria. Data were retrieved from the database of a single clinical endocrinologist. We included women with PCOS (N = 378) who were not taking the oral contraceptive pill or other sex hormones, lipid lowering medication, metformin or other medication that could influence the variables of interest. PCA was performed retaining those factors with eigenvalues of at least 1.0. Varimax rotation was used to produce interpretable factors. We identified three principal components. In component 1, the dominant variables were homeostatic model assessment (HOMA) index, body mass index (BMI), high density lipoprotein (HDL) cholesterol and sex hormone binding globulin (SHBG); in component 2, systolic blood pressure, low density lipoprotein (LDL) cholesterol and triglycerides; in component 3, total testosterone and LH/FSH ratio. These components explained 37%, 13% and 11% of the variance in the PCOS cohort respectively. Multiple correlated variables from patients with PCOS can be reduced to three uncorrelated components characterised by insulin resistance, dyslipidaemia/hypertension or hyperandrogenaemia. Clustering of risk factors is consistent with different pathogenetic pathways within PCOS and/or differing cardiometabolic outcomes. Copyright © 2014 Elsevier Inc. All rights reserved.
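Retaining components with eigenvalues of at least 1.0 and then applying a varimax rotation can be sketched as follows. This is a generic NumPy illustration on simulated data, not the authors' statistical software or their clinical variables; the function names, iteration limits and the simulated two-factor structure are assumptions.

```python
import numpy as np

def kaiser_loadings(data, min_eigenvalue=1.0):
    """Loadings of the correlation-matrix components whose eigenvalue is >= min_eigenvalue."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]              # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue
    return eigvecs[:, keep] * np.sqrt(eigvals[keep]), eigvals

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (n_variables, n_factors) loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ col_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                          # nearest orthogonal matrix to the gradient
        if s.sum() < criterion * (1 + tol):        # stop once the criterion stops improving
            break
        criterion = s.sum()
    return loadings @ rotation

# Simulated data: four variables driven by one latent factor, three by another.
rng = np.random.default_rng(3)
f = rng.standard_normal((500, 2))
data = np.hstack([f[:, [0]] + 0.3 * rng.standard_normal((500, 4)),
                  f[:, [1]] + 0.3 * rng.standard_normal((500, 3))])
loadings, eigvals = kaiser_loadings(data)
rotated = varimax(loadings)
```

The rotation is orthogonal, so each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loadings across components is altered to ease interpretation.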
Li, Xiaozhou; Yang, Tianyue; Li, Siqi; Wang, Deli; Song, Youtao; Zhang, Su
2016-03-01
This paper attempts to investigate the feasibility of using Raman spectroscopy for the diagnosis of colon cancer. Serum taken from 75 healthy volunteers, 65 colon cancer patients and 60 post-operation colon cancer patients was measured in this experiment. In the Raman spectra of all three groups, the Raman peaks at 750, 1083, 1165, 1321, 1629 and 1779 cm⁻¹ assigned to nucleic acids, amino acids and chromophores were consistently observed. All of these six Raman peaks were observed to have statistically significant differences between groups. For quantitative analysis, the multivariate statistical techniques of principal component analysis (PCA) and k nearest neighbour analysis (KNN) were utilized to develop diagnostic algorithms for classification. In PCA, several peaks in the principal component (PC) loadings spectra were identified as the major contributors to the PC scores. Some of the peaks in the PC loadings spectra were also reported as characteristic peaks for colon tissues, which implies correlation between peaks in PC loadings spectra and those in the original Raman spectra. KNN was also performed on the obtained PCs, and a diagnostic accuracy of 91.0% and a specificity of 92.6% were achieved.
International Nuclear Information System (INIS)
Li, Xiaozhou; Yang, Tianyue; Wang, Deli; Li, Siqi; Song, Youtao; Zhang, Su
2016-01-01
This paper attempts to investigate the feasibility of using Raman spectroscopy for the diagnosis of colon cancer. Serum taken from 75 healthy volunteers, 65 colon cancer patients and 60 post-operation colon cancer patients was measured in this experiment. In the Raman spectra of all three groups, the Raman peaks at 750, 1083, 1165, 1321, 1629 and 1779 cm⁻¹ assigned to nucleic acids, amino acids and chromophores were consistently observed. All of these six Raman peaks were observed to have statistically significant differences between groups. For quantitative analysis, the multivariate statistical techniques of principal component analysis (PCA) and k nearest neighbour analysis (KNN) were utilized to develop diagnostic algorithms for classification. In PCA, several peaks in the principal component (PC) loadings spectra were identified as the major contributors to the PC scores. Some of the peaks in the PC loadings spectra were also reported as characteristic peaks for colon tissues, which implies correlation between peaks in PC loadings spectra and those in the original Raman spectra. KNN was also performed on the obtained PCs, and a diagnostic accuracy of 91.0% and a specificity of 92.6% were achieved. (paper)
Wang, Zhuozheng; Deller, J. R.; Fleet, Blair D.
2016-01-01
Acquired digital images are often corrupted by a lack of camera focus, faulty illumination, or missing data. An algorithm is presented for fusion of multiple corrupted images of a scene using the lifting wavelet transform. The method employs adaptive fusion arithmetic based on matrix completion and self-adaptive regional variance estimation. Characteristics of the wavelet coefficients are used to adaptively select fusion rules. Robust principal component analysis is applied to low-frequency image components, and regional variance estimation is applied to high-frequency components. Experiments reveal that the method is effective for multifocus, visible-light, and infrared image fusion. Compared with traditional algorithms, the new algorithm not only increases the amount of preserved information and clarity but also improves robustness.
The use of principal components and univariate charts to control multivariate processes
Directory of Open Access Journals (Sweden)
Marcela A. G. Machado
2008-04-01
In this article, we evaluate the performance of the T² chart based on the principal components (PC chart) and the simultaneous univariate control charts based on the original variables (SU X charts) or based on the principal components (SUPC charts). The main reason to consider the PC chart lies in the dimensionality reduction. However, depending on the disturbance and on the way the original variables are related, the chart is very slow in signaling, except when all variables are negatively correlated and the principal component is wisely selected. Comparing the SU X, the SUPC and the T² charts, we conclude that the SU X charts (SUPC charts) have a better overall performance when the variables are positively (negatively) correlated. We also develop the expression to obtain the power of two S² charts designed for monitoring the covariance matrix. These joint S² charts are, in the majority of the cases, more efficient than the generalized variance chart.
Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.
2008-11-01
We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.
International Nuclear Information System (INIS)
Baloch, M.J.
2003-01-01
Nine upland cotton varieties/strains were tested over 36 environments in Pakistan to determine their stability in yield performance. The regression coefficient (b) was used as a measure of adaptability, whereas parameters such as the coefficient of determination (r²) and the sum of squared deviations from regression (s²d) were used as measures of stability. Although the regression coefficients (b) of all varieties did not deviate significantly from the unit slope, the varieties CRIS-5A, BII-89, DNH-40 and Rehmani gave b values closer to unity, implying their better adaptation. The lower s²d and higher r² of CRIS-121 and DNH-40 suggest that both of these are fairly stable. The results indicate that, generally, adaptability and stability parameters are independent of each other, inasmuch as not all of the parameters simultaneously favoured one variety over the others, except the variety DNH-40, which was stable based on the majority of the parameters. Principal component analysis revealed that the first two components (latent roots) account for about 91.4% of the total variation. The latent vectors of the first principal component (PCA1) were smaller and positive, which also suggests that most of the varieties were quite adaptive to all of the test environments. (author)
The application of principal component analysis to quantify technique in sports.
Federolf, P; Reid, R; Gilgien, M; Haugen, P; Smith, G
2014-06-01
Analyzing an athlete's "technique," sport scientists often focus on preselected variables that quantify important aspects of movement. In contrast, coaches and practitioners typically describe movements in terms of basic postures and movement components using subjective and qualitative features. A challenge for sport scientists is finding an appropriate quantitative methodology that incorporates the holistic perspective of human observers. Using alpine ski racing as an example, this study explores principal component analysis (PCA) as a mathematical method to decompose a complex movement pattern into its main movement components. Ski racing movements were recorded by determining the three-dimensional coordinates of 26 points on each skier which were subsequently interpreted as a 78-dimensional posture vector at each time point. PCA was then used to determine the mean posture and principal movements (PMk ) carried out by the athletes. The first four PMk contained 95.5 ± 0.5% of the variance in the posture vectors which quantified changes in body inclination, vertical or fore-aft movement of the trunk, and distance between skis. In summary, calculating PMk offered a data-driven, quantitative, and objective method of analyzing human movement that is similar to how human observers such as coaches or ski instructors would describe the movement. © 2012 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
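The posture-vector construction described above (26 markers with three coordinates each, giving a 78-dimensional vector per time point) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `principal_movements`, the use of an SVD, and the synthetic marker trajectories are hypothetical and are not the authors' processing pipeline, which may include normalization steps not shown here.

```python
import numpy as np

def principal_movements(markers):
    """Decompose marker trajectories into a mean posture and principal movements.

    markers: array of shape (T, 26, 3), i.e. T time points, 26 markers, xyz.
    Hypothetical helper for illustration only.
    """
    T = markers.shape[0]
    postures = markers.reshape(T, -1)            # (T, 78) posture vectors
    mean_posture = postures.mean(axis=0)         # average posture
    centered = postures - mean_posture
    # Rows of Vt are orthonormal directions in posture space (the PM_k)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    var_frac = s**2 / np.sum(s**2)               # variance share of each PM_k
    return mean_posture, Vt, var_frac

# Synthetic example (an assumption): one oscillating movement plus small noise
rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0 * np.pi, 100)
base = rng.normal(size=(26, 3))
mode = rng.normal(size=(26, 3))
markers = base + np.sin(t)[:, None, None] * mode
markers = markers + 0.01 * rng.normal(size=markers.shape)
mean_posture, pms, var_frac = principal_movements(markers)
print(var_frac[:4].sum())
```

Summing the leading entries of `var_frac` gives the kind of cumulative figure quoted in the abstract for the first four principal movements.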
International Nuclear Information System (INIS)
Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.
2014-01-01
For the improvement of qualitative and quantitative traits, the existence of variability is of prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed by correlation, agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for a future breeding program. Correlation analysis revealed a significant positive association between yield and yield components such as fruit diameter, single fruit weight and number of fruits plant⁻¹. Principal component (PC) analysis identified the first three PCs, with eigenvalues higher than 1, as contributing 81.72% of the total variability for different traits. PC-I showed positive factor loadings for all the traits except number of fruits plant⁻¹. The contribution of single fruit weight and fruit diameter was highest in PC-I. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D² statistics confirmed the highest distance between cluster-III and cluster-V, while maximum similarity was observed between cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and allow selection of superior genotypes in succeeding generations for cross breeding programmes. (author)
Machine learning of frustrated classical spin models. I. Principal component analysis
Wang, Ce; Zhai, Hui
2017-10-01
This work aims at determining whether artificial intelligence can recognize a phase transition without prior human knowledge. If this were successful, it could be applied to, for instance, analyzing data from the quantum simulation of unsolved physical models. Toward this goal, we first need to apply the machine learning algorithm to well-understood models and see whether the outputs are consistent with our prior knowledge, which serves as the benchmark for this approach. In this work, we feed the computer data generated by the classical Monte Carlo simulation for the XY model in frustrated triangular and union jack lattices, which has two order parameters and exhibits two phase transitions. We show that the outputs of the principal component analysis agree very well with our understanding of different orders in different phases, and the temperature dependences of the major components detect the nature and the locations of the phase transitions. Our work offers promise for using machine learning techniques to study sophisticated statistical models, and our results can be further improved by using principal component analysis with kernel tricks and the neural network method.
Ioele, Giuseppina; De Luca, Michele; Dinç, Erdal; Oliverio, Filomena; Ragno, Gaetano
2011-01-01
A chemometric approach based on the combined use of the principal component analysis (PCA) and artificial neural network (ANN) was developed for the multicomponent determination of caffeine (CAF), mepyramine (MEP), phenylpropanolamine (PPA) and pheniramine (PNA) in their pharmaceutical preparations without any chemical separation. The predictive ability of the ANN method was compared with the classical linear regression method Partial Least Squares 2 (PLS2). The UV spectral data between 220 and 300 nm of a training set of sixteen quaternary mixtures were processed by PCA to reduce the dimensions of input data and eliminate the noise coming from instrumentation. Several spectral ranges and different numbers of principal components (PCs) were tested to find the PCA-ANN and PLS2 models reaching the best determination results. A two layer ANN, using the first four PCs, was used with log-sigmoid transfer function in first hidden layer and linear transfer function in output layer. Standard error of prediction (SEP) was adopted to assess the predictive accuracy of the models when subjected to external validation. PCA-ANN showed better prediction ability in the determination of PPA and PNA in synthetic samples with added excipients and pharmaceutical formulations. Since both components are characterized by low absorptivity, the better performance of PCA-ANN was ascribed to the ability in considering all non-linear information from noise or interfering excipients.
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
An application of principal component analysis to the clavicle and clavicle fixation devices
Directory of Open Access Journals (Sweden)
Fitzpatrick David
2010-03-01
Background: Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Materials and methods: Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. Results: The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. Discussion and conclusions: This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
Competition analysis on the operating system market using principal component analysis
Directory of Open Access Journals (Sweden)
Brătucu, G.
2011-01-01
The operating system market has evolved greatly. The largest software producer in the world, Microsoft, dominates the operating systems segment. With three operating systems (Windows XP, Windows Vista and Windows 7), the company held a market share of 87.54% in January 2011. Over time, open source operating systems have begun to penetrate the market very strongly, affecting other manufacturers. Companies such as Apple Inc. and Google Inc. have also penetrated the operating system market. This paper aims to compare the best-selling operating systems on the market in terms of their defining characteristics. For this purpose, the principal components analysis method was used.
Smilek, Jan; Hadas, Zdenek
2017-02-01
In this paper we propose the use of principal component analysis to process measured acceleration data in order to determine the direction of acceleration with the highest variance at a given frequency of interest. This method can be used for improving the power generated by inertial energy harvesters. Their power output is highly dependent on the excitation acceleration magnitude and frequency, but the axes of acceleration measurements might not always be perfectly aligned with the directions of movement. The generated power output might therefore be severely underestimated in simulations, possibly leading to false conclusions about the feasibility of using the inertial energy harvester for the examined application.
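The idea of finding the acceleration direction with the highest variance near a frequency of interest can be sketched as below. This is an illustration under loud assumptions: the abstract does not say how the frequency band is isolated, so the FFT masking step, the function name `dominant_direction`, its parameters, and the synthetic signal are all hypothetical choices, not the authors' method.

```python
import numpy as np

def dominant_direction(acc, fs, f0, half_width):
    """Unit vector of highest acceleration variance within [f0 - hw, f0 + hw].

    acc: (N, 3) tri-axial acceleration samples, fs: sampling rate in Hz.
    The FFT band mask is an assumed way of selecting the frequency of interest.
    """
    acc = np.asarray(acc, dtype=float)
    acc = acc - acc.mean(axis=0)                       # remove static offset
    n = acc.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec = np.fft.rfft(acc, axis=0)
    spec[np.abs(freqs - f0) > half_width] = 0.0        # keep only the band
    band = np.fft.irfft(spec, n=n, axis=0)
    cov = np.cov(band, rowvar=False)                   # 3 x 3 covariance
    eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues
    return eigvecs[:, -1], eigvals[-1] / eigvals.sum()

# Synthetic signal (an assumption): 2 Hz motion along a tilted axis,
# plus a stronger 10 Hz vibration along z and a little sensor noise
rng = np.random.default_rng(2)
fs, n = 100.0, 1000
t = np.arange(n) / fs
true_dir = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
acc = np.outer(np.sin(2 * np.pi * 2.0 * t), true_dir)
acc += 3.0 * np.outer(np.sin(2 * np.pi * 10.0 * t), [0.0, 0.0, 1.0])
acc += 0.05 * rng.normal(size=acc.shape)
direction, share = dominant_direction(acc, fs, f0=2.0, half_width=0.5)
print(direction, share)
```

The sign of the returned eigenvector is arbitrary, so comparisons against a known axis should use the absolute value of the dot product.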
Evaluation of in-core measurements by means of principal components method
International Nuclear Information System (INIS)
Makai, M.; Temesvari, E.
1996-01-01
Surveillance of a nuclear reactor core includes determining the three-dimensional (3D) power distribution of the assemblies. The power of each non-measured assembly is calculated from the measured values of other assemblies with the help of the principal components method (PCM), which is also presented. The measured values are interpolated for different geometrical coverings of the WWER-440 core. Different procedures have been elaborated and investigated; the most successful methods are discussed. Each method offers self-consistent means to determine the numerical errors of the interpolated values. (author). 13 refs, 7 figs, 2 tabs
DEFF Research Database (Denmark)
Rasmussen, Peter Mondrup; Abrahamsen, Trine Julie; Madsen, Kristoffer Hougaard
2012-01-01
We investigate the use of kernel principal component analysis (PCA) and the inverse problem known as pre-image estimation in neuroimaging: i) We explore kernel PCA and pre-image estimation as a means for image denoising as part of the image preprocessing pipeline. Evaluation of the denoising...... procedure is performed within a data-driven split-half evaluation framework. ii) We introduce manifold navigation for exploration of a nonlinear data manifold, and illustrate how pre-image estimation can be used to generate brain maps in the continuum between experimentally defined brain states/classes. We...
Directory of Open Access Journals (Sweden)
Oliveira-Esquerre K.P.
2002-01-01
This work presents a way to predict the biochemical oxygen demand (BOD) of the output stream of the biological wastewater treatment plant at RIPASA S/A Celulose e Papel, one of the major pulp and paper plants in Brazil. The best prediction performance is achieved when the data are preprocessed using principal components analysis (PCA) before they are fed to a backpropagated neural network. The influence of input variables is analyzed and satisfactory prediction results are obtained for an optimized situation.
Kernel Principal Component Analysis and its Applications in Face Recognition and Active Shape Models
Wang, Quan
2012-01-01
Principal component analysis (PCA) is a popular tool for linear dimensionality reduction and feature extraction. Kernel PCA is the nonlinear form of PCA, which better exploits the complicated spatial structure of high-dimensional features. In this paper, we first review the basic ideas of PCA and kernel PCA. Then we focus on the reconstruction of pre-images for kernel PCA. We also give an introduction on how PCA is used in active shape models (ASMs), and discuss how kernel PCA can be applied ...
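As a rough companion to the review of kernel PCA above, the training-set projection step can be sketched with a Gaussian (RBF) kernel. The kernel choice, the function name `rbf_kernel_pca`, the `gamma` value, and the two-ring test data are assumptions for illustration; the sketch does not cover pre-image reconstruction or active shape models, and it is not the paper's code.

```python
import numpy as np

def rbf_kernel_pca(X, gamma, n_components):
    """Project training points onto the leading kernel principal components.

    Hypothetical helper using an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).
    """
    X = np.asarray(X, dtype=float)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T     # squared distances
    K = np.exp(-gamma * np.maximum(d2, 0.0))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix, i.e. center the data in feature space
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)              # ascending order
    eigvals = eigvals[::-1][:n_components]
    eigvecs = eigvecs[:, ::-1][:, :n_components]
    # Scores of the training points: eigenvector scaled by sqrt(eigenvalue)
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals

# Synthetic data (an assumption): two concentric rings in the plane
rng = np.random.default_rng(3)
angles = rng.uniform(0.0, 2.0 * np.pi, size=120)
radii = np.where(np.arange(120) < 60, 1.0, 3.0)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
scores, eigvals = rbf_kernel_pca(X, gamma=0.5, n_components=2)
print(scores.shape, eigvals)
```

Unlike linear PCA, the components here are never formed explicitly in feature space; only the n by n kernel matrix is decomposed, which is why mapping a point back to input space (the pre-image problem the paper focuses on) needs a separate procedure.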
Enhancement of Jahani (Firouzabad) salt dome lithological units, using principal components analysis
Directory of Open Access Journals (Sweden)
Houshang Pourcaseb
2016-04-01
In this study, principal components analysis was used to investigate the lithological characteristics of the Jahani salt dome, Firouzabad. The spectral curves of rocks in the study area show that the evaporite rocks have the highest reflectance at specified wavelengths. The highest reflectance is seen in gypsum and white salt, while minimal reflectance is observed in the igneous rocks of the region. The results show that the PCs have significantly low information. It is clear that PC1 carries more information, with the highest variance, while PC2 has less information. The regional geological map and field controls show compatibility between the enhanced zones and outcrops in the field.
InterFace: A software package for face image warping, averaging, and principal components analysis.
Kramer, Robin S S; Jenkins, Rob; Burton, A Mike
2017-12-01
We describe InterFace, a software package for research in face recognition. The package supports image warping, reshaping, averaging of multiple face images, and morphing between faces. It also supports principal components analysis (PCA) of face images, along with tools for exploring the "face space" produced by PCA. The package uses a simple graphical user interface, allowing users to perform these sophisticated image manipulations without any need for programming knowledge. The program is available for download in the form of an app, which requires that users also have access to the (freely available) MATLAB Runtime environment.
A principal components approach to parent-to-newborn body composition associations in South India
Directory of Open Access Journals (Sweden)
Hill Jacqueline C
2009-02-01
Background: Size at birth is influenced by environmental factors, like maternal nutrition and parity, and by genes. Birth weight is a composite measure, encompassing bone, fat and lean mass. These may have different determinants. The main purpose of this paper was to use anthropometry and principal components analysis (PCA) to describe maternal and newborn body composition, and associations between them, in an Indian population. We also compared maternal and paternal measurements (body mass index (BMI) and height) as predictors of newborn body composition. Methods: Weight, height, head and mid-arm circumferences, skinfold thicknesses and external pelvic diameters were measured at 30 ± 2 weeks gestation in 571 pregnant women attending the antenatal clinic of the Holdsworth Memorial Hospital, Mysore, India. Paternal height and weight were also measured. At birth, detailed neonatal anthropometry was performed. Unrotated and varimax rotated PCA was applied to the maternal and neonatal measurements. Results: Rotated PCA reduced maternal measurements to 4 independent components (fat, pelvis, height and muscle) and neonatal measurements to 3 components (trunk+head, fat, and leg length). An SD increase in maternal fat was associated with a 0.16 SD increase (β) in neonatal fat (p Conclusion: Principal components analysis is a useful method to describe neonatal body composition and its determinants. Newborn adiposity is related to maternal nutritional status and parity, while newborn length is genetically determined. Further research is needed to understand mechanisms linking maternal pelvic size to fetal growth and the determinants and implications of the components (trunk v leg length) of fetal skeletal growth.
A Filtering of Incomplete GNSS Position Time Series with Probabilistic Principal Component Analysis
Gruszczynski, Maciej; Klos, Anna; Bogusz, Janusz
2018-04-01
For the first time, we introduce probabilistic principal component analysis (pPCA) for the spatio-temporal filtering of Global Navigation Satellite System (GNSS) position time series, to estimate and remove Common Mode Error (CME) without interpolating missing values. We used data from the International GNSS Service (IGS) stations which contributed to the latest International Terrestrial Reference Frame (ITRF2014). The efficiency of the proposed algorithm was tested on simulated incomplete time series; CME was then estimated for a set of 25 stations located in Central Europe. The newly applied pPCA was compared with previously used algorithms, which showed that this method is capable of resolving the problem of proper spatio-temporal filtering of GNSS time series characterized by different observation time spans. We showed that filtering can be carried out with the pPCA method even when the dataset contains two time series with fewer than 100 common epochs of observations. The first principal component (PC) explained more than 36% of the total variance of the time series residuals (series with the deterministic model removed), which, compared to the variances of the other PCs (less than 8%), means that common signals are significant in GNSS residuals. A clear improvement in the spectral indices of the power-law noise was noticed for the Up component, reflected by an average shift towards white noise from -0.98 to -0.67 (30%). We observed a significant average reduction in the accuracy of stations' velocity estimated for filtered residuals by 35, 28 and 69% for the North, East, and Up components, respectively. CME series were also analyzed in the context of the influence of environmental mass loading on the filtering results. Subtracting the environmental loading models from GNSS residuals reduces the estimated CME variance by 20 and 65% for the horizontal and vertical components, respectively.
A stable systemic risk ranking in China's banking sector: Based on principal component analysis
Fang, Libing; Xiao, Binqing; Yu, Honghai; You, Qixing
2018-02-01
In this paper, we compare five popular systemic risk rankings and apply a principal component analysis (PCA) model to provide a stable systemic risk ranking for the Chinese banking sector. Our empirical results indicate that the five methods suggest vastly different systemic risk rankings for the same bank, while the combined systemic risk measure based on PCA provides a reliable ranking. Furthermore, according to the factor loadings of the first component, the PCA combined ranking is mainly based on fundamentals instead of market price data. We clearly find that price-based rankings are not as practical as fundamentals-based ones. This PCA combined ranking directly shows the systemic risk contribution of each bank for banking supervision purposes and reminds banks to prevent and cope with financial crises in advance.
Directory of Open Access Journals (Sweden)
Romi Wiryadinata
2016-03-01
Attendance recording (presensi) is the logging of attendance: part of an institution's activity reporting, or a component of the institution itself, containing attendance data compiled and arranged so that it is easy to search and use whenever required by the parties concerned. The computer application developed for this attendance system can recognize a person's face using only a webcam. Face recognition in this study uses a webcam to capture an image of the room at a given time, after which the faces present are identified. The methods used in this research are Dynamic Time Warping (DTW), Principal Component Analysis (PCA) and Gabor wavelets. The system was tested with face images showing a normal expression. The recognition success rates for face images with a normal expression were 80% using DTW, 100% using PCA, and 97% using Gabor wavelets.
Directory of Open Access Journals (Sweden)
Hemant Pathak
2011-01-01
Groundwater is one of the major sources of drinking water in Sagar city (India). In this study, 15 sampling stations were selected for the investigation of 14 chemical parameters. The work was carried out during different months of the pre-monsoon, monsoon and post-monsoon seasons from June 2009 to June 2010. Multivariate statistics such as principal component analysis and cluster analysis were applied to the datasets to investigate seasonal variations in groundwater quality. Principal axis factoring was used to observe the mode of association of the parameters and their interrelationships, for evaluating water quality. Average values of BOD, COD, ammonia and iron were high during the entire study period. Elevated values of BOD and ammonia in the monsoon, a slightly higher value of BOD in the post-monsoon, and elevated BOD, ammonia and iron in the pre-monsoon period reflected the contribution of temporal effects on groundwater. The results of principal component analysis showed that all the parameters contribute equally and significantly to groundwater quality variations. Factor 1 and factor 2 analysis revealed that the DO value deteriorates due to organic load (BOD/ammonia) in different seasons. Hierarchical cluster analysis grouped the 15 stations into four clusters in the monsoon, five clusters in the post-monsoon and five clusters in the pre-monsoon with similar water quality features. The clustered groups in the monsoon, post-monsoon and pre-monsoon each included one station exhibiting significant spatial variation in physicochemical composition. The anthropogenic nitrogenous species are attributed to fallout from modernization activities. The study indicated that the groundwater is sufficiently well oxygenated and nutrient-rich at the study sites.
Portable XRF and principal component analysis for bill characterization in forensic science
International Nuclear Information System (INIS)
Appoloni, C.R.; Melquiades, F.L.
2014-01-01
Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of the Portable X-ray Fluorescence (PXRF) technique and the multivariate analysis method of Principal Component Analysis (PCA) for the classification of bills, with a view to its use in forensic science. Bills of Dollar, Euro and Real (Brazilian currency) were measured directly at different colored regions, without any previous preparation. Spectra interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA separated the bills into three groups, with subgroups among the Brazilian currency. In conclusion, the samples were classified according to their origin, identifying the elements responsible for differentiation and the basic pigment composition. PXRF allied to multivariate discriminant methods is a promising technique for rapid and non-destructive identification of false bills in forensic science. - Highlights: • The paper is about a direct method for bills discrimination by EDXRF and principal component analysis. • The bills are analyzed directly, without sample preparation and non-destructively. • The results demonstrate that the methodology is feasible and could be applied in forensic science for identification of origin and false banknotes. • The novelty is that portable EDXRF is very fast and efficient for bills characterization
Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie
2011-11-01
The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing that origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used for the development of an elements database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the sea cucumber Apostichopus japonicus. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, and the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. CA and PCA were effective methods for elements analysis of sea cucumber Apostichopus japonicus samples. The contents of the mineral elements in sea cucumber Apostichopus japonicus samples were good chemical descriptors for differentiating their geographical origins.
Impact of different conditions on accuracy of five rules for principal components retention
Directory of Open Access Journals (Sweden)
Zorić Aleksandar
2013-01-01
Polemics about criteria for retaining nontrivial principal components are still present in the literature. A finding of many papers is that the most frequently used Guttman-Kaiser criterion performs very poorly. In the last three years some new criteria were proposed. In this Monte Carlo experiment we aimed to investigate the impact that sample size, number of analyzed variables, number of supposed factors and proportion of error variance have on the accuracy of the analyzed criteria for principal components retention. We compared the following criteria: Bartlett's χ2 test, Horn's Parallel Analysis, Guttman-Kaiser's eigenvalue over one, Velicer's MAP and CHull, originally proposed by Ceulemans & Kiers. Factors were systematically combined, resulting in 690 different combinations. A total of 138,000 simulations were performed. The novelty of this research is the systematic variation of the error variance. The simulations showed that, in favorable research conditions, all analyzed criteria work properly. Bartlett's and Horn's criteria were robust in most of the analyzed situations. Velicer's MAP had the best accuracy in situations with a small number of subjects and a high number of variables. Results confirm earlier findings that the Guttman-Kaiser criterion has the worst performance.
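Two of the retention rules compared in this abstract are easy to illustrate. The sketch below is a minimal NumPy version of the Guttman-Kaiser eigenvalue-over-one rule and of Horn's Parallel Analysis; it is not the authors' simulation code. The function names, the 95th-percentile benchmark, the number of random replicates and the synthetic two-factor dataset are all assumptions made for illustration.

```python
import numpy as np

def correlation_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X (rows = subjects), descending."""
    R = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def kaiser_count(eigvals):
    """Guttman-Kaiser rule: retain components with eigenvalue greater than one."""
    return int(np.sum(np.asarray(eigvals) > 1.0))

def parallel_analysis_count(X, n_sim=200, percentile=95, seed=0):
    """Horn's parallel analysis against uncorrelated normal data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = correlation_eigenvalues(X)
    simulated = np.array([correlation_eigenvalues(rng.standard_normal((n, p)))
                          for _ in range(n_sim)])
    threshold = np.percentile(simulated, percentile, axis=0)
    # Retain leading components until one fails to exceed the random benchmark.
    keep = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        keep += 1
    return keep

# Hypothetical data: 6 variables driven by 2 factors plus noise (assumed setup).
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.repeat(np.eye(2), 3, axis=1)   # 2 x 6 block pattern of loadings
X = factors @ loadings + 0.3 * rng.standard_normal((500, 6))

print(kaiser_count(correlation_eigenvalues(X)), parallel_analysis_count(X))
```

In easy conditions like this one (large sample, low error variance) both rules should agree, which is consistent with the abstract's remark that all criteria work properly in favorable conditions; the differences it reports arise when the error variance grows or the sample shrinks.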
Fault detection of flywheel system based on clustering and principal component analysis
Directory of Open Access Journals (Sweden)
Wang Rixin
2015-12-01
Considering the nonlinear, multifunctional properties of a double-flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. At the first step of the proposed algorithm, clustering is taken as feature recognition to check the instructions of the "integrated power and attitude control" system, such as attitude control, energy storage or energy discharge. These commands ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of the training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters by the reachability-plot. The K-means algorithm can divide the training data into the corresponding operations according to the reachability-plot. Finally, the last step of the proposed model is used to define the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheels system.
Directory of Open Access Journals (Sweden)
Cucu Suhery
2017-04-01
The various attendance (presensi) monitoring systems in existence each have their own weaknesses and strengths, and they need continued development to make their data processing easier. In this study, an attendance monitoring system was developed using human face detection integrated with a database, using the Python programming language and the opencv library. Image data were acquired with an Android phone; each image was then passed through face detection and cropped so that only the face region remained. Face detection used the Haar-Cascade Classifier method, and feature extraction was then performed using Principal Component Analysis (PCA). The PCA results were labeled according to the person data held in the database. In the testing process, the test face image was compared for similarity against all images that already had PCA values stored in the database, using the Euclidean distance method. The database used in this study was MySQL. Face image detection in the training process had a success rate of 100%, and face identification in the testing process had a success rate of 90%. Keywords: android, haar-cascade classifier, principal component analysis, euclidian distance, MySQL, attendance monitoring system, face detection
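The identification step described here (PCA features compared by Euclidean distance) can be sketched as follows. This is an illustrative NumPy outline under stated assumptions, not the system from the study: the face detection, cropping and MySQL storage are left out, random vectors stand in for flattened face crops, and the function names and person labels are hypothetical.

```python
import numpy as np

def fit_pca(train, n_components):
    """Return the mean vector and top principal axes of row-wise samples."""
    mean = train.mean(axis=0)
    # Rows of vt are orthonormal principal axes, ordered by singular value.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, axes):
    """PCA feature vectors of row-wise samples."""
    return (samples - mean) @ axes.T

def identify(probe, gallery_features, gallery_labels, mean, axes):
    """Label of the stored feature vector closest to the probe (Euclidean distance)."""
    feature = project(probe[None, :], mean, axes)[0]
    distances = np.linalg.norm(gallery_features - feature, axis=1)
    return gallery_labels[int(np.argmin(distances))]

# Synthetic stand-in for cropped, flattened 8x8 face images of three people.
rng = np.random.default_rng(0)
prototypes = rng.uniform(0, 255, size=(3, 64))
labels = np.repeat(np.array(["person_a", "person_b", "person_c"]), 5)
train = np.repeat(prototypes, 5, axis=0) + rng.normal(0, 5, size=(15, 64))

mean, axes = fit_pca(train, n_components=2)
gallery = project(train, mean, axes)          # what a database table would hold
probe = prototypes[1] + rng.normal(0, 5, size=64)
print(identify(probe, gallery, labels, mean, axes))
```

Storing only the low-dimensional feature vectors keeps the comparison cheap: each lookup is a distance computation over a few numbers per enrolled image rather than over every pixel.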
Finger crease pattern recognition using Legendre moments and principal component analysis
Luo, Rongfang; Lin, Tusheng
2007-03-01
The finger joint lines, defined as finger creases, and their distribution can identify a person. In this paper, we propose a new finger crease pattern recognition method based on Legendre moments and principal component analysis (PCA). After obtaining the region of interest (ROI) for each finger image in the pre-processing stage, Legendre moments under Radon transform are applied to construct a moment feature matrix from the ROI, which greatly decreases the dimensionality of the ROI and can represent the principal components of the finger creases quite well. Then, an approach to finger crease pattern recognition is designed based on the Karhunen-Loeve (K-L) transform. The method applies PCA to a moment feature matrix rather than the original image matrix to obtain the feature vector. The proposed method has been tested on a database of 824 images from 103 individuals using the nearest neighbor classifier. An accuracy of up to 98.584% has been obtained when using 4 samples per class for training. The experimental results demonstrate that our proposed approach is feasible and effective in biometrics.
Principal component analysis of solar flares in the soft X-ray flux
International Nuclear Information System (INIS)
Teuber, D.L.; Reichmann, E.J.; Wilson, R.M.; National Aeronautics and Space Administration, Huntsville, AL
1979-01-01
Principal component analysis is a technique for extracting the salient features from a mass of data. It applies, in particular, to the analysis of nonstationary ensembles. Computational schemes for this task require the evaluation of eigenvalues of matrices. We have used EISPACK Matrix Eigen System Routines on an IBM 360-75 to analyze full-disk proportional-counter data from the X-ray event analyzer (X-REA) which was part of the Skylab ATM/S-056 experiment. Empirical orthogonal functions have been derived for events in the soft X-ray spectrum between 2.5 and 20 A during different time frames between June 1973 and January 1974. Results indicate that approximately 90% of the cumulative power of each analyzed flare is contained in the largest eigenvector. The first two largest eigenvectors are sufficient for an empirical curve-fit through the raw data and a characterization of solar flares in the soft X-ray flux. Power spectra of the two largest eigenvectors reveal a previously reported periodicity of approximately 5 min. Similar signatures were also obtained from flares that are synchronized on maximum pulse-height when subjected to a principal component analysis. (orig.)
Directory of Open Access Journals (Sweden)
Rockson Dobgegah
2011-03-01
The study adopts a data reduction technique to examine the presence of any complex structure among a set of project management competency variables. A structured survey questionnaire was administered to 100 project managers to elicit relevant data, and this achieved a relatively high response rate of 54%. After satisfying all the necessary tests of reliability of the survey instrument, sample size adequacy and population matrix, the data were subjected to principal component analysis, resulting in the identification of six new thematic project management competency areas. These were explained in terms of human resource management and project control; construction innovation and communication; project financial resources management; project risk and quality management; business ethics; and physical resources and procurement management. These knowledge areas now form the basis for lateral project management training requirements in the context of the Ghanaian construction industry. The key contribution of the paper is manifested in the use of principal component analysis, which has rigorously provided understanding of the complex structure and the relationship between the various knowledge areas. The originality and value of the paper is embedded in the use of contextual-task conceptual knowledge to expound the six uncorrelated empirical utility of the project management competencies.
IMPROVED SEARCH OF PRINCIPAL COMPONENT ANALYSIS DATABASES FOR SPECTRO-POLARIMETRIC INVERSION
International Nuclear Information System (INIS)
Casini, R.; Lites, B. W.; Ramos, A. Asensio; Ariste, A. López
2013-01-01
We describe a simple technique for the acceleration of spectro-polarimetric inversions based on principal component analysis (PCA) of Stokes profiles. This technique involves the indexing of the database models based on the sign of the projections (PCA coefficients) of the first few relevant orders of principal components of the four Stokes parameters. In this way, each model in the database can be attributed a distinctive binary number of 2^(4n) bits, where n is the number of PCA orders used for the indexing. Each of these binary numbers (indices) identifies a group of "compatible" models for the inversion of a given set of observed Stokes profiles sharing the same index. The complete set of the binary numbers so constructed evidently determines a partition of the database. The search of the database for the PCA inversion of spectro-polarimetric data can profit greatly from this indexing. In practical cases it becomes possible to approach the ideal acceleration factor of 2^(4n) as compared to the systematic search of a non-indexed database for a traditional PCA inversion. This indexing method relies on the existence of a physical meaning in the sign of the PCA coefficients of a model. For this reason, the presence of model ambiguities and of spectro-polarimetric noise in the observations limits in practice the number n of relevant PCA orders that can be used for the indexing
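The sign-based indexing described in this abstract can be sketched in a few lines. The code below is a hedged illustration, not the authors' implementation: it assumes each model is summarized by a 4 x n array of PCA coefficients (four Stokes parameters, n orders), packs the signs into one integer (4n sign bits, hence at most 2^(4n) distinct index values), and restricts the nearest-model search to the group that shares the observation's index. The random database, the function names and the fallback behaviour are assumptions.

```python
import numpy as np
from collections import defaultdict

def sign_index(coeffs):
    """Pack the signs of a (4, n) array of PCA coefficients into one integer.

    Each coefficient contributes one bit (1 if non-negative), so the index
    takes one of at most 2**(4*n) values.
    """
    bits = (np.asarray(coeffs) >= 0).astype(int).ravel()
    index = 0
    for bit in bits:
        index = (index << 1) | int(bit)
    return index

def build_partition(database):
    """Group model numbers by sign index; database has shape (models, 4, n)."""
    partition = defaultdict(list)
    for model_id, coeffs in enumerate(database):
        partition[sign_index(coeffs)].append(model_id)
    return partition

def search(observed, database, partition):
    """Closest model among those sharing the observation's sign index."""
    candidates = partition.get(sign_index(observed), [])
    if not candidates:
        return None  # no compatible model; a caller could fall back to a full search
    dists = [np.linalg.norm(database[i] - observed) for i in candidates]
    return candidates[int(np.argmin(dists))]

# Hypothetical database of random coefficients, for illustration only.
rng = np.random.default_rng(0)
n = 2
database = rng.standard_normal((5000, 4, n))
partition = build_partition(database)
observed = database[123] * 1.05   # same signs as model 123, slightly rescaled
best = search(observed, database, partition)
print(best, len(partition))
```

With evenly populated groups, each search touches roughly a fraction 1/2^(4n) of the database, which is the ideal acceleration factor quoted above. Noise that flips the sign of a small coefficient sends the search to the wrong group, which is why the abstract notes that noise limits the usable n.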
International Nuclear Information System (INIS)
Carvajal Escobar, Yesid; Marco Segura, Juan B
2005-01-01
An EOF analysis, or principal component (PC) analysis, was made for monthly precipitation (1972-1998) using 50 stations, and for monthly rate of flow (1951-2000) at 8 stations in the Valle del Cauca state, Colombia. Beforehand, we applied 5 measures to verify the convenience of the analysis: i) evaluation of the significance level of the correlation between variables; ii) the Kaiser-Meyer-Olkin (KMO) test; iii) the Bartlett sphericity test; iv) the measure of sampling adequacy (MSA); and v) the percentage of non-redundant residuals with absolute values > 0.05. For the selection of the significant PCs in every set of variables we applied seven criteria: the graphical method, the explained variance percentage, the mean root, the tests of Velicer, Bartlett and Broken Stick, and the cross-validation test. We chose the latter as the best one: it is robust and quantitative. Precipitation stations were divided into three homogeneous groups by applying a hierarchical cluster analysis, which was verified through the geographic method and discriminant analysis for the first four EOFs of precipitation. There are many advantages to the EOF method: reduction of the dimensionality of multivariate data, calculation of missing data, evaluation and reduction of multicollinearity, building of homogeneous groups, and detection of outliers. With the first four principal components we can explain 60.34% of the total variance of monthly precipitation for the Valle del Cauca state, and 94% of the total variance for the selected records of rates of flow
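Of the seven selection criteria listed, the Broken Stick test is the simplest to write down. The sketch below is a generic NumPy version, not the authors' procedure: with p components, the expected share of variance of the k-th component under a random "broken stick" is b_k = (1/p) * sum_{i=k}^{p} 1/i, and leading components are kept while their observed share exceeds b_k. The example eigenvalues are invented for illustration.

```python
import numpy as np

def broken_stick(p):
    """Expected variance shares b_k = (1/p) * sum_{i=k}^{p} 1/i for k = 1..p."""
    return np.array([np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)])

def broken_stick_count(eigvals):
    """Number of leading components whose variance share beats the broken stick."""
    share = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    share = share / share.sum()
    expected = broken_stick(len(share))
    keep = 0
    for s, e in zip(share, expected):
        if s <= e:
            break
        keep += 1
    return keep

# Invented eigenvalues: shares of 70%, 20%, 7% and 3% of the total variance.
example = [2.8, 0.8, 0.28, 0.12]
print(broken_stick(4), broken_stick_count(example))
```

The rule tends to be conservative: here the second component explains 20% of the variance but is dropped because a random partition into four pieces would already give the second-largest piece about 27%.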
Registration of dynamic dopamine D2 receptor images using principal component analysis
International Nuclear Information System (INIS)
Acton, P.D.; Ell, P.J.; Pilowsky, L.S.; Brammer, M.J.; Suckling, J.
1997-01-01
This paper describes a novel technique for registering a dynamic sequence of single-photon emission tomography (SPET) dopamine D2 receptor images, using principal component analysis (PCA). Conventional methods for registering images, such as count difference and correlation coefficient algorithms, fail to take into account the dynamic nature of the data, resulting in large systematic errors when registering time-varying images. However, by using principal component analysis to extract the temporal structure of the image sequence, misregistration can be quantified by examining the distribution of eigenvalues. The registration procedures were tested using a computer-generated dynamic phantom derived from a high-resolution magnetic resonance image of a realistic brain phantom. Each method was also applied to clinical SPET images of dopamine D2 receptors, using the ligands iodine-123 iodobenzamide and iodine-123 epidepride, to investigate the influence of misregistration on kinetic modelling parameters and the binding potential. The PCA technique gave highly significant (P 123 I-epidepride scans. The PCA method produced data of much greater quality for subsequent kinetic modelling, with an improvement of nearly 50% in the χ2 of the fit to the compartmental model, and provided superior quality registration of particularly difficult dynamic sequences. (orig.)
Using principal component analysis to understand the variability of PDS 456
Parker, M. L.; Reeves, J. N.; Matzeu, G. A.; Buisson, D. J. K.; Fabian, A. C.
2018-02-01
We present a spectral-variability analysis of the low-redshift quasar PDS 456 using principal component analysis. In the XMM-Newton data, we find a strong peak in the first principal component at the energy of the Fe absorption line from the highly blueshifted outflow. This indicates that the absorption feature is more variable than the continuum, and that it is responding to the continuum. We find qualitatively different behaviour in the Suzaku data, which is dominated by changes in the column density of neutral absorption. In this case, we find no evidence of the absorption produced by the highly ionized gas being correlated with this variability. Additionally, we perform simulations of the source variability, and demonstrate that PCA can trivially distinguish between outflow variability correlated, anticorrelated and un-correlated with the continuum flux. Here, the observed anticorrelation between the absorption line equivalent width and the continuum flux may be due to the ionization of the wind responding to the continuum. Finally, we compare our results with those found in the narrow-line Seyfert 1 IRAS 13224-3809. We find that the Fe K UFO feature is sharper and more prominent in PDS 456, but that it lacks the lower energy features from lighter elements found in IRAS 13224-3809, presumably due to differences in ionization.
Gaudreault, Nathaly; Mezghani, Neila; Turcot, Katia; Hagemeister, Nicola; Boivin, Karine; de Guise, Jacques A
2011-03-01
Interpreting gait data is challenging due to intersubject variability observed in the gait pattern of both normal and pathological populations. The objective of this study was to investigate the impact of using principal component analysis for grouping knee osteoarthritis (OA) patients' gait data in more homogeneous groups when studying the effect of a physiotherapy treatment. Three-dimensional (3D) knee kinematic and kinetic data were recorded during the gait of 29 participants diagnosed with knee OA before and after they received 12 weeks of physiotherapy treatment. Principal component analysis was applied to extract groups of knee flexion/extension, adduction/abduction and internal/external rotation angle and moment data. The treatment's effect on parameters of interest was assessed using paired t-tests performed before and after grouping the knee kinematic data. Increased quadriceps and hamstring strength was observed following treatment (Pphysiotherapy on gait mechanics of knee osteoarthritis patients may be masked or underestimated if kinematic data are not separated into more homogeneous groups when performing pre- and post-treatment comparisons. Copyright © 2010 Elsevier Ltd. All rights reserved.
Boundary layer noise subtraction in hydrodynamic tunnel using robust principal component analysis.
Amailland, Sylvain; Thomas, Jean-Hugh; Pézerat, Charles; Boucheron, Romuald
2018-04-01
The acoustic study of propellers in a hydrodynamic tunnel is of paramount importance during the design process, but can involve significant difficulties due to the boundary layer noise (BLN). Indeed, advanced denoising methods are needed to recover the acoustic signal in case of poor signal-to-noise ratio. The technique proposed in this paper is based on the decomposition of the wall-pressure cross-spectral matrix (CSM) by taking advantage of both the low-rank property of the acoustic CSM and the sparse property of the BLN CSM. Thus, the algorithm belongs to the class of robust principal component analysis (RPCA), which derives from the widely used principal component analysis. If the BLN is spatially decorrelated, the proposed RPCA algorithm can blindly recover the acoustical signals even for negative signal-to-noise ratio. Unfortunately, in a realistic case, acoustic signals recorded in a hydrodynamic tunnel show that the noise may be partially correlated. A prewhitening strategy is then considered in order to take into account the spatially coherent background noise. Numerical simulations and experimental results show an improvement in terms of BLN reduction in the large hydrodynamic tunnel. The effectiveness of the denoising method is also investigated in the context of acoustic source localization.
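The low-rank plus sparse split that RPCA relies on can be illustrated with a generic principal component pursuit solver. The sketch below is not the algorithm of the paper: it works on a real-valued matrix rather than a complex cross-spectral matrix, includes no prewhitening, and uses a textbook inexact augmented Lagrangian iteration with commonly used default parameters (sparse-term weight 1/sqrt(max(m, n)), penalty growth factor 1.5). All names and the synthetic test matrix are assumptions.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, max_iter=200, tol=1e-7, rho=1.5):
    """Split a float matrix M into low-rank L plus sparse S (inexact ALM sketch)."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))        # usual weight on the sparse term
    mu = 1.25 / np.linalg.norm(M, 2)      # initial penalty, increased each pass
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                  # Lagrange multipliers
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        mu *= rho
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S

# Synthetic stand-in: rank-2 "signal" matrix plus a few large sparse entries.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 40))
sparse = np.zeros((40, 40))
mask = rng.random((40, 40)) < 0.05
sparse[mask] = rng.uniform(-10, 10, size=mask.sum())
M = low_rank + sparse

L, S = rpca(M)
print(np.linalg.norm(M - L - S) / np.linalg.norm(M))
```

In the setting of the abstract, the low-rank part would play the role of the acoustic contribution and the sparse part that of spatially decorrelated boundary layer noise; how well the split matches the true components depends on how closely the data satisfy those two assumptions.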
Shah, Syed Muhammad Saqlain; Batool, Safeera; Khan, Imran; Ashraf, Muhammad Usman; Abbas, Syed Hussnain; Hussain, Syed Adnan
2017-09-01
Automatic diagnosis of human diseases is mostly achieved through decision support systems. The performance of these systems depends mainly on the selection of the most relevant features. This becomes harder when the dataset contains missing values for different features. Probabilistic Principal Component Analysis (PPCA) has a reputation for dealing with the problem of missing attribute values. This research presents a methodology which uses the results of medical tests as input, extracts a reduced-dimensional feature subset and provides a diagnosis of heart disease. The proposed methodology extracts high-impact features in a new projection by using PPCA. PPCA extracts the projection vectors which contribute the highest covariance, and these projection vectors are used to reduce the feature dimension. The selection of projection vectors is done through Parallel Analysis (PA). The feature subset with the reduced dimension is provided to radial basis function (RBF) kernel based Support Vector Machines (SVM). The RBF-based SVM serves the purpose of classification into two categories, i.e., Heart Patient (HP) and Normal Subject (NS). The proposed methodology is evaluated through accuracy, specificity and sensitivity over three UCI datasets, i.e., Cleveland, Switzerland and Hungarian. The statistical results achieved through the proposed technique are presented in comparison to existing research, showing its impact. The proposed technique achieved an accuracy of 82.18%, 85.82% and 91.30% for the Cleveland, Hungarian and Switzerland datasets respectively.
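The dimension-reduction step can be illustrated with the closed-form maximum-likelihood solution of PPCA given by Tipping and Bishop. The sketch below is an assumption-laden outline rather than the authors' pipeline: it handles complete data only (missing values would call for the EM variant), it fixes the number of components by hand instead of using Parallel Analysis, and it omits the RBF-kernel SVM classifier. The synthetic data and function names are hypothetical.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA for complete data (rows = samples)."""
    mu = X.mean(axis=0)
    cov = np.cov(X - mu, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                       # average discarded variance
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def latent_mean(X, mu, W, sigma2):
    """Posterior mean of the latent variables, usable as reduced features."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mu).T).T

# Synthetic stand-in for a table of medical test results (200 cases, 6 features).
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
mixing = rng.standard_normal((2, 6))
X = latent @ mixing + 0.1 * rng.standard_normal((200, 6))

mu, W, sigma2 = fit_ppca(X, q=2)
Z = latent_mean(X, mu, W, sigma2)   # features that a classifier would receive
print(Z.shape, sigma2)
```

The noise variance estimate is the mean of the discarded eigenvalues, so a small value relative to the retained eigenvalues indicates that the chosen number of components captures most of the structure.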
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to advances in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing irrelevant features are sparse clustering algorithms that use a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which in practice are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Choi, Ji Yeh; Hwang, Heungsun; Yamamoto, Michio; Jung, Kwanghee; Woodward, Todd S
2017-06-01
Functional principal component analysis (FPCA) and functional multiple-set canonical correlation analysis (FMCCA) are data reduction techniques for functional data that are collected in the form of smooth curves or functions over a continuum such as time or space. In FPCA, low-dimensional components are extracted from a single functional dataset such that they explain the most variance of the dataset, whereas in FMCCA, low-dimensional components are obtained from each of multiple functional datasets in such a way that the associations among the components are maximized across the different sets. In this paper, we propose a unified approach to FPCA and FMCCA. The proposed approach subsumes both techniques as special cases. Furthermore, it permits a compromise between the techniques, such that components are obtained from each set of functional data to maximize their associations across different datasets, while accounting for the variance of the data well. We propose a single optimization criterion for the proposed approach, and develop an alternating regularized least squares algorithm to minimize the criterion in combination with basis function approximations to functions. We conduct a simulation study to investigate the performance of the proposed approach based on synthetic data. We also apply the approach for the analysis of multiple-subject functional magnetic resonance imaging data to obtain low-dimensional components of blood-oxygen level-dependent signal changes of the brain over time, which are highly correlated across the subjects as well as representative of the data. The extracted components are used to identify networks of neural activity that are commonly activated across the subjects while carrying out a working memory task.
Newbern, Dorothee; Balikcioglu, Metin; Bain, James; Muehlbauer, Michael; Stevens, Robert; Ilkayeva, Olga; Dolinsky, Diana; Armstrong, Sarah; Irizarry, Krystal; Freemark, Michael
2014-01-01
Objective: Obesity and insulin resistance (IR) predispose to type 2 diabetes mellitus. Yet only half of obese adolescents have IR and far fewer progress to type 2 diabetes mellitus. We hypothesized that amino acid and fatty acid metabolites may serve as biomarkers or determinants of IR in obese teens. Research Design and Methods: Fasting blood samples were analyzed by tandem mass spectrometry in 82 obese adolescents. A principal components analysis and multiple linear regression models were used to correlate metabolic components with surrogate measures of IR: homeostasis model assessment index of insulin resistance (HOMA-IR), adiponectin, and triglyceride (TG) to high-density lipoprotein (HDL) ratio. Results: Branched-chain amino acid (BCAA) levels and products of BCAA catabolism were higher (P BCAA, uric acid, and long-chain acylcarnitines and negatively with byproducts of complete fatty acid oxidation (R2 = 0.659, P < .0001). In contrast, only BMI z-score correlated with HOMA-IR in females. Adiponectin correlated inversely with BCAA and uric acid (R2 = 0.268, P = .0212) in males but not females. TG to HDL ratio correlated with BMI z-score and the BCAA signature in females but not males. Conclusions: BCAA levels and byproducts of BCAA catabolism are higher in obese teenage boys than girls of comparable BMI z-score. A metabolic signature comprising BCAA and uric acid correlates positively with HOMA-IR in males and TG to HDL ratio in females and inversely with adiponectin in males but not females. Likewise, byproducts of fatty acid oxidation associate inversely with HOMA-IR in males but not females. Our findings underscore the roles of sex differences in metabolic function and outcomes in pediatric obesity. PMID:25202817
Zha, N.; Capaldi, D. P. I.; Pike, D.; McCormack, D. G.; Cunningham, I. A.; Parraga, G.
2015-03-01
Pulmonary x-ray computed tomography (CT) may be used to characterize emphysema and airways disease in patients with chronic obstructive pulmonary disease (COPD). One analysis approach, parametric response mapping (PRM), utilizes registered inspiratory and expiratory CT image volumes and CT-density-histogram thresholds, but there is no consensus regarding the threshold values used, or their clinical meaning. Principal-component-analysis (PCA) of the CT density histogram can be exploited to quantify emphysema using data-driven CT-density-histogram thresholds. Thus, the objective of this proof-of-concept demonstration was to develop a PRM approach using PCA-derived thresholds in COPD patients and ex-smokers without airflow limitation. Methods: Fifteen COPD ex-smokers and 5 normal ex-smokers were evaluated. Thoracic CT images were also acquired at full inspiration and full expiration and these images were non-rigidly co-registered. PCA was performed for the CT density histograms, from which the components with the highest eigenvalues greater than one were summed. Since the values of the principal component curve correlate directly with the variability in the sample, the maximum and minimum points on the curve were used as threshold values for the PCA-adjusted PRM technique. Results: A significant correlation was determined between conventional and PCA-adjusted PRM with 3He MRI apparent diffusion coefficient (p<0.001), with CT RA950 (p<0.0001), as well as with 3He MRI ventilation defect percent, a measurement of both small airways disease (p=0.049 and p=0.06, respectively) and emphysema (p=0.02). Conclusions: PRM generated using PCA thresholds of the CT density histogram showed significant correlations with CT and 3He MRI measurements of emphysema, but not airways disease.
Chen, Shuming; Wang, Dengfeng; Liu, Bo
This paper investigates the optimization design of the thickness of the sound package of a passenger automobile. The major performance characteristics selected to evaluate the process are the SPL of the exterior noise and the weight of the sound package, and the corresponding parameters of the sound package are the thickness of the glass wool with aluminum foil for the first layer, the thickness of the glass fiber for the second layer, and the thickness of the PE foam for the third layer. Because the process fundamentally involves multiple performance characteristics, grey relational analysis, which uses the grey relational grade as the performance index, is employed to determine the optimal combination of layer thicknesses for the designed sound package. Additionally, in order to evaluate the weighting values corresponding to the various performance characteristics, principal component analysis is used to show their relative importance properly and objectively. The results of the confirmation experiments show that grey relational analysis coupled with principal component analysis can successfully be applied to find the optimal combination of the thickness for each layer of the sound package material. Therefore, the presented method can be an effective tool to reduce vehicle exterior noise and lower the weight of the sound package. In addition, it will also be helpful for other applications in the automotive industry, such as the First Automobile Works in China, Changan Automobile in China, etc.
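The combination of grey relational analysis with PCA-derived weights can be sketched as below. This follows a formulation that is common in the grey relational literature and is assumed here, not taken from the paper: both criteria are treated as smaller-the-better, the distinguishing coefficient is 0.5, and the weights are the squared elements of the first eigenvector of the correlation matrix of the grey relational coefficients. The four trial runs are invented numbers.

```python
import numpy as np

def grey_relational_coefficients(responses, zeta=0.5):
    """responses: (runs, criteria) array, every criterion smaller-the-better.

    Assumes each criterion takes at least two distinct values across runs.
    """
    lo, hi = responses.min(axis=0), responses.max(axis=0)
    normalized = (hi - responses) / (hi - lo)       # 1 = best run for that criterion
    delta = 1.0 - normalized                        # deviation from the ideal sequence
    return (delta.min() + zeta * delta.max()) / (delta + zeta * delta.max())

def pca_weights(coefficients):
    """Squared loadings of the first principal component; they sum to one."""
    corr = np.corrcoef(coefficients, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    first = eigvecs[:, np.argmax(eigvals)]
    return first ** 2

def grey_relational_grade(responses, zeta=0.5):
    """Weighted sum of grey relational coefficients for each run."""
    coeffs = grey_relational_coefficients(responses, zeta)
    return coeffs @ pca_weights(coeffs)

# Invented trial runs: columns are exterior-noise SPL (dB) and package weight (kg).
runs = np.array([[70.0, 5.0],
                 [72.0, 6.0],
                 [75.0, 8.0],
                 [71.0, 7.5]])
grades = grey_relational_grade(runs)
print(grades, int(np.argmax(grades)))
```

The run with the highest grade is taken as the best compromise. With only two criteria the PCA weights always come out equal, so the weighting step only becomes informative when three or more performance characteristics are combined.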
Oil classification using X-ray scattering and principal component analysis
Energy Technology Data Exchange (ETDEWEB)
Almeida, Danielle S.; Souza, Amanda S.; Lopes, Ricardo T., E-mail: dani.almeida84@gmail.com, E-mail: ricardo@lin.ufrj.br, E-mail: amandass@bioqmed.ufrj.br [Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, RJ (Brazil); Oliveira, Davi F.; Anjos, Marcelino J., E-mail: davi.oliveira@uerj.br, E-mail: marcelin@uerj.br [Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro, RJ (Brazil). Inst. de Fisica Armando Dias Tavares
2015-07-01
X-ray scattering techniques have been considered promising for the classification and characterization of many types of samples. This study employed this technique combined with chemical analysis and multivariate analysis to characterize 54 vegetable oil samples (25 of them olive oils) with different properties, obtained in commercial establishments in the city of Rio de Janeiro. The samples were chemically analyzed using the following indexes: iodine, acidity, saponification and peroxide. To obtain the X-ray scattering spectrum, an X-ray tube with a silver anode operating at 40 kV and 50 μA was used. The results showed that the oils can be divided into two large groups: olive oils and non-olive oils. Additionally, in a multivariate analysis (Principal Component Analysis, PCA), two components were obtained that accounted for more than 80% of the variance. One component was associated with the chemical parameters and the other with the scattering profile of each sample. The results showed that X-ray scattering spectra combined with chemical analysis and PCA can be a fast, cheap and efficient method for vegetable oil characterization. (author)
Progress Towards Improved Analysis of TES X-ray Data Using Principal Component Analysis
Busch, S. E.; Adams, J. S.; Bandler, S. R.; Chervenak, J. A.; Eckart, M. E.; Finkbeiner, F. M.; Fixsen, D. J.; Kelley, R. L.; Kilbourne, C. A.; Lee, S.-J.;
2015-01-01
The traditional method of applying a digital optimal filter to measure X-ray pulses from transition-edge sensor (TES) devices does not achieve the best energy resolution when the signals have a highly non-linear response to energy, or the noise is non-stationary during the pulse. We present an implementation of a method to analyze X-ray data from TESs, which is based upon principal component analysis (PCA). Our method separates the X-ray signal pulse into orthogonal components that have the largest variance. We typically recover pulse height, arrival time, differences in pulse shape, and the variation of pulse height with detector temperature. These components can then be combined to form a representation of pulse energy. An added value of this method is that by reporting information on more descriptive parameters (as opposed to a single number representing energy), we generate a much more complete picture of the pulse received. Here we report on progress in developing this technique for future implementation on X-ray telescopes. We used an 55Fe source to characterize Mo/Au TESs. On the same dataset, the PCA method recovers a spectral resolution that is better by a factor of two than achievable with digital optimal filters.
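The decomposition of pulse records into orthogonal components of largest variance can be illustrated with a generic PCA computed by singular value decomposition. The sketch below is only an assumption-laden toy: the double-exponential pulse shape, the noise level and the pulse heights are hypothetical and are not taken from the TES data in the abstract.

```python
import numpy as np

def pca_svd(records, n_components):
    """PCA of row-wise records via SVD of the mean-centered data matrix."""
    records = np.asarray(records, dtype=float)
    mean = records.mean(axis=0)
    centered = records - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]        # orthonormal directions
    scores = centered @ components.T      # projection of each record
    explained = (s ** 2) / np.sum(s ** 2) # fraction of variance per component
    return mean, components, scores, explained[:n_components]

# Hypothetical pulses: one fixed shape scaled by a random height, plus white noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
template = np.exp(-t / 0.2) - np.exp(-t / 0.02)
heights = rng.uniform(0.8, 1.2, size=300)
pulses = heights[:, None] * template[None, :] + 0.01 * rng.standard_normal((300, 200))

mean, components, scores, explained = pca_svd(pulses, 3)
# In this toy data the first score should track the pulse height closely.
height_corr = abs(np.corrcoef(scores[:, 0], heights)[0, 1])
```

In real data further components would carry arrival time and pulse-shape variation, as the abstract describes; this toy contains only a height variation.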
Mishra-Kalyani, Pallavi S.; Johnson, Brent A.; Glass, Jonathan D.; Long, Qi
2016-09-01
Clinical disease registries offer a rich collection of valuable patient information but also pose challenges that require special care and attention in statistical analyses. The goal of this paper is to propose a statistical framework that allows for estimating the effect of surgical insertion of a percutaneous endogastrostomy (PEG) tube for patients living with amyotrophic lateral sclerosis (ALS) using data from a clinical registry. Although all ALS patients are informed about PEG, only some patients agree to the procedure, which leads to the potential for selection bias. Assessing the effect of PEG is further complicated by the aggressively fatal disease, such that time to death competes directly with both the opportunity to receive PEG and clinical outcome measurements. Our proposed methodology handles the “censoring by death” phenomenon through principal stratification and selection bias for PEG treatment through generalized propensity scores. We develop a fully Bayesian modeling approach to estimate the survivor average causal effect (SACE) of PEG on BMI, a surrogate outcome measure of nutrition and quality of life. The use of propensity score methods within the principal stratification framework demonstrates a significant and positive effect of PEG treatment, particularly when time of treatment is included in the treatment definition.
Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M
2014-01-01
The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.
Directory of Open Access Journals (Sweden)
James M. Cheverud
2007-03-01
Comparisons of covariance patterns are becoming more common as interest in the evolution of relationships between traits and in the evolutionary phenotypic diversification of clades has grown. We present parallel analyses of covariance matrix similarity for cranial traits in 14 New World Monkey genera using the Random Skewers (RS), T-statistics, and Common Principal Components (CPC) approaches. We find that the CPC approach is very powerful in that, with adequate sample sizes, it can be used to detect significant differences in matrix structure, even between matrices that are virtually identical in their evolutionary properties, as indicated by the RS results. We suggest that in many instances the assumption that population covariance matrices are identical be rejected out of hand. The more interesting and relevant question is: how similar are two covariance matrices with respect to their predicted evolutionary responses? This issue is addressed by the random skewers method described here.
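The idea behind random skewers, comparing two covariance matrices through their predicted responses to the same random selection vectors, can be sketched as below. This is a simplified illustration, not the published procedure: the skewers are drawn from a standard normal distribution and normalized to unit length (an assumption), the example matrices are hypothetical, and no significance test is included.

```python
import numpy as np

def random_skewers(cov_a, cov_b, n_skewers=1000, seed=0):
    """Mean vector correlation between responses of two covariance matrices."""
    cov_a = np.asarray(cov_a, dtype=float)
    cov_b = np.asarray(cov_b, dtype=float)
    rng = np.random.default_rng(seed)
    k = cov_a.shape[0]
    # Random selection vectors ("skewers") of unit length.
    skewers = rng.standard_normal((n_skewers, k))
    skewers /= np.linalg.norm(skewers, axis=1, keepdims=True)
    # Predicted response to selection: covariance matrix times skewer.
    resp_a = skewers @ cov_a.T
    resp_b = skewers @ cov_b.T
    # Cosine of the angle between the paired response vectors.
    cosines = np.sum(resp_a * resp_b, axis=1) / (
        np.linalg.norm(resp_a, axis=1) * np.linalg.norm(resp_b, axis=1)
    )
    return float(cosines.mean())

# Hypothetical 3-trait covariance matrices.
g1 = np.diag([4.0, 1.0, 1.0])
g2 = np.diag([1.0, 1.0, 4.0])
same = random_skewers(g1, g1)
scaled = random_skewers(g1, 2.0 * g1)
different = random_skewers(g1, g2)
```

Because only the direction of the response is compared, a matrix and any positive multiple of it are scored as identical, which matches the abstract's emphasis on predicted evolutionary responses rather than on strict matrix equality.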
Application of Principal Component Analysis in Prompt Gamma Spectra for Material Sorting
Energy Technology Data Exchange (ETDEWEB)
Im, Hee Jung; Lee, Yun Hee; Song, Byoung Chul; Park, Yong Joon; Kim, Won Ho
2006-11-15
To detect illicit materials in a very short time by comparing the gamma spectra of unknown samples to pre-programmed material signatures, we first selected a method to reduce the noise of the obtained gamma spectra. After noise reduction, a pattern recognition technique was applied to discriminate illicit materials from innocuous materials in the noise-reduced data. Principal component analysis was applied for both noise reduction and pattern recognition in prompt gamma spectra. A computer program for the detection of illicit materials based on the PCA method was developed in our lab and can be applied to the PGNAA system for rapid baggage checking at all ports of entry.
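PCA-based noise reduction of spectra is commonly done by reconstructing each spectrum from only the leading principal components. The sketch below shows that generic technique; it is not the program described in the abstract, and the synthetic two-peak spectra, the noise level and the choice of two components are all hypothetical.

```python
import numpy as np

def pca_denoise(spectra, n_components):
    """Reconstruct row-wise spectra from their leading principal components."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    # Rank-k reconstruction: discard the low-variance (mostly noise) directions.
    return mean + (u[:, :k] * s[:k]) @ vt[:k]

# Hypothetical spectra: random mixtures of two Gaussian peaks plus white noise.
rng = np.random.default_rng(2)
channels = np.arange(256)
peak_a = np.exp(-0.5 * ((channels - 80) / 5.0) ** 2)
peak_b = np.exp(-0.5 * ((channels - 170) / 8.0) ** 2)
amounts = rng.uniform(0.5, 1.5, size=(100, 2))
clean = amounts @ np.vstack([peak_a, peak_b])
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

denoised = pca_denoise(noisy, n_components=2)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
```

The scores `u[:, :k] * s[:k]` are the same low-dimensional features that a subsequent pattern recognition step could classify.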
Harrison, Jay M; Howard, Delia; Malven, Marianne; Halls, Steven C; Culler, Angela H; Harrigan, George G; Wolfinger, Russell D
2013-07-03
Compositional studies on genetically modified (GM) and non-GM crops have consistently demonstrated that their respective levels of key nutrients and antinutrients are remarkably similar and that other factors such as germplasm and environment contribute more to compositional variability than transgenic breeding. We propose that graphical and statistical approaches that can provide meaningful evaluations of the relative impact of different factors to compositional variability may offer advantages over traditional frequentist testing. A case study on the novel application of principal variance component analysis (PVCA) in a compositional assessment of herbicide-tolerant GM cotton is presented. Results of the traditional analysis of variance approach confirmed the compositional equivalence of the GM and non-GM cotton. The multivariate approach of PVCA provided further information on the impact of location and germplasm on compositional variability relative to GM.
The Purification Method of Matching Points Based on Principal Component Analysis
Directory of Open Access Journals (Sweden)
DONG Yang
2017-02-01
The traditional purification method for matching points usually uses a small number of the points as initial input. Although it can meet most of the requirements of point constraints, the iterative purification solution easily falls into a local extremum, which results in correct matching points being missed. To solve this problem, we introduce the principal component analysis method so that the whole point set can be used as initial input. Through stepwise elimination of mismatched points and robust solving, a more accurate global optimal solution can be obtained, which is intended to reduce the omission rate of correct matching points and thus achieve a better purification effect. Experimental results show that this method can obtain the global optimal solution under a certain original false matching rate, and can decrease or avoid the omission of correct matching points.
Principal components analysis of protein structure ensembles calculated using NMR data
International Nuclear Information System (INIS)
Howe, Peter W.A.
2001-01-01
One important problem when calculating structures of biomolecules from NMR data is distinguishing converged structures from outlier structures. This paper describes how Principal Components Analysis (PCA) has the potential to classify calculated structures automatically, according to correlated structural variation across the population. PCA analysis has the additional advantage that it highlights regions of proteins which are varying across the population. To apply PCA, protein structures have to be reduced in complexity and this paper describes two different representations of protein structures which achieve this. The calculated structures of a 28 amino acid peptide are used to demonstrate the methods. The two different representations of protein structure are shown to give equivalent results, and correct results are obtained even though the ensemble of structures used as an example contains two different protein conformations. The PCA analysis also correctly identifies the structural differences between the two conformations
International Nuclear Information System (INIS)
Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.
1981-01-01
A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From this analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained
Zia, Asif Iqbal
2015-06-01
The surface roughness of thin-film gold electrodes induces instability in impedance spectroscopy measurements of capacitive interdigital printable sensors. Post-fabrication thermodynamic annealing was carried out at temperatures ranging from 30 °C to 210 °C in a vacuum oven and the variation in surface morphology of thin-film gold electrodes was observed by scanning electron microscopy. Impedance spectra obtained at different temperatures were translated into equivalent circuit models by applying complex nonlinear least square curve-fitting algorithm. Principal component analysis was applied to deduce the classification of the parameters affected due to the annealing process and to evaluate the performance stability using mathematical model. Physics of the thermodynamic annealing was discussed based on the surface activation energies. The post anneal testing of the sensors validated the achieved stability in impedance measurement. © 2001-2012 IEEE.
A feasibility study on age-related factors of wrist pulse using principal component analysis.
Jang-Han Bae; Young Ju Jeon; Sanghun Lee; Jaeuk U Kim
2016-08-01
Various analysis methods for examining wrist pulse characteristics are needed for accurate pulse diagnosis. In this feasibility study, principal component analysis (PCA) was performed to observe age-related factors of the wrist pulse from various analysis parameters. Forty subjects in their 20s and 40s participated, and their wrist pulse and respiration signals were acquired with a pulse tonometric device. After pre-processing of the signals, twenty analysis parameters that have been regarded as reflecting pulse characteristics were calculated and PCA was performed. As a result, we could reduce the complex parameters to a lower dimension, and age-related factors of the wrist pulse were observed by combining new analysis parameters derived from PCA. These results demonstrate that PCA can be a useful tool for analyzing wrist pulse signals.
Directory of Open Access Journals (Sweden)
A. Bhushan
2015-07-01
In this paper, we address outliers in spatiotemporal data streams obtained from sensors placed across geographically distributed locations. Outliers may appear in such sensor data for various reasons, such as instrumental error and environmental change. Real-time detection of these outliers is essential to prevent propagation of errors in subsequent analyses and results. Incremental Principal Component Analysis (IPCA) is one possible approach for detecting outliers in this type of spatiotemporal data stream. IPCA has been widely used in many real-time applications such as credit card fraud detection, pattern recognition, and image analysis. However, the suitability of applying IPCA for outlier detection in spatiotemporal data streams is unknown and needs to be investigated. To fill this research gap, this paper contributes by presenting two new IPCA-based outlier detection methods and performing a comparative analysis with the existing IPCA-based outlier detection methods to assess their suitability for spatiotemporal sensor data streams.
Abdullah, Nurul Azma; Saidi, Md. Jamri; Rahman, Nurul Hidayah Ab; Wen, Chuah Chai; Hamid, Isredza Rahmi A.
2017-10-01
In practice, identification of criminals in Malaysia is done through thumbprint identification. However, this type of identification is constrained, as most criminals nowadays are clever enough not to leave their thumbprints at the scene. With the advent of security technology, cameras, especially CCTV, have been installed in many public and private areas to provide surveillance. CCTV footage can be used to identify suspects at the scene. However, because little software has been developed to automatically detect the similarity between a photo in the footage and recorded photos of criminals, law enforcement relies on thumbprint identification. In this paper, an automated facial recognition system for a criminal database is proposed using the known Principal Component Analysis approach. This system will be able to detect and recognize faces automatically. This will help law enforcement to detect or recognize suspects when no thumbprint is present at the scene. The results show that about 80% of input photos can be matched with the template data.
Principal component analysis of air particulate data from the industrial area of Islamabad, Pakistan
International Nuclear Information System (INIS)
Waheed, S.; Siddique, N.; Daud, M.
2008-01-01
A Gent air sampler was used to collect 72 pairs of size-fractionated coarse and fine (PM10 and PM2.5) particulate mass samples from the industrial zone (sector I-9) of Islamabad. These samples were analyzed for their elemental composition using Instrumental Neutron Activation Analysis (INAA). Principal component analysis (PCA), which can be used for source apportionment of quantified elemental data, was used to interpret the data. Graphical representations of loadings were used to explain the data by grouping elements from the same source. The present work shows well-defined elemental fingerprints of suspended soil and road dust, industry, motor vehicle exhaust and tyres, and coal and refuse combustion for the studied locality of Islamabad. (author)
Learning representative features for facial images based on a modified principal component analysis
Averkin, Anton; Potapov, Alexey
2013-05-01
The paper is devoted to facial image analysis and particularly deals with the problem of automatic evaluation of the attractiveness of human faces. We propose a new approach for automatic construction of a feature space based on a modified principal component analysis. Input data sets for the algorithm are learning data sets of facial images that are rated by one person. The proposed approach allows one to extract features of the individual's subjective perception of facial beauty and to predict attractiveness values for new facial images that were not included in the learning data set. The Pearson correlation coefficient between the values predicted by our method for new facial images and the personal attractiveness estimates equals 0.89. This means that the proposed approach is promising and can be used for predicting subjective face attractiveness values in real facial image analysis systems.
DEFF Research Database (Denmark)
Kotwa, Ewelina Katarzyna; Jørgensen, Bo Munk; Brockhoff, Per B.
2013-01-01
In this paper, we introduce a new method, based on spherical principal component analysis (S‐PCA), for the identification of Rayleigh and Raman scatters in fluorescence excitation–emission data. These scatters should be found and eliminated as a prestep before fitting parallel factor analysis...... models to the data, in order to avoid model degeneracies. The work is inspired and based on a previous research, where scatter removal was automatic (based on a robust version of PCA called ROBPCA) and required no visual data inspection but appeared to be computationally intensive. To overcome...... this drawback, we implement the fast S‐PCA in the scatter identification routine. Moreover, an additional pattern interpolation step that complements the method, based on robust regression, will be applied. In this way, substantial time savings are gained, and the user's engagement is restricted to a minimum...
Directory of Open Access Journals (Sweden)
Shengkun Xie
2014-01-01
Classification of electroencephalography (EEG) is the most useful diagnostic and monitoring procedure for epilepsy study. A reliable algorithm that can be easily implemented is the key to this procedure. In this paper a novel signal feature extraction method based on dynamic principal component analysis and a nonoverlapping moving window is proposed. Along with this new technique, two detection methods based on extracted sparse features are applied to deal with signal classification. The obtained results demonstrate that our proposed methodologies are able to differentiate EEGs from controls and interictal for epilepsy diagnosis and to separate EEGs from interictal and ictal for seizure detection. Our approach yields high classification accuracy for both single-channel short-term EEGs and multichannel long-term EEGs. The classification performance of the method is also compared with other state-of-the-art techniques on the same datasets, and the effect of signal variability on the presented methods is also studied.
He, A.; Quan, C.
2018-04-01
The principal component analysis (PCA) and region matching combined method is effective for fringe direction estimation. However, its mask construction algorithm for region matching fails in some circumstances, and the algorithm for conversion of orientation to direction in mask areas is computationally-heavy and non-optimized. We propose an improved PCA based region matching method for the fringe direction estimation, which includes an improved and robust mask construction scheme, and a fast and optimized orientation-direction conversion algorithm for the mask areas. Along with the estimated fringe direction map, filtered fringe pattern by automatic selective reconstruction modification and enhanced fast empirical mode decomposition (ASRm-EFEMD) is used for Hilbert spiral transform (HST) to demodulate the phase. Subsequently, windowed Fourier ridge (WFR) method is used for the refinement of the phase. The robustness and effectiveness of proposed method are demonstrated by both simulated and experimental fringe patterns.
Portable XRF and principal component analysis for bill characterization in forensic science.
Appoloni, C R; Melquiades, F L
2014-02-01
Several modern techniques have been applied to prevent counterfeiting of money bills. The objective of this study was to demonstrate the potential of the Portable X-ray Fluorescence (PXRF) technique and the multivariate analysis method of Principal Component Analysis (PCA) for classification of bills, for use in forensic science. Dollar, Euro and Real (Brazilian currency) bills were measured directly at different colored regions, without any previous preparation. Spectra interpretation allowed the identification of Ca, Ti, Fe, Cu, Sr, Y, Zr and Pb. PCA separated the bills into three groups, with subgroups among the Brazilian currency. In conclusion, the samples were classified according to their origin, identifying the elements responsible for differentiation and the basic pigment composition. PXRF allied to multivariate discriminant methods is a promising technique for rapid and non-destructive identification of false bills in forensic science. Copyright © 2013 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Wenjing Zhao
2018-01-01
The SGK (sequential generalization of K-means) dictionary learning denoising algorithm is characterized by fast denoising speed and excellent denoising performance. However, the noise standard deviation must be known in advance when using the SGK algorithm to process an image. This paper presents a denoising algorithm that combines SGK dictionary learning with principal component analysis (PCA) noise estimation. First, the noise standard deviation of the image is estimated using the PCA noise estimation algorithm. It is then used in the SGK dictionary learning algorithm. Experimental results show the following: (1) The SGK algorithm has the best denoising performance compared with the other three dictionary learning algorithms. (2) The SGK algorithm combined with PCA is superior to the SGK algorithm combined with other noise estimation algorithms. (3) Compared with the original SGK algorithm, the proposed algorithm has higher PSNR and better denoising performance.
QIM blind video watermarking scheme based on Wavelet transform and principal component analysis
Directory of Open Access Journals (Sweden)
Nisreen I. Yassin
2014-12-01
In this paper, a blind scheme for digital video watermarking is proposed. The security of the scheme is established by using one secret key in the retrieval of the watermark. The Discrete Wavelet Transform (DWT) is applied to each video frame, decomposing it into a number of sub-bands. Maximum entropy blocks are selected and transformed using Principal Component Analysis (PCA). Quantization Index Modulation (QIM) is used to quantize the maximum coefficient of the PCA blocks of each sub-band. Then, the watermark is embedded into the selected suitable quantizer values. The proposed scheme is tested using a number of video sequences. Experimental results show high imperceptibility. The computed average PSNR exceeds 45 dB. Finally, the scheme is applied to two medical videos. The proposed scheme shows high robustness against several attacks such as JPEG coding, Gaussian noise addition, histogram equalization, gamma correction, and contrast adjustment for both regular videos and medical videos.
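The quantization step of such a scheme can be illustrated with a basic scalar Quantization Index Modulation, in which each bit selects one of two interleaved quantizer lattices. The sketch below covers only that step, under stated assumptions: the coefficients are random stand-ins rather than PCA coefficients of wavelet sub-bands, the step size is hypothetical, and the secret key and block selection from the abstract are not modelled.

```python
import numpy as np

def qim_embed(value, bit, step):
    """Quantize value onto the lattice selected by bit (0 or 1)."""
    offset = bit * step / 2.0  # lattice for bit 1 is shifted by half a step
    return step * np.round((value - offset) / step) + offset

def qim_extract(value, step):
    """Blind extraction: pick the bit whose lattice lies closest to value."""
    d0 = np.abs(value - qim_embed(value, 0, step))
    d1 = np.abs(value - qim_embed(value, 1, step))
    return (d1 < d0).astype(int)

# Hypothetical host coefficients and watermark bits.
rng = np.random.default_rng(1)
coeffs = rng.normal(0.0, 50.0, size=64)
bits = rng.integers(0, 2, size=64)
step = 8.0

marked = qim_embed(coeffs, bits, step)
# A mild attack: perturbations smaller than a quarter step cannot flip a bit.
attacked = marked + rng.uniform(-0.2 * step, 0.2 * step, size=64)
recovered = qim_extract(attacked, step)
```

A larger step tolerates stronger attacks but changes each coefficient by up to half a step, which is the robustness versus imperceptibility trade-off that the PSNR figure in the abstract reflects.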
An analytics of electricity consumption characteristics based on principal component analysis
Feng, Junshu
2018-02-01
More detailed analysis of electricity consumption characteristics can make demand side management (DSM) much more targeted. In this paper, an analytics of electricity consumption characteristics based on principal component analysis (PCA) is given, in which the PCA method is used to extract the main typical characteristics of electricity consumers. Then, an electricity consumption characteristics matrix is designed, which allows comparison of typical electricity consumption characteristics between different types of consumers, such as industrial consumers, commercial consumers and residents. In our case study, electricity consumption has been divided into four main characteristics: extreme peak using, peak using, peak-shifting using and others. Moreover, it has been found that industrial consumers often shift their peak load, while commercial and residential consumers have more peak-time consumption. The conclusions can provide decision support for DSM to the government and power providers.
Neural Network for Principal Component Analysis with Applications in Image Compression
Directory of Open Access Journals (Sweden)
Luminita State
2007-04-01
Classical feature extraction and data projection methods have been extensively investigated in the pattern recognition and exploratory data analysis literature. Feature extraction and multivariate data projection help to avoid the "curse of dimensionality", improve the generalization ability of classifiers and significantly reduce the computational requirements of pattern classifiers. During the past decade a large number of artificial neural networks and learning algorithms have been proposed for solving feature extraction problems, most of them being adaptive in nature and well suited to the many real environments where an adaptive approach is required. Principal Component Analysis, also called the Karhunen-Loeve transform, is a well-known statistical method for feature extraction, data compression and multivariate data projection, and it has so far been used broadly in signal and image processing, pattern recognition and data analysis applications.
Ferrero, A; Campos, J; Rabal, A M; Pons, A; Hernanz, M L; Corróns, A
2011-09-26
The Bidirectional Reflectance Distribution Function (BRDF) is essential to characterize an object's reflectance properties. This function depends both on the various illumination-observation geometries as well as on the wavelength. As a result, the comprehensive interpretation of the data becomes rather complex. In this work we assess the use of the multivariable analysis technique of Principal Components Analysis (PCA) applied to the experimental BRDF data of a ceramic colour standard. It will be shown that the result may be linked to the various reflection processes occurring on the surface, assuming that the incoming spectral distribution is affected by each one of these processes in a specific manner. Moreover, this procedure facilitates the task of interpolating a series of BRDF measurements obtained for a particular sample. © 2011 Optical Society of America
Energy Technology Data Exchange (ETDEWEB)
Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Sklute, Elizabeth [MT HOLYOKE COLLEGE; Dyare, Melinda D [MT HOLYOKE COLLEGE
2008-01-01
Quantitative analysis with LIBS traditionally employs calibration curves that are complicated by the chemical matrix effects. These chemical matrix effects influence the LIBS plasma and the ratio of elemental composition to elemental emission line intensity. Consequently, LIBS calibration typically requires a priori knowledge of the unknown, in order for a series of calibration standards similar to the unknown to be employed. In this paper, three new Multivariate Analysis (MVA) techniques are employed to analyze the LIBS spectra of 18 disparate igneous and highly-metamorphosed rock samples. Partial Least Squares (PLS) analysis is used to generate a calibration model from which unknown samples can be analyzed. Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are employed to generate a model and predict the rock type of the samples. These MVA techniques appear to exploit the matrix effects associated with the chemistries of these 18 samples.
Xu, Shaoping; Zeng, Xiaoxia; Jiang, Yinnan; Tang, Yiling
2018-01-01
We proposed a noniterative principal component analysis (PCA)-based noise level estimation (NLE) algorithm that addresses the problem of estimating the noise level with a two-step scheme. First, we randomly extracted a number of raw patches from a given noisy image and took the smallest eigenvalue of the covariance matrix of the raw patches as the preliminary estimation of the noise level. Next, the final estimation was directly obtained with a nonlinear mapping (rectification) function that was trained on some representative noisy images corrupted with different known noise levels. Compared with the state-of-art NLE algorithms, the experiment results show that the proposed NLE algorithm can reliably infer the noise level and has robust performance over a wide range of image contents and noise levels, showing a good compromise between speed and accuracy in general.
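The first step described in this abstract, taking the smallest eigenvalue of the covariance matrix of randomly extracted raw patches as a preliminary noise estimate, can be sketched as follows. This covers only that preliminary step: the trained nonlinear rectification function is not reproduced, and the patch size, patch count and synthetic test image are hypothetical choices.

```python
import numpy as np

def estimate_noise_sigma(image, patch=5, n_patches=5000, seed=0):
    """Preliminary noise standard deviation from patch-covariance eigenvalues."""
    image = np.asarray(image, dtype=float)
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Random top-left corners of the raw patches.
    rows = rng.integers(0, h - patch + 1, size=n_patches)
    cols = rng.integers(0, w - patch + 1, size=n_patches)
    patches = np.stack(
        [image[r:r + patch, c:c + patch].ravel() for r, c in zip(rows, cols)]
    )
    cov = np.cov(patches, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; the smallest one is
    # taken as the noise variance, since image structure inflates the others.
    smallest = np.linalg.eigvalsh(cov)[0]
    return float(np.sqrt(max(smallest, 0.0)))

# Hypothetical test image: a smooth ramp corrupted by Gaussian noise, sigma = 10.
rng = np.random.default_rng(3)
yy, xx = np.mgrid[0:256, 0:256]
clean = 0.3 * xx + 0.2 * yy
noisy = clean + 10.0 * rng.standard_normal(clean.shape)
sigma_hat = estimate_noise_sigma(noisy)
```

The smallest sample eigenvalue tends to underestimate the true noise variance, which is consistent with the abstract's use of a second, trained correction step.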
Zia, Asif Iqbal; Mukhopadhyay, Subhas Chandra; Yu, Paklam; Al-Bahadly, Ibrahim H.; Gooneratne, Chinthaka Pasan; Kosel, Jürgen
2015-01-01
The surface roughness of thin-film gold electrodes induces instability in impedance spectroscopy measurements of capacitive interdigital printable sensors. Post-fabrication thermodynamic annealing was carried out at temperatures ranging from 30 °C to 210 °C in a vacuum oven and the variation in surface morphology of thin-film gold electrodes was observed by scanning electron microscopy. Impedance spectra obtained at different temperatures were translated into equivalent circuit models by applying complex nonlinear least square curve-fitting algorithm. Principal component analysis was applied to deduce the classification of the parameters affected due to the annealing process and to evaluate the performance stability using mathematical model. Physics of the thermodynamic annealing was discussed based on the surface activation energies. The post anneal testing of the sensors validated the achieved stability in impedance measurement. © 2001-2012 IEEE.
Mapping brain activity in gradient-echo functional MRI using principal component analysis
Khosla, Deepak; Singh, Manbir; Don, Manuel
1997-05-01
The detection of sites of brain activation in functional MRI has been a topic of immense research interest and many techniques have been proposed to this end. Recently, principal component analysis (PCA) has been applied to extract the activated regions and their time course of activation. This method is based on the assumption that the activation is orthogonal to other signal variations such as brain motion, physiological oscillations and other uncorrelated noises. A distinct advantage of this method is that it does not require any knowledge of the time course of the true stimulus paradigm. This technique is well suited to EPI image sequences where the sampling rate is high enough to capture the effects of physiological oscillations. In this work, we propose and apply two methods that are based on PCA to conventional gradient-echo images and investigate their usefulness as tools to extract reliable information on brain activation. The first method is a conventional technique where a single image sequence with alternating on and off stages is subjected to a principal component analysis. The second method is a PCA-based approach called the common spatial factor analysis technique (CSF). As the name suggests, this method relies on common spatial factors between the above fMRI image sequence and a background fMRI. We have applied these methods to identify active brain areas during visual stimulation and motor tasks. The results from these methods are compared to those obtained by using the standard cross-correlation technique. We found good agreement in the areas identified as active across all three techniques. The results suggest that PCA and CSF methods have good potential in detecting the true stimulus correlated changes in the presence of other interfering signals.
Directory of Open Access Journals (Sweden)
Yang Zhao
GWAS has greatly facilitated the discovery of risk SNPs associated with complex diseases. Traditional methods analyze SNPs individually and are limited by low power and reproducibility since correction for multiple comparisons is necessary. Several methods have been proposed based on grouping SNPs into SNP sets using biological knowledge and/or genomic features. In this article, we compare the linear kernel machine based test (LKM) and the principal components analysis based approach (PCA) using simulated datasets under the scenarios of 0 to 3 causal SNPs, as well as simple and complex linkage disequilibrium (LD) structures of the simulated regions. Our simulation study demonstrates that both LKM and PCA can control the type I error at the significance level of 0.05. If the causal SNP is in strong LD with the genotyped SNPs, both the PCA with a small number of principal components (PCs) and the LKM with kernel of linear or identical-by-state function are valid tests. However, if the LD structure is complex, such as several LD blocks in the SNP set, or when the causal SNP is not in the LD block in which most of the genotyped SNPs reside, more PCs should be included to capture the information of the causal SNP. Simulation studies also demonstrate the ability of LKM and PCA to combine information from multiple causal SNPs and to provide increased power over individual SNP analysis. We also apply LKM and PCA to analyze two SNP sets extracted from an actual GWAS dataset on non-small cell lung cancer.
Directory of Open Access Journals (Sweden)
D. Kurtzman
2012-03-01
Two sequential multilevel profiles were obtained in an observation well opened to a 130-m thick, unconfined, contaminated aquifer in Tel Aviv, Israel. While the general profile characteristics of major ions, trace elements, and volatile organic compounds were maintained in the two sampling campaigns conducted 295 days apart, the vertical locations of high concentration gradients were shifted between the two profiles. Principal component analysis (PCA) of the chemical variables resulted in a first principal component which was responsible for ∼60% of the variability, and was highly correlated with depth. PCA revealed three distinct depth-dependent water bodies in both multilevel profiles, which were found to have shifted vertically between the sampling events. This shift cut across a clayey bed which separated the top and intermediate water bodies in the first profile, and was located entirely within the intermediate water body in the second profile. Continuous electrical conductivity monitoring in a packed-off section of the observation well revealed an event in which a distinct water body flowed through the monitored section (v ∼ 150 m yr^{−1}). It was concluded that the observed changes in the profiles result from dominantly lateral flow of water bodies in the aquifer rather than vertical flow. The significance of this study is twofold: (a) it demonstrates the utility of sequential multilevel observations from deep wells and the efficacy of PCA for evaluating the data; (b) it shows that distinct water bodies of 10 to 100 m vertical and horizontal dimensions flow under contaminated sites, which has implications for monitoring and remediation.
Characterization of soil chemical properties of strawberry fields using principal component analysis
Directory of Open Access Journals (Sweden)
Gláucia Oliveira Islabão
2013-02-01
One of the largest strawberry-producing municipalities of Rio Grande do Sul (RS) is Turuçu, in the South of the State. The strawberry production system adopted by farmers is similar to that used in other regions in Brazil and in the world. The main difference is related to the soil management, which can change the soil chemical properties during the strawberry cycle. This study had the objective of assessing the spatial and temporal distribution of soil fertility parameters using principal component analysis (PCA). Soil sampling was based on topography, dividing the field into three thirds: upper, middle and lower. From each of these thirds, five soil samples were randomly collected in the 0-0.20 m layer, to form a composite sample for each third. Four samples were taken during the strawberry cycle and the following properties were determined: soil organic matter (OM), soil total nitrogen (N), available phosphorus (P) and potassium (K), exchangeable calcium (Ca) and magnesium (Mg), soil pH (pH), cation exchange capacity (CEC) at pH 7.0, soil base (V%) and soil aluminum saturation (m%). No spatial variation was observed for any of the studied soil fertility parameters in the strawberry fields and temporal variation was only detected for available K. Phosphorus and K contents were always high or very high from the beginning of the strawberry cycle, while pH values ranged from very low to very high. Principal component analysis allowed the clustering of all strawberry fields based on variables related to soil acidity and organic matter content.
Use of a Principal Components Analysis for the Generation of Daily Time Series.
Dreveton, Christine; Guillou, Yann
2004-07-01
A new approach for generating daily time series is considered in response to the weather-derivatives market. This approach consists of performing a principal components analysis to create independent variables, the values of which are then generated separately with a random process. Weather derivatives are financial or insurance products that give companies the opportunity to cover themselves against adverse climate conditions. The aim of a generator is to provide a wider range of feasible situations to be used in an assessment of risk. Generation of a temperature time series is required by insurers or bankers for pricing weather options. The provision of conditional probabilities and a good representation of the interannual variance are the main challenges of a generator when used for weather derivatives. The generator was developed according to this new approach using a principal components analysis and was applied to the daily average temperature time series of the Paris-Montsouris station in France. The observed dataset was homogenized and the trend was removed to represent correctly the present climate. The results obtained with the generator show that it represents correctly the interannual variance of the observed climate; this is the main result of the work, because one of the main discrepancies of other generators is their inability to represent accurately the observed interannual climate variance; this discrepancy is not acceptable for an application to weather derivatives. The generator was also tested to calculate conditional probabilities: for example, the knowledge of the aggregated value of heating degree-days in the middle of the heating season allows one to estimate the probability of reaching a threshold at the end of the heating season. This represents the main application of a climate generator for use with weather derivatives.
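The generation idea described above (decorrelate with PCA, draw each component separately, transform back) can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the random process, so independent Gaussian draws are assumed here, and the "observed" data, the function names `fit_pca_generator` and `generate`, and all sizes are hypothetical.

```python
import numpy as np

def fit_pca_generator(series):
    """series: (n_years, n_days) array of detrended daily values."""
    mean = series.mean(axis=0)
    centered = series - mean
    # Rows of vt are the principal directions of the centered data.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Standard deviation of the scores along each principal direction.
    score_std = s / np.sqrt(series.shape[0] - 1)
    return mean, vt, score_std

def generate(mean, vt, score_std, n_series, seed=0):
    """Draw each component score independently (Gaussian, by assumption)."""
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((n_series, score_std.size)) * score_std
    return mean + scores @ vt

# Hypothetical "observed" record: 40 years of a 30-day window.
rng = np.random.default_rng(42)
days = np.arange(30)
seasonal = 5.0 + 3.0 * np.sin(2.0 * np.pi * days / 30.0)
observed = seasonal + 0.3 * rng.normal(0.0, 2.0, (40, 30)).cumsum(axis=1)

mean, vt, score_std = fit_pca_generator(observed)
synthetic = generate(mean, vt, score_std, n_series=20000)

obs_var = np.trace(np.cov(observed, rowvar=False))
syn_var = np.trace(np.cov(synthetic, rowvar=False))
print(f"total variance: observed {obs_var:.2f}, synthetic {syn_var:.2f}")
```

Because the scores are drawn with the observed component variances, the synthetic series reproduce the covariance structure of the observed record by construction; whether that is adequate for interannual variance depends on how the input matrix is arranged, which this sketch does not address.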
Institute of Scientific and Technical Information of China (English)
Nilanchal Patel; Brijesh Kumar Kaushal
2010-01-01
The classification accuracy of the various categories on classified remotely sensed images is usually evaluated by two different measures of accuracy, namely, producer's accuracy (PA) and user's accuracy (UA). The PA of a category indicates to what extent the reference pixels of the category are correctly classified, whereas the UA of a category represents to what extent the other categories are less misclassified into the category in question. Therefore, the UA of the various categories determines the reliability of their interpretation on the classified image and is more important to the analyst than the PA. The present investigation has been performed in order to determine if there occurs improvement in the UA of the various categories on the classified image of the principal components of the original bands and on the classified image of the stacked image of two different years. We performed the analyses using the IRS LISS Ⅲ images of two different years, i.e., 1996 and 2009, that represent the different magnitude of urbanization and the stacked image of these two years pertaining to Ranchi area, Jharkhand, India, with a view to assessing the impacts of urbanization on the UA of the different categories. The results of the investigation demonstrated that there occurs significant improvement in the UA of the impervious categories in the classified image of the stacked image, which is attributable to the aggregation of the spectral information from twice the number of bands from two different years. On the other hand, the classified image of the principal components did not show any improvement in the UA as compared to the original images.
Registration of dynamic dopamine D2 receptor images using principal component analysis
Energy Technology Data Exchange (ETDEWEB)
Acton, P.D.; Ell, P.J. [Institute of Nuclear Medicine, University College London Medical School, London (United Kingdom); Pilowsky, L.S.; Brammer, M.J. [Institute of Psychiatry, De Crespigny Park, London (United Kingdom); Suckling, J. [Clinical Age Research Unit, Kings College School of Medicine and Dentistry, London (United Kingdom)
1997-11-01
This paper describes a novel technique for registering a dynamic sequence of single-photon emission tomography (SPET) dopamine D2 receptor images, using principal component analysis (PCA). Conventional methods for registering images, such as count difference and correlation coefficient algorithms, fail to take into account the dynamic nature of the data, resulting in large systematic errors when registering time-varying images. However, by using principal component analysis to extract the temporal structure of the image sequence, misregistration can be quantified by examining the distribution of eigenvalues. The registration procedures were tested using a computer-generated dynamic phantom derived from a high-resolution magnetic resonance image of a realistic brain phantom. Each method was also applied to clinical SPET images of dopamine D2 receptors, using the ligands iodine-123 iodobenzamide and iodine-123 epidepride, to investigate the influence of misregistration on kinetic modelling parameters and the binding potential. The PCA technique gave highly significant (P < 0.001) improvements in image registration, leading to alignment errors in x and y of about 25% of the alternative methods, with reductions in autocorrelations over time. It could also be applied to align image sequences which the other methods failed completely to register, particularly iodine-123 epidepride scans. The PCA method produced data of much greater quality for subsequent kinetic modelling, with an improvement of nearly 50% in the χ² of the fit to the compartmental model, and provided superior quality registration of particularly difficult dynamic sequences. With 4 figs., 2 tabs., 26 refs.
Principal Component Analysis to Explore Climatic Variability and Dengue Outbreak in Lahore
Directory of Open Access Journals (Sweden)
Syed Afrozuddin Ahmed
2014-08-01
Various studies have reported that global warming causes an unstable climate and has many serious impacts on the physical environment and public health. The increasing incidence of dengue is now a priority health issue and has become a health burden for Pakistan. This study investigates whether the spatial pattern of the environment causes the emergence or increasing rate of dengue fever incidence that affects the population and its health. Principal component analysis is performed to find whether there is any general environmental factor or structure that could be involved in the emergence of dengue fever cases in the Pakistani climate. Principal component analysis is applied to find structure in the data for all four periods, i.e. 1980 to 2012, 1980 to 1995 and 1996 to 2012. The first three PCs for the periods (1980-2012, 1980-1994, 1995-2012) are almost the same and represent hot and windy weather. The PC1s of all dengue periods differ from each other. PC2 is the same for all periods and represents wetness in the weather. The PC3s differ and represent a combination of wet and windy weather. The PC4s for all periods indicate humid weather without rain. Among the climatic variables, only minimum temperature and maximum temperature are significantly correlated with daily dengue cases. PC1, PC3 and PC4 are highly significantly correlated with daily dengue cases.
Principal component analysis of MSBAS DInSAR time series from Campi Flegrei, Italy
Tiampo, Kristy F.; González, Pablo J.; Samsonov, Sergey; Fernández, Jose; Camacho, Antonio
2017-09-01
Because of its proximity to the city of Naples and with a population of nearly 1 million people within its caldera, Campi Flegrei is one of the highest risk volcanic areas in the world. Since the last major eruption in 1538, the caldera has undergone frequent episodes of ground subsidence and uplift accompanied by seismic activity that has been interpreted as the result of a stationary, deeper source below the caldera that feeds shallower eruptions. However, the location and depth of the deeper source is not well-characterized and its relationship to current activity is poorly understood. Recently, a significant increase in the uplift rate has occurred, resulting in almost 13 cm of uplift by 2013 (De Martino et al., 2014; Samsonov et al., 2014b; Di Vito et al., 2016). Here we apply a principal component decomposition to high resolution time series from the region produced by the advanced Multidimensional SBAS DInSAR technique in order to better delineate both the deeper source and the recent shallow activity. We analyzed both a period of substantial subsidence (1993-1999) and a second of significant uplift (2007-2013) and inverted the associated vertical surface displacement for the most likely source models. Results suggest that the underlying dynamics of the caldera changed in the late 1990s, from one in which the primary signal arises from a shallow deflating source above a deeper, expanding source to one dominated by a shallow inflating source. In general, the shallow source lies between 2700 and 3400 m below the caldera while the deeper source lies at 7600 m or more in depth. The combination of principal component analysis with high resolution MSBAS time series data allows for these new insights and confirms the applicability of both to areas at risk from dynamic natural hazards.
Principal components analysis based control of a multi-dof underactuated prosthetic hand
Directory of Open Access Journals (Sweden)
Magenes Giovanni
2010-04-01
Background: Functionality, controllability and cosmetics are the key issues to be addressed in order to accomplish a successful functional substitution of the human hand by means of a prosthesis. Not only should the prosthesis duplicate the human hand in shape, functionality, sensorization, perception and sense of body-belonging, but it should also be controlled as the natural one, in the most intuitive and undemanding way. At present, prosthetic hands are controlled by means of non-invasive interfaces based on electromyography (EMG). Driving a multi degrees of freedom (DoF) hand for achieving hand dexterity implies selectively modulating many different EMG signals in order to make each joint move independently, and this could require significant cognitive effort from the user. Methods: A Principal Components Analysis (PCA) based algorithm is used to drive a 16 DoFs underactuated prosthetic hand prototype (called CyberHand) with a two dimensional control input, in order to perform the three prehensile forms mostly used in Activities of Daily Living (ADLs). Such Principal Components set has been derived directly from the artificial hand by collecting its sensory data while performing 50 different grasps, and subsequently used for control. Results: Trials have shown that two independent input signals can be successfully used to control the posture of a real robotic hand and that correct grasps (in terms of involved fingers, stability and posture) may be achieved. Conclusions: This work demonstrates the effectiveness of a bio-inspired system successfully conjugating the advantages of an underactuated, anthropomorphic hand with a PCA-based control strategy, and opens up promising possibilities for the development of an intuitively controllable hand prosthesis.
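The control idea in the Methods paragraph (derive principal components from recorded grasp postures, then drive all joints from a two-dimensional input) can be sketched as follows. This is a hedged illustration, not the CyberHand controller: the recorded postures are synthetic, and the names `grasps`, `synergies` and `posture_from_input` as well as the joint-angle units are assumptions.

```python
import numpy as np

N_DOF = 16  # number of degrees of freedom, as stated in the abstract

# Hypothetical recordings: 50 grasp postures (joint angles in radians),
# generated here from two latent patterns plus a little sensor noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, N_DOF))
grasps = 0.8 + 0.3 * (latent @ mixing) + rng.normal(0.0, 0.01, (50, N_DOF))

# PCA of the recorded postures via SVD of the centered data.
mean_posture = grasps.mean(axis=0)
_, s, vt = np.linalg.svd(grasps - mean_posture, full_matrices=False)
synergies = vt[:2]  # first two principal components, shape (2, 16)
explained = float((s[:2] ** 2).sum() / (s ** 2).sum())

def posture_from_input(u):
    """Map a 2-D control input to a full 16-DoF posture."""
    return mean_posture + np.asarray(u, dtype=float) @ synergies

posture = posture_from_input([0.5, -0.2])
print(f"variance explained by two components: {explained:.3f}")
print("commanded posture shape:", posture.shape)
```

The design choice illustrated here is that the user only modulates two signals, while the principal components distribute that command across all joints.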
Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.
2013-06-01
This study encompasses columnar ozone modelling in peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the entire period (2003-2008), was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, which uses multiple regression together with principal component analysis (PCA) modelling, was used to predict columnar ozone. This combined approach was utilized to improve the prediction accuracy of columnar ozone. Separate analysis was carried out for north east monsoon (NEM) and south west monsoon (SWM) seasons. The O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameters as predictors. A variable selection method based on high loading of varimax rotated principal components was used to acquire subsets of the predictor variables to be included in the linear regression model of the atmospheric parameters. It was found that the increase in columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. The result of fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of the R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.
Li, Jun; Song, Minghui; Peng, Yuanxi
2018-03-01
Current infrared and visible image fusion methods do not achieve adequate information extraction, i.e., they cannot extract the target information from infrared images while retaining the background information from visible images. Moreover, most of them have high complexity and are time-consuming. This paper proposes an efficient image fusion framework for infrared and visible images on the basis of robust principal component analysis (RPCA) and compressed sensing (CS). The novel framework consists of three phases. First, RPCA decomposition is applied to the infrared and visible images to obtain their sparse and low-rank components, which represent the salient features and background information of the images, respectively. Second, the sparse and low-rank coefficients are fused by different strategies. On the one hand, the measurements of the sparse coefficients are obtained by the random Gaussian matrix, and they are then fused by the standard deviation (SD) based fusion rule. Next, the fused sparse component is obtained by reconstructing the result of the fused measurement using the fast continuous linearized augmented Lagrangian algorithm (FCLALM). On the other hand, the low-rank coefficients are fused using the max-absolute rule. Third, the fused image is obtained by superposing the fused sparse and low-rank components. For comparison, several popular fusion algorithms are tested experimentally. By comparing the fused results subjectively and objectively, we find that the proposed framework can extract the infrared targets while retaining the background information in the visible images. Thus, it exhibits state-of-the-art performance in terms of both fusion effects and timeliness.
Directory of Open Access Journals (Sweden)
Rodrigo Reis Mota
2016-09-01
The aim of this research was to evaluate the dimensional reduction of additive direct genetic covariance matrices in genetic evaluations of growth traits (range 100-730 days) in Simmental cattle using principal components, as well as to estimate (co)variance components and genetic parameters. Principal component analyses were conducted for five different models: one full and four reduced-rank models. Models were compared using Akaike information (AIC) and Bayesian information (BIC) criteria. Variance components and genetic parameters were estimated by restricted maximum likelihood (REML). The AIC and BIC values were similar among models. This indicated that parsimonious models could be used in genetic evaluations in Simmental cattle. The first principal component explained more than 96% of total variance in both models. Heritability estimates were higher for advanced ages and varied from 0.05 (100 days) to 0.30 (730 days). Genetic correlation estimates were similar in both models regardless of magnitude and number of principal components. The first principal component was sufficient to explain almost all genetic variance. Furthermore, genetic parameter similarities and lower computational requirements allowed for parsimonious models in genetic evaluations of growth traits in Simmental cattle.
Directory of Open Access Journals (Sweden)
Selin Aviyente
2010-01-01
Joint time-frequency representations offer a rich representation of event related potentials (ERPs) that cannot be obtained through individual time or frequency domain analysis. This representation, however, comes at the expense of increased data volume and the difficulty of interpreting the resulting representations. Therefore, methods that can reduce the large amount of time-frequency data to experimentally relevant components are essential. In this paper, we present a method that reduces the large volume of ERP time-frequency data into a few significant time-frequency parameters. The proposed method is based on applying the widely used matching pursuit (MP) approach, with a Gabor dictionary, to principal components extracted from the time-frequency domain. The proposed PCA-Gabor decomposition is compared with other time-frequency data reduction methods such as the time-frequency PCA approach alone and standard matching pursuit methods using a Gabor dictionary for both simulated and biological data. The results show that the proposed PCA-Gabor approach performs better than either the PCA alone or the standard MP data reduction methods, by using the smallest amount of ERP data variance to produce the strongest statistical separation between experimental conditions.
Medina, José M; Díaz, José A; Vukusic, Pete
2015-04-20
Iridescent structural colors in biology exhibit sophisticated spatially-varying reflectance properties that depend on both the illumination and viewing angles. The classification of such spectral and spatial information in iridescent structurally colored surfaces is important to elucidate the functional role of irregularity and to improve understanding of color pattern formation at different length scales. In this study, we propose a non-invasive method for the spectral classification of spatial reflectance patterns at the micron scale based on the multispectral imaging technique and the principal component analysis similarity factor (PCASF). We demonstrate the effectiveness of this approach and its component methods by detailing its use in the study of the angle-dependent reflectance properties of Pavo cristatus (the common peacock) feathers, a species of peafowl very well known to exhibit bright and saturated iridescent colors. We show that multispectral reflectance imaging and PCASF approaches can be used as effective tools for spectral recognition of iridescent patterns in the visible spectrum and provide meaningful information for spectral classification of the irregularity of the microstructure in iridescent plumage.
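A PCA similarity factor compares two data sets through the angles between their leading principal subspaces. The abstract does not give the formula, so the sketch below assumes the commonly used form, trace(LᵀMMᵀL)/k, where L and M hold the first k principal directions of each data set; the function names and the synthetic "spectra" are hypothetical.

```python
import numpy as np

def principal_axes(data, k):
    """Return the first k principal directions as columns of a (p, k) matrix."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def pca_similarity_factor(data_a, data_b, k):
    """Assumed definition: mean squared cosine between the two k-dim subspaces."""
    L = principal_axes(data_a, k)
    M = principal_axes(data_b, k)
    return float(np.trace(L.T @ M @ M.T @ L) / k)

# Synthetic example (assumption): 200 samples in 6 spectral channels each.
rng = np.random.default_rng(0)
region_a = rng.normal(size=(200, 6)) * np.array([10, 8, 0.1, 0.1, 0.1, 0.1])
region_b = rng.normal(size=(200, 6)) * np.array([0.1, 0.1, 10, 8, 0.1, 0.1])

s_aa = pca_similarity_factor(region_a, region_a, k=2)
s_ab = pca_similarity_factor(region_a, region_b, k=2)
print(f"self-similarity = {s_aa:.3f}, cross-similarity = {s_ab:.3f}")
```

Under this definition the factor ranges from 0 (orthogonal subspaces) to 1 (identical subspaces), which is what makes it usable as a classification score.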
New Role of Thermal Mapping in Winter Maintenance with Principal Components Analysis
Directory of Open Access Journals (Sweden)
Mario Marchetti
2014-01-01
Thermal mapping uses IR thermometry to measure road pavement temperature at a high resolution to identify and to map sections of the road network prone to ice occurrence. However, measurements are time-consuming and ultimately only provide a snapshot of road conditions at the time of the survey. As such, there is a need for surveys to be restricted to a series of specific climatic conditions during winter. Typically, five to six surveys are used, but it is questionable whether the full range of atmospheric conditions is adequately covered. This work investigates the role of statistics in adding value to thermal mapping data. Principal components analysis is used to interpolate between individual thermal mapping surveys to build a thermal map (or even a road surface temperature forecast) for a wider range of climatic conditions than that permitted by traditional surveys. The results indicate that when this approach is used, fewer thermal mapping surveys are actually required. Furthermore, comparisons with numerical models indicate that this approach could yield a suitable verification method for the spatial component of road weather forecasts, a key issue currently in winter road maintenance.
Principal Component Analysis of Chinese Porcelains from the Five Dynasties to the Qing Dynasty
Yap, C. T.; Hua, Younan
1992-10-01
This is a study of the possibility of identifying antique Chinese porcelains according to the period or dynasty, using major and minor chemical components (SiO2, Al2O3, Fe2O3, K2O, Na2O, CaO and MgO) from the body of the porcelain. Principal component analysis is applied to published data on 66 pieces of Chinese porcelains made in Jingdezhen during the Five Dynasties and the Song, Yuan, Ming and Qing Dynasties. It is shown that porcelains made during the Five Dynasties and the Yuan (or Ming) and Qing Dynasties can be segregated completely without any overlap. However, there is appreciable overlap between the Five Dynasties and the Song Dynasty, some overlap between the Song and Ming Dynasties and also between the Yuan and Ming Dynasties. Interestingly, Qing porcelains are well separated from all the others. The percentage of silica in the porcelain body decreases and that of alumina increases with recentness with the exception of the Yuan and Ming Dynasties, where this trend is reversed.
Krishnan, M.; Bhowmik, B.; Hazra, B.; Pakrashi, V.
2018-02-01
In this paper, a novel baseline-free approach for continuous online damage detection of multi degree of freedom vibrating structures using Recursive Principal Component Analysis (RPCA) in conjunction with Time Varying Auto-Regressive Modeling (TVAR) is proposed. In this method, the acceleration data are used to obtain recursive proper orthogonal components online using the rank-one perturbation method, followed by TVAR modeling of the first transformed response, to detect the change in the dynamic behavior of the vibrating system from its pristine state to contiguous linear/non-linear states that indicate damage. Most of the works available in the literature deal with algorithms that require windowing of the gathered data owing to their data-driven nature, which renders them ineffective for online implementation. Algorithms focussed on mathematically consistent recursive techniques in a rigorous theoretical framework of structural damage detection are missing, which motivates the development of the present framework, which is amenable to online implementation and could be utilized along with a suite of experimental and numerical investigations. The RPCA algorithm iterates the eigenvector and eigenvalue estimates for the sample covariance matrices and the new data point at each successive time instant, using the rank-one perturbation method. TVAR modeling on the principal component explaining maximum variance is utilized and the damage is identified by tracking the TVAR coefficients. This eliminates the need for offline post processing and facilitates online damage detection especially when applied to streaming data without requiring any baseline data. Numerical simulations performed on a 5-dof nonlinear system under white noise excitation and El Centro (also known as 1940 Imperial Valley earthquake) excitation, for different damage scenarios, demonstrate the robustness of the proposed algorithm. The method is further validated on results obtained from case studies involving
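The recursive idea in this abstract (update the covariance with each new sample instead of re-processing a window) can be sketched as follows. This is a simplified stand-in, not the authors' algorithm: the covariance is updated by a rank-one term per sample, but the eigenpairs are then recomputed with a full eigendecomposition rather than by a rank-one perturbation eigen-update, and the TVAR stage is omitted. The class name and the simulated 5-channel stream are assumptions.

```python
import numpy as np

class RecursiveCovariancePCA:
    """Streaming mean/covariance with a rank-one update per sample."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean              # deviation from the old mean
        self.mean += delta / self.n
        # Rank-one correction of the scatter matrix (Welford-style update).
        self.scatter += np.outer(delta, x - self.mean)

    def components(self):
        """Eigenvalues (descending) and eigenvectors of the sample covariance."""
        cov = self.scatter / (self.n - 1)
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]
        return eigvals[order], eigvecs[:, order]

# Hypothetical stream: 500 acceleration samples from a 5-channel system.
rng = np.random.default_rng(0)
stream = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))

rpca = RecursiveCovariancePCA(dim=5)
for sample in stream:
    rpca.update(sample)

eigvals, eigvecs = rpca.components()
batch_eigvals = np.sort(np.linalg.eigvalsh(np.cov(stream, rowvar=False)))[::-1]
print("streaming eigenvalues:", np.round(eigvals, 3))
```

After the last sample the streaming estimate coincides with the batch covariance, so the first eigenvector could be used at any time instant to form the transformed response that a later modelling stage would track.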
Directory of Open Access Journals (Sweden)
Anna Maria Stellacci
2012-07-01
Hyperspectral (HS) data represent an extremely powerful means for rapidly detecting crop stress and thereby aiding in the rational management of natural resources in agriculture. However, the large volume of data poses a challenge for data processing and for extracting crucial information. Multivariate statistical techniques can play a key role in the analysis of HS data, as they may make it possible both to eliminate redundant information and to identify synthetic indices which maximize differences among levels of stress. In this paper we propose an integrated approach, based on the combined use of Principal Component Analysis (PCA) and Canonical Discriminant Analysis (CDA), to investigate HS plant response and discriminate plant status. The approach was preliminarily evaluated on a data set collected on durum wheat plants grown under different nitrogen (N) stress levels. Hyperspectral measurements were performed at anthesis through a high resolution field spectroradiometer, ASD FieldSpec HandHeld, covering the 325-1075 nm region. Reflectance data were first restricted to the interval 510-1000 nm and then divided into five bands of the electromagnetic spectrum [green: 510-580 nm; yellow: 581-630 nm; red: 631-690 nm; red-edge: 705-770 nm; near-infrared (NIR): 771-1000 nm]. PCA was applied to each spectral interval. CDA was performed on the extracted components to identify the factors maximizing the differences among plants fertilised with increasing N rates. Within the intervals of green, yellow and red only the first principal component (PC) had an eigenvalue greater than 1 and explained more than 95% of total variance; within the ranges of red-edge and NIR, the first two PCs had an eigenvalue higher than 1. Two canonical variables explained cumulatively more than 81% of total variance and the first was able to discriminate wheat plants differently fertilised, as confirmed also by the significant correlation with aboveground biomass and grain yield parameters. The combined
Pöyhönen, Antti; Häkkinen, Jukka T; Koskimäki, Juha; Hakama, Matti; Tammela, Teuvo L J; Auvinen, Anssi
2013-03-01
WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The ICS has divided LUTS into three groups: storage, voiding and post-micturition symptoms. The classification is based on anatomical, physiological and urodynamic considerations of a theoretical nature. We used principal component analysis (PCA) to determine the inter-correlations of various LUTS, which is a novel approach to research and can strengthen existing knowledge of the phenomenology of LUTS. After we had completed our analyses, another study was published that used a similar approach and results were very similar to those of the present study. We evaluated the constellation of LUTS using PCA of the data from a population-based study that included >4000 men. In our analysis, three components emerged from the 12 LUTS: voiding, storage and incontinence components. Our results indicated that incontinence may be separate from the other storage symptoms and post-micturition symptoms should perhaps be regarded as voiding symptoms. To determine how lower urinary tract symptoms (LUTS) relate to each other and assess if the classification proposed by the International Continence Society (ICS) is consistent with empirical findings. The information on urinary symptoms for this population-based study was collected using a self-administered postal questionnaire in 2004. The questionnaire was sent to 7470 men, aged 30-80 years, from Pirkanmaa County (Finland), of whom 4384 (58.7%) returned the questionnaire. The Danish Prostatic Symptom Score-1 questionnaire was used to evaluate urinary symptoms. Principal component analysis (PCA) was used to evaluate the inter-correlations among various urinary symptoms. The PCA produced a grouping of 12 LUTS into three categories consisting of voiding, storage and incontinence symptoms. Post-micturition symptoms were related to voiding symptoms, but incontinence symptoms were separate from storage symptoms. In the analyses by age group, similar categorization was found at
Structured Sparse Principal Components Analysis With the TV-Elastic Net Penalty.
de Pierrefeu, Amicie; Lofstedt, Tommy; Hadj-Selem, Fouad; Dubois, Mathieu; Jardri, Renaud; Fovet, Thomas; Ciuciu, Philippe; Frouin, Vincent; Duchesnay, Edouard
2018-02-01
Principal component analysis (PCA) is an exploratory tool widely used in data analysis to uncover the dominant patterns of variability within a population. Despite its ability to represent a data set in a low-dimensional space, PCA's interpretability remains limited. Indeed, the components produced by PCA are often noisy or exhibit no visually meaningful patterns. Furthermore, the fact that the components are usually non-sparse may also impede interpretation, unless arbitrary thresholding is applied. However, in neuroimaging, it is essential to uncover clinically interpretable phenotypic markers that would account for the main variability in the brain images of a population. Recently, some alternatives to the standard PCA approach, such as sparse PCA (SPCA), have been proposed, their aim being to limit the density of the components. Nonetheless, sparsity alone does not entirely solve the interpretability problem in neuroimaging, since it may yield scattered and unstable components. We hypothesized that the incorporation of prior information regarding the structure of the data may lead to improved relevance and interpretability of brain patterns. We therefore present a simple extension of the popular PCA framework that adds structured sparsity penalties on the loading vectors in order to identify the few stable regions in the brain images that capture most of the variability. Such structured sparsity can be obtained by combining, e.g., ℓ1 and total variation (TV) penalties, where the TV regularization encodes information on the underlying structure of the data. This paper presents the structured SPCA (denoted SPCA-TV) optimization framework and its resolution. We demonstrate SPCA-TV's effectiveness and versatility on three different data sets. It can be applied to any kind of structured data, such as, e.g., N-dimensional array images or meshes of cortical surfaces. The gains of SPCA-TV over unstructured approaches (such as SPCA and ElasticNet PCA) or structured approach
Directory of Open Access Journals (Sweden)
Matrone Giulia C
2012-06-01
Background: In spite of the advances made in the design of dexterous anthropomorphic hand prostheses, these sophisticated devices still lack adequate control interfaces which could allow amputees to operate them in an intuitive and close-to-natural way. In this study, an anthropomorphic five-fingered robotic hand, actuated by six motors, was used as a prosthetic hand emulator to assess the feasibility of a control approach based on Principal Components Analysis (PCA), specifically conceived to address this problem. Since it was demonstrated elsewhere that the first two principal components (PCs) can describe the whole hand configuration space sufficiently well, the controller employed here inverted the PCA algorithm and made it possible to drive a multi-DoF hand by combining a two-differential-channel EMG input with these two PCs. Hence, the novelty of this approach lay in applying PCA to the challenging problem of best mapping the EMG inputs into the degrees of freedom (DoFs) of the prosthesis. Methods: A clinically viable two-DoF myoelectric controller, exploiting two differential channels, was developed, and twelve able-bodied participants, divided into two groups, volunteered to control the hand in simple grasp trials using forearm myoelectric signals. Task completion rates and times were measured. The first objective (assessed through one group of subjects) was to understand the effectiveness of the approach, i.e., whether it is possible to drive the hand in real time, with reasonable performance, in different grasps, also taking advantage of the direct visual feedback of the moving hand. The second objective (assessed through a different group) was to investigate the intuitiveness, and therefore to assess statistical differences in the performance throughout three consecutive days. Results: Subjects performed several grasp, transport and release trials with differently shaped objects, by operating the hand with the myoelectric
Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees.
Nye, Tom M W; Tang, Xiaoxian; Weyenberg, Grady; Yoshida, Ruriko
2017-12-01
Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample's structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the [Formula: see text]th principal component in Euclidean space: the locus of the weighted Fréchet mean of [Formula: see text] vertex trees when the weights vary over the [Formula: see text]-simplex. We establish some basic properties of these objects, in particular showing that they have dimension [Formula: see text], and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.
Magnetic Flux Leakage and Principal Component Analysis for metal loss approximation in a pipeline
International Nuclear Information System (INIS)
Ruiz, M; Mujica, L E; Quintero, M; Florez, J; Quintero, S
2015-01-01
Safety and reliability of hydrocarbon transportation pipelines represent a critical aspect for the Oil and Gas industry. Pipeline failures caused by corrosion and external agents, among others, can develop into leaks or even rupture, which can negatively impact the population, natural environment, infrastructure and economy. It is imperative to have accurate inspection tools traveling through the pipeline to diagnose its integrity. To this end, over the last few years, different techniques under the concept of structural health monitoring (SHM) have been in continuous development. This work is based on a hybrid methodology that combines the Magnetic Flux Leakage (MFL) and Principal Components Analysis (PCA) approaches. The MFL technique induces a magnetic field in the pipeline's walls. The data are recorded by sensors measuring the leakage magnetic field in segments with loss of metal, such as cracking and corrosion, among others. The data come from a pipeline with approximately 15 years of operation, which transports gas, has a diameter of 20 inches and a total length of 110 km (with several changes in the topography). PCA, in turn, is a well-known technique that compresses the information and extracts the most relevant information, facilitating the detection of damage in several structures. The goal of this work is to detect and localize critical loss of metal in a pipeline that is currently in operation.
Bellemans, Aurélie; Parente, Alessandro; Magin, Thierry
2018-04-01
The present work introduces a novel approach for obtaining reduced chemistry representations of large kinetic mechanisms in strong non-equilibrium conditions. The need for accurate reduced-order models arises from compression of large ab initio quantum chemistry databases for their use in fluid codes. The method presented in this paper builds on existing physics-based strategies and proposes a new approach based on the combination of a simple coarse grain model with Principal Component Analysis (PCA). The internal energy levels of the chemical species are regrouped in distinct energy groups with a uniform lumping technique. Following the philosophy of machine learning, PCA is applied on the training data provided by the coarse grain model to find an optimally reduced representation of the full kinetic mechanism. Compared to recently published complex lumping strategies, no expert judgment is required before the application of PCA. In this work, we will demonstrate the benefits of the combined approach, stressing its simplicity, reliability, and accuracy. The technique is demonstrated by reducing the complex quantum N2(¹Σg⁺)-N(⁴Su) database for studying molecular dissociation and excitation in strong non-equilibrium. Starting from detailed kinetics, an accurate reduced model is developed and used to study non-equilibrium properties of the N2(¹Σg⁺)-N(⁴Su) system in shock relaxation simulations.
Improved Principal Component Analysis for Anomaly Detection: Application to an Emergency Department
Harrou, Fouzi; Kadri, Farid; Chaabane, Sondés; Tahon, Christian; Sun, Ying
2015-01-01
Monitoring of production systems, such as those in hospitals, is essential for ensuring the best management and for maintaining the desired product quality. Detection of emergent abnormalities allows preemptive actions that can prevent more serious consequences. The principal component analysis (PCA)-based anomaly-detection approach has been used successfully for monitoring systems with highly correlated variables. However, conventional PCA-based detection indices, such as Hotelling's T² and the Q statistics, are ill suited to detecting small abnormalities because they use only information from the most recent observations. Other multivariate statistical metrics, such as the multivariate cumulative sum (MCUSUM) control scheme, are more suitable for detecting small anomalies. In this paper, a generic anomaly detection scheme based on PCA is proposed to monitor demands to an emergency department. In such a framework, the MCUSUM control chart is applied to the uncorrelated residuals obtained from the PCA model. The proposed PCA-based MCUSUM anomaly detection strategy is successfully applied to practical data collected from the database of the pediatric emergency department in the Lille Regional Hospital Centre, France. The detection results show that the proposed method is more effective than the conventional PCA-based anomaly-detection methods.
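The idea of charting PCA residuals with a cumulative-sum scheme can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the loading matrix, shift size and reference value `k` are arbitrary assumptions, the "uncorrelated residuals" are taken here to be the unit-variance scores on the discarded components, and the recursion follows a Crosier-style MCUSUM.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on in-control data X (rows = observations)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Eigen-decomposition of the correlation matrix, largest eigenvalue first.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep only the discarded (minor) directions: they span the residual space.
    return mean, std, eigvecs[:, n_components:], eigvals[n_components:]

def residual_scores(X, mean, std, minor_vecs, minor_vals):
    """Scores on the discarded components, scaled to unit variance."""
    Z = (X - mean) / std
    return (Z @ minor_vecs) / np.sqrt(minor_vals)

def mcusum(scores, k=0.5):
    """Crosier-style MCUSUM statistic for scores with identity covariance."""
    s = np.zeros(scores.shape[1])
    stats = np.empty(len(scores))
    for t, r in enumerate(scores):
        c = np.linalg.norm(s + r)
        # Shrink the cumulative vector toward zero by k; reset if it is small.
        s = np.zeros_like(s) if c <= k else (s + r) * (1.0 - k / c)
        stats[t] = np.linalg.norm(s)
    return stats

# Synthetic, highly correlated data: 5 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
loadings = np.array([[1.0, 0.8, 0.6, 0.2, 0.1],
                     [0.1, 0.3, 0.5, 0.9, 1.0]])

def simulate(n):
    return rng.normal(size=(n, 2)) @ loadings + 0.1 * rng.normal(size=(n, 5))

model = fit_pca(simulate(500), n_components=2)
clean = simulate(200)
shifted = simulate(200)
shifted[:, 0] += 0.2  # small sustained shift in the first variable (assumed)

clean_stat = mcusum(residual_scores(clean, *model))
shifted_stat = mcusum(residual_scores(shifted, *model))
```

Because the statistic accumulates evidence over successive observations, a shift that is small relative to the spread of the first variable still drives `shifted_stat` steadily upward, whereas `clean_stat` keeps returning to zero. A control limit would have to be calibrated on in-control data.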
State and group dynamics of world stock market by principal component analysis
Nobi, Ashadun; Lee, Jae Woo
2016-05-01
We study the dynamic interactions and structural changes of global financial indices in the years 1998-2012 by applying principal component analysis (PCA) to their cross-correlation coefficients. The variance explained by the first PC increases with time and shows a drastic change during the crisis. A sharp change in a PC coefficient implies a transition of market state, a situation which occurs frequently in the American and Asian indices. However, the European indices remain stable over time. Using the first two PC coefficients, we identify indices that are similar and more strongly correlated than the others. We observe that the European indices form a robust group over the observation period. The dynamics of the individual indices within the group increase in similarity with time, and the dynamics of indices are more similar during the crises. Furthermore, the group formation of indices changes position in two-dimensional spaces due to crises. Finally, after a financial crisis, the difference of PCs between the European and American indices narrows.
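A minimal sketch of how the variance explained by the first PC can be tracked over time windows is given here. The returns are synthetic, and the window length, number of indices and coupling strengths are illustrative assumptions rather than values from the study.

```python
import numpy as np

def first_pc_share(returns):
    """Fraction of total variance carried by the first PC of the correlation matrix."""
    corr = np.corrcoef(returns, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # ascending order
    return eigvals[-1] / eigvals.sum()

def rolling_first_pc_share(returns, window):
    """Evaluate the first-PC share on consecutive non-overlapping windows."""
    starts = range(0, len(returns) - window + 1, window)
    return np.array([first_pc_share(returns[s:s + window]) for s in starts])

rng = np.random.default_rng(0)
n_days, n_indices = 400, 8
market = rng.normal(size=(n_days, 1))        # common "market" factor
noise = rng.normal(size=(n_days, n_indices))  # index-specific fluctuations
# Hypothetical calm first half (weak coupling) and crisis second half (strong coupling).
coupling = np.where(np.arange(n_days)[:, None] < 200, 0.3, 2.0)
returns = coupling * market + noise

shares = rolling_first_pc_share(returns, window=100)
```

In this toy setting the first-PC share is markedly higher in the two "crisis" windows, where all indices are driven by the common factor.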
Contact- and distance-based principal component analysis of protein dynamics
Energy Technology Data Exchange (ETDEWEB)
Ernst, Matthias; Sittel, Florian; Stock, Gerhard, E-mail: stock@physik.uni-freiburg.de [Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg (Germany)
2015-12-28
To interpret molecular dynamics simulations of complex systems, systematic dimensionality reduction methods such as principal component analysis (PCA) represent a well-established and popular approach. Apart from Cartesian coordinates, internal coordinates, e.g., backbone dihedral angles or various kinds of distances, may be used as input data in a PCA. Adopting two well-known model problems, folding of villin headpiece and the functional dynamics of BPTI, a systematic study of PCA using distance-based measures is presented which employs distances between Cα-atoms as well as distances between inter-residue contacts including side chains. While this approach seems prohibitive for larger systems due to the quadratic scaling of the number of distances with the size of the molecule, it is shown that it is sufficient (and sometimes even better) to include only relatively few selected distances in the analysis. The quality of the PCA is assessed by considering the resolution of the resulting free energy landscape (to identify metastable conformational states and barriers) and the decay behavior of the corresponding autocorrelation functions (to test the time scale separation of the PCA). By comparing results obtained with distance-based, dihedral angle, and Cartesian coordinates, the study shows that the choice of input variables may drastically influence the outcome of a PCA.
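A PCA on a few selected inter-atomic distances can be sketched as follows. The trajectory is random synthetic data and the chosen atom pairs are arbitrary placeholders, not the contacts used in the study. The rotation check illustrates one reason internal coordinates are attractive: unlike Cartesian coordinates, distances need no prior structural alignment.

```python
import numpy as np

def distance_features(frames, pairs):
    """frames: (n_frames, n_atoms, 3) coordinates; pairs: list of (i, j) atom indices."""
    i, j = np.array(pairs).T
    return np.linalg.norm(frames[:, i] - frames[:, j], axis=-1)

def pca(features, n_components):
    """Project centered features on the leading principal axes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes; s**2 is proportional to the variances.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return centered @ vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(1)
frames = rng.normal(size=(50, 10, 3))       # synthetic "trajectory"
pairs = [(0, 5), (1, 7), (2, 9), (3, 8)]    # hypothetical selected distances

features = distance_features(frames, pairs)
projection, explained = pca(features, n_components=2)

# Rotating every frame leaves the distance features unchanged.
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
rotated_features = distance_features(frames @ rotation.T, pairs)
```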
Sustainability Assessment of the Natural Gas Industry in China Using Principal Component Analysis
Directory of Open Access Journals (Sweden)
Xiucheng Dong
2015-05-01
Under pressure toward carbon emission reduction and air protection, China has accelerated energy restructuring by greatly improving the supply and consumption of natural gas in recent years. However, several issues with the sustainable development of the natural gas industry in China still need in-depth discussion. Therefore, based on the fundamental ideas of sustainable development, industrial development theories and features of the natural gas industry, a sustainable development theory is proposed in this thesis. The theory consists of five parts: resource, market, enterprise, technology and policy. The five parts, which are mutually connected and reinforce one another, jointly push the gas industry's development forward. Furthermore, based on the theoretical structure, the Natural Gas Industry Sustainability Index in China is established and evaluated via the Principal Component Analysis (PCA) method. Finally, a conclusion is reached: the sustainability of the natural gas industry in China kept rising from 2008 to 2013, mainly benefiting from increasing supply and demand, the enhancement of enterprise profits, technological innovation, policy support and the optimization and reformation of the gas market.
CLASSIFICATION OF LIDAR DATA OVER BUILDING ROOFS USING K-MEANS AND PRINCIPAL COMPONENT ANALYSIS
Directory of Open Access Journals (Sweden)
Renato César dos Santos
Abstract: Classification is an important step in the extraction of geometric primitives from LiDAR data. Normally, it is applied to identify points sampled on geometric primitives of interest. In the literature there are several studies that have explored the use of eigenvalues to classify LiDAR points into different classes or structures, such as corner, edge, and plane. However, in some works the classes are defined considering an ideal geometry, which can be affected by inadequate sampling and/or by the presence of noise when using real data. To overcome this limitation, this paper proposes the use of metrics based on eigenvalues and the k-means method to carry out the classification. The concept of principal component analysis is used to obtain the eigenvalues and the derived metrics, while k-means is applied to cluster the roof points into two classes: edge and non-edge. To evaluate the proposed method, four test areas with different levels of complexity were selected. From the qualitative and quantitative analyses, it could be concluded that the proposed classification procedure gave satisfactory results, with completeness and correctness above 92% for the non-edge class, and between 61% and 98% for the edge class.
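Eigenvalue-derived metrics for a local point neighbourhood can be sketched as follows. The three metrics used here (linearity, planarity, surface variation) are common choices in the point-cloud literature and are an assumption, since the abstract does not list the metrics it uses. The neighbourhoods are idealised synthetic points, and the k-means clustering step is omitted.

```python
import numpy as np

def eigen_features(neighborhood):
    """neighborhood: (n_points, 3) array of points around a query point."""
    cov = np.cov(neighborhood, rowvar=False)
    # eigvalsh returns ascending eigenvalues, so l1 >= l2 >= l3.
    l3, l2, l1 = np.linalg.eigvalsh(cov)
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "surface_variation": l3 / (l1 + l2 + l3),
    }

grid = np.linspace(-1.0, 1.0, 7)
xx, yy = np.meshgrid(grid, grid)

# Idealised roof-plane patch: all points lie on a tilted plane.
plane = np.column_stack([xx.ravel(), yy.ravel(),
                         0.2 * xx.ravel() + 0.1 * yy.ravel()])
# Idealised line-like neighbourhood: all points lie on a straight line.
line = np.column_stack([grid, grid, np.zeros_like(grid)])

plane_features = eigen_features(plane)
line_features = eigen_features(line)
```

Feature vectors of this kind, computed for every roof point, could then be passed to a two-cluster k-means to separate edge from non-edge points. With real, noisy data the values are far less clear-cut than in these idealised cases.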
Directory of Open Access Journals (Sweden)
S. Prabhu
2014-06-01
A carbon nanotube (CNT) mixed grinding wheel has been used in the electrolytic in-process dressing (ELID) grinding process to analyze the surface characteristics of AISI D2 tool steel material. The CNT grinding wheel has excellent thermal conductivity and good mechanical properties, which are used to improve the surface finish of the workpiece. Multiobjective optimization by grey relational analysis coupled with principal component analysis has been used to optimize the process parameters of the ELID grinding process. Based on the Taguchi design of experiments, an L9 orthogonal array table was chosen for the experiments. The confirmation experiment verifies that the proposed grey-based Taguchi method is able to find the optimal process parameters with multiple quality characteristics of surface roughness and metal removal rate. Analysis of variance (ANOVA) has been used to verify and validate the model. An empirical model for the prediction of output parameters has been developed using regression analysis, and the results were compared with and without the CNT grinding wheel in the ELID grinding process.
Lost-in-Space Star Identification Using Planar Triangle Principal Component Analysis Algorithm
Directory of Open Access Journals (Sweden)
Fuqiang Zhou
2015-01-01
It is a challenging task for a star sensor to implement star identification and determine the attitude of a spacecraft in the lost-in-space mode. Several algorithms based on the triangle method have been proposed for star identification in this mode. However, these methods are time-consuming and require a large guide star catalog memory size, so their star identification performance needs improvement. To address these problems, a star identification algorithm using planar triangle principal component analysis is presented here. A star pattern is generated based on the planar triangle created by stars within the field of view of a star sensor and the projection of the triangle. Since a projection can determine an index for a unique triangle in the catalog, the adoption of the k-vector range search technique makes this algorithm very fast. In addition, a sharing star validation method is constructed to verify the identification results. Simulation results show that the proposed algorithm is more robust than the planar triangle and P-vector algorithms under the same conditions.
A Principal Component Analysis/Fuzzy Comprehensive Evaluation for Rockburst Potential in Kimberlite
Pu, Yuanyuan; Apel, Derek; Xu, Huawei
2018-02-01
Kimberlite is an igneous rock which sometimes bears diamonds. Most of the diamonds mined in the world today are found in kimberlite ores. Burst potential in kimberlite has not been investigated, because kimberlite is mostly mined using open-pit mining, which poses very little threat of rock bursting. However, as the mining depth keeps increasing, the mines convert to underground mining methods, which can pose a threat of rock bursting in kimberlite. This paper focuses on the burst potential of kimberlite at a diamond mine in northern Canada. A combined model using principal component analysis (PCA) and fuzzy comprehensive evaluation (FCE) is developed to process data from 12 different locations in kimberlite pipes. Based on the 12 calculated fuzzy evaluation vectors, 8 locations show a moderate burst potential, 2 locations show no burst potential, and 2 locations show strong and violent burst potential, respectively. Using statistical principles, a Mahalanobis distance is adopted to build a comprehensive fuzzy evaluation vector for the whole mine, and the final evaluation for burst potential is moderate, which is verified by an actual rockburst situation at the mine site.
Aerodynamic multi-objective integrated optimization based on principal component analysis
Directory of Open Access Journals (Sweden)
Jiangtao HUANG
2017-08-01
Based on an improved multi-objective particle swarm optimization (MOPSO) algorithm with principal component analysis (PCA) methodology, an efficient high-dimension multi-objective optimization method is proposed, which aims to improve the convergence of the Pareto front in multi-objective optimization design. The mathematical efficiency, the physical reasonableness and the reliability of PCA in dealing with redundant objectives are verified by the typical DTLZ5 test function and a multi-objective correlation analysis of a supercritical airfoil, and the proposed method is integrated into the aircraft multi-disciplinary design (AMDEsign) platform, which contains aerodynamics, stealth and structure weight analysis and optimization modules. The proposed method is then used for the multi-point integrated aerodynamic optimization of a wide-body passenger aircraft, in which the redundant objectives identified by PCA are transformed into optimization constraints, and several design methods are compared. The design results illustrate that the strategy used in this paper is sufficient and that the multi-point design requirements of the passenger aircraft are reached. The visualization level of the non-dominated Pareto set is improved by effectively reducing the dimension without losing the primary features of the problem.
Wu, Shuang; Wu, Hulin
2013-01-16
One of the fundamental problems in time course gene expression data analysis is to identify genes associated with a biological process or a particular stimulus of interest, like a treatment or virus infection. Most of the existing methods for this problem are designed for data with longitudinal replicates. But in reality, many time course gene experiments have no replicates or only have a small number of independent replicates. We focus on the case without replicates and propose a new method for identifying differentially expressed genes by incorporating the functional principal component analysis (FPCA) into a hypothesis testing framework. The data-driven eigenfunctions allow a flexible and parsimonious representation of time course gene expression trajectories, leaving more degrees of freedom for the inference compared to that using a prespecified basis. Moreover, the information of all genes is borrowed for individual gene inferences. The proposed approach turns out to be more powerful in identifying time course differentially expressed genes compared to the existing methods. The improved performance is demonstrated through simulation studies and a real data application to the Saccharomyces cerevisiae cell cycle data.
International Nuclear Information System (INIS)
Nigran, K.S.; Barber, D.C.
1985-01-01
A method is proposed for automatic analysis of dynamic radionuclide studies using the mathematical technique of principal-components factor analysis. This method is considered as a possible alternative to the widely used conventional manual regions-of-interest method. The method emphasises the importance of introducing a priori information into the analysis about the physiology of at least one of the functional structures in a study. Information is added by using suitable mathematical models to describe the underlying physiological processes. A single physiological factor is extracted representing the particular dynamic structure of interest. Two spaces, 'study space, S' and 'theory space, T', are defined in the formation of the concept of intersection of spaces. A one-dimensional intersection space is computed. An example from a dynamic 99mTc-DTPA kidney study is used to demonstrate the principle inherent in the method proposed. The method requires no correction for the blood background activity, which is necessary when processing by the manual method. The careful isolation of the kidney by means of a region of interest is not required. The method is therefore less prone to operator influence and can be automated.
Directory of Open Access Journals (Sweden)
Francisco Criado-Aldeanueva
2013-01-01
Two different paradigms of the Mediterranean Oscillation (MO) teleconnection index have been compared in this work: station-based definitions obtained by the difference of some climate variable between two selected points in the eastern and western basins (i.e., Algiers and Cairo, Gibraltar and Israel, Marseille and Jerusalem, or south France and Levantine basin) and the principal component (PC) approach in which the index is obtained as the time series of the first mode of normalised sea level pressure anomalies across the extended Mediterranean region. Interannual to interdecadal precipitation (P), evaporation (E), E-P, and net heat flux have been correlated with the different MO indices to compare their relative importance in the long-term variability of heat and freshwater budgets over the Mediterranean Sea. On an annual basis, the PC paradigm is the most effective tool to assess the effect of the large-scale atmospheric forcing in the Mediterranean Sea because the station-based indices exhibit a very poor correlation with all climatic variables and only influence a reduced fraction of the basin. In winter, the station-based indices greatly improve their ability to represent the atmospheric forcing and results are fairly independent of the paradigm used.
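The two index definitions can be contrasted in a short sketch on synthetic data. The pressure field below is an artificial east-west dipole with made-up amplitudes and noise levels, and the choice of the first and last grid points as "stations" is purely illustrative.

```python
import numpy as np

def pc_index(field):
    """field: (n_times, n_gridpoints) sea level pressure values."""
    anomalies = field - field.mean(axis=0)
    normalised = anomalies / anomalies.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(normalised, full_matrices=False)
    index = u[:, 0] * s[0]               # time series of the first mode
    pattern = vt[0]                      # spatial loading pattern
    explained = s[0]**2 / np.sum(s**2)   # variance fraction of the first mode
    return index, pattern, explained

rng = np.random.default_rng(2)
n_times = 300
signal = rng.normal(size=n_times)  # hypothetical large-scale oscillation

# Six western and six eastern grid points responding with opposite sign.
west = 1000.0 + 3.0 * signal[:, None] + rng.normal(size=(n_times, 6))
east = 1015.0 - 3.0 * signal[:, None] + rng.normal(size=(n_times, 6))
field = np.hstack([west, east])

index, pattern, explained = pc_index(field)

# Station-based alternative: difference between one western and one eastern point.
station_index = field[:, 0] - field[:, -1]

pc_corr = abs(np.corrcoef(index, signal)[0, 1])
station_corr = abs(np.corrcoef(station_index, signal)[0, 1])
```

The sign of a principal component is arbitrary, so correlations are compared in absolute value. In this toy field both indices follow the underlying signal, which says nothing about their relative skill on real data.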
Multi-point accelerometric detection and principal component analysis of heart sounds
International Nuclear Information System (INIS)
De Panfilis, S; Peccianti, M; Chiru, O M; Moroni, C; Vashkevich, V; Parisi, G; Cassone, R
2013-01-01
Heart sounds are a fundamental physiological variable that provides a unique insight into cardiac semiotics. However, a deterministic and unambiguous association between noises in cardiac dynamics is still far from being accomplished, owing to the many different overlapping events which contribute to the acoustic emission. The current computer-based capacities in terms of signal detection and processing allow one to move from standard cardiac auscultation, even in its improved forms like electronic stethoscopes or hi-tech phonocardiography, to the extraction of previously unexplored information on cardiac activity. In this report, we present new equipment for the detection of heart sounds, based on a set of accelerometric sensors placed in contact with the chest skin on the precordial area, which are able to measure simultaneously the vibration induced on the chest surface by the heart's mechanical activity. By utilizing advanced algorithms for the data treatment, such as wavelet decomposition and principal component analysis, we are able to condense the spatially extended acoustic information and to provide a synthetic representation of the heart activity. We applied our approach to 30 adults, mixed by gender, age and health status, and correlated our results with standard echocardiographic examinations. We obtained a 93% concordance rate with echocardiography between healthy and unhealthy hearts, including minor abnormalities such as mitral valve prolapse.
Dopico-García, M S; Fique, A; Guerra, L; Afonso, J M; Pereira, O; Valentão, P; Andrade, P B; Seabra, R M
2008-06-15
Phenolic profiles of 10 different varieties of red "Vinho Verde" grapes (Azal Tinto, Borraçal, Brancelho, Doçal, Espadeiro, Padeiro de Basto, Pedral, Rabo de ovelha, Verdelho and Vinhão), from Minho (Portugal), were studied. Nine flavonols, four phenolic acids, three flavan-3-ols, one stilbene and eight anthocyanins were determined. Malvidin-3-O-glucoside was the most abundant anthocyanin, while the main non-coloured compound was much more heterogeneous: catechin, epicatechin, myricetin-3-O-glucoside, quercetin-3-O-glucoside or syringetin-3-O-glucoside. Anthocyanin contents ranged from 42 to 97%. Principal component analysis (PCA) was applied to analyse the data and study the relations between the samples and their phenolic profiles. Anthocyanin profile proved to be a good marker to characterize the varieties even considering different origins and harvests. "Vinhão" grapes showed anthocyanin levels up to twenty-four times higher than the rest of the samples, with 97% of these compounds.
Pipeline monitoring using acoustic principal component analysis recognition with the Mel scale
International Nuclear Information System (INIS)
Wan, Chunfeng; Mita, Akira
2009-01-01
In modern cities, many important pipelines are laid underground. In order to prevent these lifeline infrastructures from accidental damage, monitoring systems are becoming indispensable. Third party activities were shown by recent reports to be a major cause of pipeline damage. Potential damage threat to the pipeline can be identified by detecting dangerous construction equipment nearby by studying the surrounding noise. Sound recognition technologies are used to identify them by their sounds, which can easily be captured by small sensors deployed along the pipelines. Pattern classification methods based on principal component analysis (PCA) were used to recognize the sounds from road cutters. In this paper, a Mel residual, i.e. the PCA residual in the Mel scale, is proposed to be the recognition feature. Determining if a captured sound belongs to a road cutter only requires checking how large its Mel residual is. Experiments were conducted and results showed that the proposed Mel-residual-based PCA recognition worked very well. The proposed Mel PCA residual recognition method will be very useful for pipeline monitoring systems to prevent accidental breakage and to ensure the safety of underground lifeline infrastructures
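The residual-based test described in this abstract can be sketched as follows. This is only a minimal illustration, assuming feature vectors (for example Mel-scale band energies) have already been computed; the synthetic data, the helper names `fit_pca_subspace` and `pca_residual`, and the choice of three components are hypothetical and not taken from the paper.

```python
import numpy as np

def fit_pca_subspace(X, n_components):
    """Return the mean and the top principal directions of the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_residual(x, mean, components):
    """Norm of the part of x that the PCA subspace cannot reconstruct."""
    centered = x - mean
    reconstruction = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - reconstruction))

rng = np.random.default_rng(0)
# Hypothetical training set: 200 feature vectors of the target sound class,
# lying close to a 3-dimensional subspace of a 20-dimensional feature space.
basis = rng.normal(size=(3, 20))
train = rng.normal(size=(200, 3)) @ basis + 0.01 * rng.normal(size=(200, 20))
mean, components = fit_pca_subspace(train, n_components=3)

same_class = rng.normal(size=3) @ basis + 0.01 * rng.normal(size=20)
other_sound = 3.0 * rng.normal(size=20)

res_same = pca_residual(same_class, mean, components)
res_other = pca_residual(other_sound, mean, components)
```

A sound would then be attributed to the target class when its residual falls below a threshold; how that threshold is chosen is not stated in the abstract and would have to be calibrated on real recordings.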
International Nuclear Information System (INIS)
Lu, Wei-Zhen; He, Hong-Di; Dong, Li-yun
2011-01-01
This study aims to evaluate the performance of two statistical methods, principal component analysis and cluster analysis, for the management of the air quality monitoring network of Hong Kong and the reduction of associated expenses. The specific objectives include: (i) to identify city areas with similar air pollution behavior; and (ii) to locate emission sources. The statistical methods were applied to the mass concentrations of sulphur dioxide (SO2), respirable suspended particulates (RSP) and nitrogen dioxide (NO2), collected by the monitoring network of Hong Kong from January 2001 to December 2007. The results demonstrate that, for each pollutant, the monitoring stations are grouped into different classes based on their air pollution behaviors. Monitoring stations located in nearby areas share the same air pollution characteristics, which suggests that the air quality monitoring system could be managed more effectively. Redundant equipment should be transferred to other monitoring stations to allow further enlargement of the monitored area. Additionally, the existence of different air pollution behaviors in the monitoring network is explained by the variability of wind directions across the region. The results imply that the air quality problem in Hong Kong is not only a local problem arising mainly from street-level pollution, but also a regional problem originating from the Pearl River Delta region. (author)
Directory of Open Access Journals (Sweden)
Ida Vajčnerová
2016-01-01
The objective of the paper is to explore the possibilities of evaluating the quality of a tourist destination by means of principal component analysis (PCA) and cluster analysis. In the paper, both types of analysis are compared on the basis of the results they provide. The aim is to identify the advantages and limits of both methods and to provide methodological suggestions for their further use in tourism research. The analysis is based on primary data from a customer satisfaction survey covering the key quality factors of a destination. The output of the two statistical methods is the creation of groups or clusters of quality factors that are similar in terms of respondents’ evaluations, in order to facilitate the evaluation of the quality of tourist destinations. The results show that both tested methods can be used. The paper was prepared within a wider research project aimed at developing a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.
An analysis of workers' morale in the coal mining industry using principal component analysis
Energy Technology Data Exchange (ETDEWEB)
Armstrong, J; La Court, C; Pearson, J M
1987-03-01
This paper looks at labour morale in the coal mining industry from 1967 to 1984. In particular it examines absenteeism, turnover and accidents over that period, as well as constructing an index of morale based on these variables. The data are taken from the North Nottinghamshire and South Yorkshire coal areas, and a comparison is made between these areas in the period leading up to the industrial action in 1984/85. The indices constructed indicate that morale, as measured by the first principal component, increased considerably during the years before the 1984 industrial dispute and that low morale was an unlikely reason for the dispute, although morale in South Yorkshire, a strike area, was lower than in North Nottinghamshire, largely a non-strike area. The steep rise in morale in both North Nottinghamshire and South Yorkshire follows closely the rise in unemployment nationally and may simply be an indication of conventional industrial relations assumptions that manifestations of negative worker attitudes are greatest when jobs are relatively plentiful, and considerably less so when jobs are scarce.
Support vector machine and principal component analysis for microarray data classification
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, microarray technology has taken on an important role in the diagnosis of cancer. Using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared with traditional techniques. Microarray data are characterized by small sample sizes but very high dimensionality. This poses a challenge for researchers seeking microarray data classification solutions that perform well in both accuracy and running time. This research proposed the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme can obtain 100% accuracy for the Ovarian and Lung Cancer data when linear and cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
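A PCA-then-SVM scheme evaluated with 5-fold cross validation might look like the sketch below. It assumes scikit-learn is available and uses synthetic "few samples, many features" data in place of real microarray sets; the sample size, the number of retained components and the standardization step are illustrative assumptions, not settings reported in the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_genes = 60, 2000            # hypothetical microarray-like shape
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :50] += 1.5                    # synthetic class signal in 50 features

# Putting PCA inside the pipeline means it is refitted on each training fold,
# so no information from the held-out fold leaks into the projection.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="linear"))
scores = cross_val_score(model, X, y, cv=5)
mean_accuracy = scores.mean()
```

A cubic kernel can be tried by swapping in `SVC(kernel="poly", degree=3)`; timing the pipeline with and without the PCA step would give a running-time comparison of the kind the abstract mentions.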
Directory of Open Access Journals (Sweden)
Dongdong Song
2015-01-01
To predict the service life of polystyrene (PS) under an aggressive environment, the nondimensional expression Z was established from a data set of multiple properties of PS by principal component analysis (PCA). In this study, PS specimens were exposed to the tropical environment on the Xisha Islands in China for two years. Chromatic aberration, gloss, tensile strength, elongation at break, flexural strength, and impact strength were tested to evaluate the aging behavior of PS. Based on the different needs of industries, each of the multiple properties could be used to evaluate the service life of PS. However, selecting a single performance variation will inevitably hide some information about the entire aging process. Therefore, finding a comprehensive measure representing the overall aging performance of PS can be highly significant. Herein, PCA was applied to obtain a specific property (Z) which can represent all properties of PS. The Z value of PS degradation showed a slight decrease for the initial two months of exposure, after which it increased rapidly over the next eight months. Subsequently, a slower increase in the Z value was observed. From the three different stages shown as the Z value increases, three stages have been identified for PS service life.
Recognition of grasp types through principal components of DWT based EMG features.
Kakoty, Nayan M; Hazarika, Shyamanta M
2011-01-01
With the advancement in machine learning and signal processing techniques, electromyogram (EMG) signals have increasingly gained importance in man-machine interaction. Multifingered hand prostheses using surface EMG for control have appeared on the market. However, EMG-based control is still rudimentary, being limited to a few hand postures based on a higher number of EMG channels. Moreover, control is non-intuitive, in the sense that the user is required to learn to associate remnant muscle actions with unrelated postures of the prosthesis. Herein lies the promise of a low-channel EMG-based grasp classification architecture for the development of an embedded intelligent prosthetic controller. This paper reports classification of six grasp types used during 70% of daily living activities based on two-channel forearm EMG. A feature vector is derived through principal component analysis of features based on discrete wavelet transform coefficients of the EMG signal. Classification is through a radial basis function kernel based support vector machine following preprocessing and maximum voluntary contraction normalization of the EMG signals. 10-fold cross validation is done. We have achieved an average recognition rate of 97.5%. © 2011 IEEE
Air Pollution and Human Development in Europe: A New Index Using Principal Component Analysis
Directory of Open Access Journals (Sweden)
Ana-Maria Săndică
2018-01-01
EU countries to measure human development incorporating the ambient PM2.5 concentration effect. Using a principal component analysis, we extract the information for 2010 and 2015 using the Real GDP/capita, the life expectancy at birth, tertiary educational attainment, ambient PM2.5 concentration, and the death rate due to exposure to ambient PM2.5 concentration for 29 European countries. This paper has two main results: first, it gives an overview of the relationship between human development and ambient PM2.5 concentration, and second, it provides a new quantitative measure, PHDI, which reshapes the concept of human development and the exposure to ambient PM2.5 concentration. Using rating classes, we defined thresholds for both HDI and PHDI values to group the countries in four categories. When comparing the migration matrix from 2010 to 2015 for HDI values, some countries improved the development indicator (Romania, Poland, Malta, Estonia, Cyprus), while no downgrades were observed. When comparing the transition matrix using the newly developed indicator, PHDI, the upgrades observed were for Denmark and Estonia, while some countries like Spain and Italy moved to a lower rating class due to ambient PM2.5 concentration.
Bisele, Maria; Bencsik, Martin; Lewis, Martin G C; Barnett, Cleveland T
2017-01-01
Assessment methods in human locomotion often involve the description of normalised graphical profiles and/or the extraction of discrete variables. Whilst useful, these approaches may not represent the full complexity of gait data. Multivariate statistical methods, such as Principal Component Analysis (PCA) and Discriminant Function Analysis (DFA), have been adopted since they have the potential to overcome these data handling issues. The aim of the current study was to develop and optimise a specific machine learning algorithm for processing human locomotion data. Twenty participants ran at a self-selected speed across a 15m runway in barefoot and shod conditions. Ground reaction forces (BW) and kinematics were measured at 1000 Hz and 100 Hz, respectively from which joint angles (°), joint moments (N.m.kg-1) and joint powers (W.kg-1) for the hip, knee and ankle joints were calculated in all three anatomical planes. Using PCA and DFA, power spectra of the kinematic and kinetic variables were used as a training database for the development of a machine learning algorithm. All possible combinations of 10 out of 20 participants were explored to find the iteration of individuals that would optimise the machine learning algorithm. The results showed that the algorithm was able to successfully predict whether a participant ran shod or barefoot in 93.5% of cases. To the authors' knowledge, this is the first study to optimise the development of a machine learning algorithm.
Determinants of Return on Assets in Romania: A Principal Component Analysis
Directory of Open Access Journals (Sweden)
Sorana Vatavu
2015-03-01
This paper examines the impact of capital structure, as well as its determinants, on the financial performance of Romanian companies listed on the Bucharest Stock Exchange. The analysis is based on cross-sectional regressions and factor analysis, and it covers a ten-year period (2003-2012). Return on assets (ROA) is the performance proxy, while the capital structure indicator is the debt ratio. Regression results indicate that Romanian companies register higher returns when they operate with limited borrowings. Among the capital structure determinants, tangibility and business risk have a negative impact on ROA, but the level of taxation has a positive effect, showing that companies manage their assets more efficiently during times of higher fiscal pressure. Performance is sustained by sales turnover, but not significantly influenced by high levels of liquidity. Periods of unstable economic conditions, reflected by high inflation rates and the current financial crisis, have a strong negative impact on corporate performance. Based on regression results, three factors were considered through the method of iterated principal component factors: the first one incorporates debt and size, as an indicator of consumption; the second one integrates the influence of tangibility and liquidity, marking the investment potential; and the third one is an indicator of assessed risk, integrating the volatility of earnings with the level of taxation. ROA is significantly influenced by these three factors, regardless of the regression method used. The consumption factor has a negative impact on performance, while the investment and risk variables positively influence ROA.
Magnetic Flux Leakage and Principal Component Analysis for metal loss approximation in a pipeline
Ruiz, M.; Mujica, L. E.; Quintero, M.; Florez, J.; Quintero, S.
2015-07-01
Safety and reliability of hydrocarbon transportation pipelines represent a critical aspect for the Oil and Gas industry. Pipeline failures caused by corrosion and external agents, among others, can lead to leaks or even rupture, which can negatively impact the population, natural environment, infrastructure and economy. It is imperative to have accurate inspection tools traveling through the pipeline to diagnose its integrity. To this end, over the last few years, different techniques under the concept of structural health monitoring (SHM) have continuously been in development. This work is based on a hybrid methodology that combines the Magnetic Flux Leakage (MFL) and Principal Component Analysis (PCA) approaches. The MFL technique induces a magnetic field in the pipeline's walls. The data are recorded by sensors measuring the leakage magnetic field in segments with loss of metal, such as cracking and corrosion, among others. The data come from a gas pipeline with approximately 15 years of operation, a diameter of 20 inches and a total length of 110 km (with several changes in the topography). PCA, in turn, is a well-known technique that compresses the information and extracts its most relevant part, facilitating the detection of damage in various structures. The goal of this work is to detect and localize critical loss of metal in a pipeline that is currently in operation.
International Nuclear Information System (INIS)
Kaistha, Nitin; Upadhyaya, Belle R.
2001-01-01
An integrated method for the detection and isolation of incipient faults in common field devices, such as sensors and actuators, using plant operational data is presented. The approach is based on the premise that data for normal operation lie on a surface and abnormal situations lead to deviations from the surface in a particular way. Statistically significant deviations from the surface result in the detection of faults, and the characteristic directions of deviations are used for isolation of one or more faults from the set of typical faults. Principal component analysis (PCA), a multivariate data-driven technique, is used to capture the relationships in the data and fit a hyperplane to the data. The fault direction for each of the scenarios is obtained using the singular value decomposition on the state and control function prediction errors, and fault isolation is then accomplished from projections on the fault directions. This approach is demonstrated for a simulated pressurized water reactor steam generator system and for a laboratory process control system under single device fault conditions. Enhanced fault isolation capability is also illustrated by incorporating realistic nonlinear terms in the PCA data matrix
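The detect-then-isolate idea in this abstract can be illustrated with a small sketch. It assumes a linear process, uses an empirical percentile as the detection threshold, and takes single-sensor biases as the candidate fault directions; the simulated five-sensor plant and all names below are hypothetical, and the paper's own directions come from a singular value decomposition of prediction errors rather than from this shortcut.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical normal-operation data: 5 sensors driven by 2 latent variables.
mixing = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
                   [2.0, -1.0, 0.0, 1.0, -2.0]])
normal = rng.normal(size=(500, 2)) @ mixing + 0.05 * rng.normal(size=(500, 5))

mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
P = Vt[:2].T                         # loadings spanning the fitted hyperplane
R = np.eye(5) - P @ P.T              # projector onto the residual space

# Squared prediction error (SPE) of the training data; the 99th percentile
# is used as a simple empirical detection threshold (an assumption).
spe_train = np.sum(((normal - mean) @ R) ** 2, axis=1)
threshold = np.percentile(spe_train, 99)

fault_dirs = np.eye(5)               # candidate faults: bias on one sensor

def isolate(x):
    """Return None if no fault is detected, else the index of the best-matching fault."""
    r = R @ (x - mean)
    if float(r @ r) <= threshold:
        return None
    # Compare the residual with each fault direction as seen in residual space.
    alignments = []
    for d in fault_dirs:
        d_res = R @ d
        alignments.append(abs(r @ d_res) / np.linalg.norm(d_res))
    return int(np.argmax(alignments))

clean = rng.normal(size=2) @ mixing
faulty = clean.copy()
faulty[3] += 2.0                     # inject a bias on sensor index 3

clean_result = isolate(clean)
faulty_result = isolate(faulty)
```

Projecting each candidate direction into the residual space before comparing matters because only the component of a fault that leaves the hyperplane is visible in the residual.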
Trajectory modeling of gestational weight: A functional principal component analysis approach.
Directory of Open Access Journals (Sweden)
Menglu Che
Suboptimal gestational weight gain (GWG), which is linked to increased risk of adverse outcomes for a pregnant woman and her infant, is prevalent. In the study of a large cohort of Canadian pregnant women, our goals are to estimate the individual weight growth trajectory using sparsely collected bodyweight data, and to identify the factors affecting the weight change during pregnancy, such as prepregnancy body mass index (BMI), dietary intakes and physical activity. The first goal was achieved through functional principal component analysis (FPCA) by conditional expectation. For the second goal, we used linear regression with the total weight gain as the response variable. The trajectory modeling through FPCA had a significantly smaller root mean square error (RMSE) and better adaptability than the classic nonlinear mixed-effect models, demonstrating a novel tool that can be used to facilitate real-time monitoring and interventions of GWG. Our regression analysis showed that prepregnancy BMI had a high predictive value for the weight changes during pregnancy, which agrees with the published weight gain guideline.
Ou, Hua-Se; Wei, Chao-Hai; Deng, Yang; Gao, Nai-Yun
2013-08-01
Qingcaosha Reservoir (QR) is the largest river-embedded reservoir in east China, which receives its source water from the Yangtze River (YR). The temporal and spatial variations in dissolved organic matter (DOM), chromophoric DOM (CDOM), nitrogen, phosphorus and phytoplankton biomass were investigated from June to September in 2012 and were integrated by principal component analysis (PCA). Three PCA factors were identified: (1) phytoplankton related factor 1, (2) total DOM related factor 2, and (3) eutrophication related factor 3. Factor 1 was a lake-type parameter which correlated with chlorophyll-a and protein-like CDOM (r = 0.793 and r = 0.831, respectively). Factor 2 was a river-type parameter which correlated with total DOC and humic-like CDOM (r = 0.668 and r = 0.726, respectively). Factor 3 correlated with total nitrogen and phosphorus (r = 0.864 and r = 0.621, respectively). The low flow speed, self-sedimentation and nutrient accumulation in QR resulted in increases in PCA factor 1 scores (phytoplankton biomass and derived CDOM) in the spatial scale, indicating a change of river-type water (YR) to lake-type water (QR). In summer, the water temperature variation induced a growth-bloom-decay process of phytoplankton combined with the increase of PCA factor 2 (humic-like CDOM) in the QR, which was absent in the YR.
International Nuclear Information System (INIS)
Stoiljkovic, Milovan M.; Pasti, Igor A.; Momcilovic, Milos D.; Savovic, Jelena J.; Pavlovic, Mirjana S.
2010-01-01
Enhancement of emission line intensities by induced oscillations of direct current (DC) arc plasma with continuous aerosol sample supply was investigated using multivariate statistics. Principal component analysis (PCA) was employed to evaluate enhancements of 34 atomic spectral lines belonging to 33 elements and 35 ionic spectral lines belonging to 23 elements. Correlation and classification of the elements were done not only by a single property such as the first ionization energy, but also by considering other relevant parameters. Special attention was paid to the influence of the oxide bond strength in an attempt to clarify/predict the enhancement effect. Energies of vaporization, atomization, and excitation were also considered in the analysis. In the case of atomic lines, the best correlation between the enhancements and first ionization energies was obtained as a negative correlation, with weak consistency in grouping of elements in score plots. Conversely, in the case of ionic lines, the best correlation of the enhancements with the sum of the first ionization energies and oxide bond energies was obtained as a positive correlation, with four distinctive groups of elements. The role of the gas-phase atom-oxide bond energy in the entire enhancement effect is underlined.
Wang, Kesheng; Liu, Ying; Ouedraogo, Youssoufou; Wang, Nianyang; Xie, Xin; Xu, Chun; Luo, Xingguang
2018-05-01
Early alcohol, tobacco and drug use prior to 18 years old are comorbid and correlated. This study included 6239 adults with major depressive disorder (MDD) in the past year and 72,010 controls from the combined data of the 2013 and 2014 National Survey on Drug Use and Health (NSDUH). To deal with the multicollinearity among 17 variables related to early alcohol, tobacco and drug use prior to 18 years old, we used principal component analysis (PCA) to infer PC scores and then used weighted multiple logistic regression analyses to estimate the associations of potential factors and PC scores with MDD. The odds ratios (ORs) with 95% confidence intervals (CIs) were estimated. The overall prevalence of MDD was 6.7%. The first four PCs could explain 57% of the total variance. Weighted multiple logistic regression showed that PC 1 (a measure of psychotherapeutic drugs and illicit drugs other than marijuana use), PC 2 (a measure of cocaine and hallucinogens), PC 3 (a measure of early alcohol, cigarettes, and marijuana use), and PC 4 (a measure of cigar, smokeless tobacco use and illicit drugs use) revealed significant associations with MDD (OR = 1.12, 95% CI = 1.08-1.16, OR = 1.08, 95% CI = 1.04-1.12, OR = 1.13, 95% CI = 1.07-1.18, and OR = 1.15, 95% CI = 1.09-1.21, respectively). In conclusion, PCA can be used to reduce the indicators in complex survey data. Early alcohol, tobacco and drug use prior to 18 years old were found to be associated with increased odds of adult MDD. Copyright © 2018 Elsevier Ltd. All rights reserved.
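For readers unfamiliar with how an odds ratio and its interval follow from a logistic regression coefficient, the standard Wald construction is sketched below. The coefficient and standard error are hypothetical values chosen only to give numbers of a similar order to those quoted; they are not taken from the study, and a survey-weighted analysis would obtain its standard errors from the sampling design.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic coefficient.

    beta is the coefficient of one predictor (here, a PC score), se is its
    standard error, and z = 1.96 gives an approximate 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error for one PC score.
odds_ratio, lower, upper = odds_ratio_ci(beta=0.113, se=0.018)
```

An interval whose lower bound stays above 1 corresponds to a significantly increased odds per unit increase in the PC score.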
Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša
2013-10-01
Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing of the SRD applicability contained the results reported during one of the proficiency tests (PTs) organized by the EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, the SRD was also tested as a discriminant method alternative to existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores--the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using sevenfold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. The SRD technique is general in nature, i.e., it can
Directory of Open Access Journals (Sweden)
Jinlu Sheng
2016-07-01
To effectively extract the typical features of the bearing, a new method that combines local mean decomposition Shannon entropy with an improved kernel principal component analysis model was proposed. First, features are extracted by a time–frequency domain method, local mean decomposition, and Shannon entropy is used to process the separated product functions so as to obtain the original features. However, the extracted features still contain superfluous information, so a nonlinear multi-feature processing technique, kernel principal component analysis, is introduced to fuse them. The kernel principal component analysis is improved by a weight factor. The extracted characteristic features were input to a Morlet wavelet kernel support vector machine to obtain the bearing running state classification model, and the bearing running state was thereby identified. Both test cases and actual cases were analyzed.
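The kernel PCA step can be sketched in plain NumPy as below. This is standard kernel PCA with a Gaussian (RBF) kernel, not the weight-factor variant proposed in the paper, whose details the abstract does not give; the random feature matrix, the `gamma` value and the function name `rbf_kernel_pca` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Project the rows of X onto the leading kernel principal components."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    # Center the kernel matrix, i.e. center the data in feature space.
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Projections of the training points: eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0)), eigvals

rng = np.random.default_rng(2)
features = rng.normal(size=(40, 6))   # hypothetical entropy feature vectors
scores, eigvals = rbf_kernel_pca(features, n_components=2, gamma=0.1)
```

The fused, lower-dimensional `scores` would then be passed to a classifier; the Morlet wavelet kernel SVM named in the abstract is not reproduced here.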
Medina, José M.; Díaz, José A.
2013-05-01
We have applied principal component analysis to examine trial-to-trial variability of reflectances of automotive coatings that contain effect pigments. Reflectance databases were measured from different color batch productions using a multi-angle spectrophotometer. A method to classify the principal components was used based on the eigenvalue spectra. It was found that the eigenvalue spectra follow distinct power laws and depend on the detection angle. The scaling exponent provided an estimation of the correlation between reflectances and it was higher near specular reflection, suggesting a contribution from the deposition of effect pigments. Our findings indicate that principal component analysis can be a useful tool to classify different sources of spectral variability in color engineering.
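One simple way to estimate a scaling exponent from an eigenvalue spectrum is a straight-line fit in log-log coordinates, sketched below. The abstract does not say how the exponent was estimated, so the least-squares fit and the synthetic spectrum are assumptions made purely for illustration.

```python
import numpy as np

def scaling_exponent(eigenvalues):
    """Exponent a in eigenvalue ~ rank**(-a), from a log-log least-squares fit."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    ranks = np.arange(1, eigenvalues.size + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(eigenvalues), 1)
    return -slope

# Synthetic spectrum that follows an exact power law with exponent 1.5.
ranks = np.arange(1, 21, dtype=float)
alpha = scaling_exponent(ranks ** -1.5)
```

With measured reflectance data, the eigenvalues would come from the covariance matrix at each detection angle, and the fitted exponents could then be compared across angles.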
Directory of Open Access Journals (Sweden)
Grahić Jasmin
2013-01-01
In order to analyze morphological characteristics of locally cultivated common bean landraces from Bosnia and Herzegovina (B&H), thirteen quantitative and qualitative traits of 40 P. vulgaris accessions, collected from four geographical regions (Northwest B&H, Northeast B&H, Central B&H and Sarajevo) and maintained at the Gene bank of the Faculty of Agriculture and Food Sciences in Sarajevo, were examined. Principal component analysis (PCA) showed that the proportion of variance retained in the first two principal components was 54.35%. The first principal component had high contributing factor loadings from seed width, seed height and seed weight, whilst the second principal component had high contributing factor loadings from the traits seeds per pod and pod length. The PCA plot, based on the first two principal components, displayed a high level of variability among the analyzed material. The discriminant analysis of principal components (DAPC) created 3 discriminant functions (DF), whereby the first two discriminant functions accounted for 90.4% of the variance retained. Based on the retained DFs, DAPC provided group membership probabilities which showed that 70% of the accessions examined were correctly classified between the geographically defined groups. Based on the taxonomic distance, the 40 common bean accessions analyzed in this study formed two major clusters, whereas two accessions, Acc304 and Acc307, did not group into either of them. Acc360 and Acc362, as well as Acc324 and Acc371, displayed a high level of similarity and are probably the same landrace. The present diversity of Bosnia and Herzegovina’s common bean landraces could be useful in future breeding programs.
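A figure such as "proportion of variance retained in the first two principal components" is obtained from the eigenvalues of the trait correlation matrix. The sketch below shows the calculation on random stand-in data with the same shape as the study (40 accessions, 13 traits); it assumes every trait is numerically coded and standardized, which the abstract does not confirm, and the values it produces are unrelated to the reported 54.35%.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component of X."""
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0, ddof=1)         # standardize each trait
    singular_values = np.linalg.svd(Xs, compute_uv=False)
    variances = singular_values ** 2 / (X.shape[0] - 1)
    return variances / variances.sum()

rng = np.random.default_rng(3)
traits = rng.normal(size=(40, 13))           # hypothetical accessions x traits
traits[:, 1] += 0.8 * traits[:, 0]           # make two traits correlated
ratio = explained_variance_ratio(traits)
first_two = ratio[:2].sum()                  # variance retained by PC1 and PC2
```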
Krefis, Anne Caroline; Schwarz, Norbert Georg; Nkrumah, Bernard; Acquah, Samuel; Loag, Wibke; Sarpong, Nimako; Adu-Sarkodie, Yaw; Ranft, Ulrich; May, Jürgen
2010-07-13
The socioeconomic and sociodemographic situations are important components for the design and assessment of malaria control measures. In malaria endemic areas, however, valid classification of socioeconomic factors is difficult due to the lack of standardized tax and income data. The objective of this study was to quantify household socioeconomic levels by applying principal component analysis (PCA) to a set of indicator variables and to use the resulting classification scheme for the multivariate analysis of children <15 years of age presenting with and without malaria to an outpatient department of a rural hospital. In total, 1,496 children presenting to the hospital were examined for malaria parasites and interviewed with a standardized questionnaire. The information from eleven indicators of the family's housing situation was reduced by PCA to a socioeconomic score, which was then classified into three socioeconomic status groups (poor, average and rich). Their influence on malaria occurrence was analysed together with malaria risk co-factors, such as sex, parents' educational and ethnic background, number of children living in a household, applied malaria protection measures, place of residence and age of the child and the mother. The multivariate regression analysis demonstrated that the proportion of children with malaria decreased with increasing socioeconomic status as classified by PCA (p<0.05). Other independent factors for malaria risk were the use of malaria protection measures (p<0.05), the place of residence (p<0.05), and the age of the child (p<0.05). The socioeconomic situation is significantly associated with malaria even in holoendemic rural areas where economic differences are not very pronounced. Valid classification of the socioeconomic level is crucial so that it can be considered as a confounder in intervention trials and in the planning of malaria control measures.
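Reducing a set of household indicators to one score and then to three status groups can be sketched as follows. The simulated binary indicators, the use of the first principal component as the score, and the tercile cut-offs are all assumptions for illustration; the abstract does not state how the three classes were delimited.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical data: 300 households, 11 binary housing indicators that all
# depend on one latent wealth variable plus independent noise.
wealth = rng.normal(size=300)
indicators = (wealth[:, None] + rng.normal(size=(300, 11)) > 0).astype(float)

# Standardize the indicators and take the first principal component as the score.
Z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
score = Z @ Vt[0]

# The sign of a principal component is arbitrary; orient the score so that
# households owning more of the listed assets score higher.
asset_count = indicators.sum(axis=1)
if np.corrcoef(score, asset_count)[0, 1] < 0:
    score = -score

# Split into three groups at the terciles: 0 = poor, 1 = average, 2 = rich.
cut_low, cut_high = np.quantile(score, [1 / 3, 2 / 3])
status = np.digitize(score, [cut_low, cut_high])
```

The resulting `status` variable can then enter a regression model as a categorical covariate alongside the other risk factors.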
Li, Wu; Hu, Bing; Wang, Ming-wei
2014-12-01
In this paper, a terahertz time-domain spectroscopy (THz-TDS) identification model for borneol was established based on principal component analysis (PCA) and support vector machines (SVM). Borneol is a common Chinese medicinal agent; because it comes from different sources that are easily confused in pharmaceutical and trade channels, a rapid, simple and accurate detection and identification method is needed. Identifying borneol quickly, efficiently and correctly is important for assuring product quality, protecting consumers' rights, and supporting the production and trade of borneol. Terahertz time-domain spectroscopy is a new spectroscopic approach that characterizes materials using terahertz pulses. The terahertz absorption spectra of blumea camphor, borneol camphor and synthetic borneol were measured in the range of 0.2 to 2 THz with transmission THz-TDS. 2D (PC1 × PC2) and 3D (PC1 × PC2 × PC3) PCA score plots of the three kinds of borneol samples were obtained, and both clustered the three kinds of borneol well. The score matrix of the first 10 principal components (PCs) was used in place of the original spectral data; 60 samples of the three kinds of borneol were used for training, and 60 unknown samples were then identified. Four SVM models with different kernel functions were set up in this way. The results show that the SVM with the radial basis function (RBF) kernel identified and classified the three kinds of borneol with 100% accuracy, and this SVM was selected to establish the borneol identification model. In addition, in the noisy case, the classification accuracies of the four SVM kernel functions were all above 85%, which indicates that SVM has strong generalization ability. This study shows that PCA combined with SVM gives good classification and identification of borneol from terahertz spectra, and provides a new method for species
2L-PCA: a two-level principal component analyzer for quantitative drug design and its applications.
Du, Qi-Shi; Wang, Shu-Qing; Xie, Neng-Zhong; Wang, Qing-Yan; Huang, Ri-Bo; Chou, Kuo-Chen
2017-09-19
A two-level principal component predictor (2L-PCA) was proposed based on the principal component analysis (PCA) approach. It can be used to quantitatively analyze various compounds and peptides about their functions or potentials to become useful drugs. One level is for dealing with the physicochemical properties of drug molecules, while the other level is for dealing with their structural fragments. The predictor has the self-learning and feedback features to automatically improve its accuracy. It is anticipated that 2L-PCA will become a very useful tool for timely providing various useful clues during the process of drug development.
International Nuclear Information System (INIS)
Vogt, Frank
2013-01-01
Graphical abstract: Analysis task: determine the albumin (= protein) concentration in microalgae cells as a function of the cells’ nutrient availability. Left panel: the predicted albumin concentrations obtained by conventional principal component regression feature low reproducibility and are partially higher than the concentrations of the algae in which the albumin is contained. Right panel: augmenting an incomplete PCR calibration with additional expert information derives reasonable albumin concentrations, which now reveal a significant dependency on the algae's nutrient situation. Highlights: • Make quantitative analyses of compounds embedded in largely unknown chemical matrices robust. • Improved concentration prediction with originally insufficient calibration models. • Chemometric approach for incorporating expertise from other fields and/or researchers. • Ensure chemical, biological, or medicinal meaningfulness of quantitative analyses. Abstract: Incomplete calibrations are encountered in many applications and hamper chemometric data analyses. Such situations arise when target analytes are embedded in a chemically complex matrix from which calibration concentrations cannot be determined with reasonable effort. In other cases, the samples’ chemical composition may fluctuate in an unpredictable way and thus cannot be comprehensively covered by calibration samples. The reason calibration models fail is the regression principle itself, which seeks to explain measured data optimally in terms of the (potentially incomplete) calibration model but does not consider chemical meaningfulness. This study presents a novel chemometric approach which is based on experimentally feasible calibrations, i.e. concentration series of the target analytes outside the chemical matrix (‘ex situ calibration’). The inherent lack of information is then compensated by incorporating additional knowledge in the form of regression constraints. Any outside knowledge can be
International Nuclear Information System (INIS)
Seo, In Yong; Ha, Bok Nam; Lee, Sung Woo; Shin, Chang Hoon; Kim, Seong Jun
2010-01-01
In nuclear power plants (NPPs), periodic sensor calibrations are required to assure that sensors are operating correctly. By checking the sensor's operating status at every fuel outage, faulty sensors may remain undetected for periods of up to 24 months. Moreover, typically, only a few faulty sensors are found to be calibrated. For the safe operation of NPP and the reduction of unnecessary calibration, on-line instrument calibration monitoring is needed. In this study, principal component based auto-associative support vector regression (PCSVR) using response surface methodology (RSM) is proposed for the sensor signal validation of NPPs. This paper describes the design of a PCSVR-based sensor validation system for a power generation system. RSM is employed to determine the optimal values of SVR hyperparameters and is compared to the genetic algorithm (GA). The proposed PCSVR model is confirmed with the actual plant data of Kori Nuclear Power Plant Unit 3 and is compared with the Auto-Associative support vector regression (AASVR) and the auto-associative neural network (AANN) model. The auto-sensitivity of AASVR is improved by around six times by using a PCA, resulting in good detection of sensor drift. Compared to AANN, accuracy and cross-sensitivity are better while the auto-sensitivity is almost the same. Meanwhile, the proposed RSM for the optimization of the PCSVR algorithm performs even better in terms of accuracy, auto-sensitivity, and averaged maximum error, except in averaged RMS error, and this method is much more time efficient compared to the conventional GA method
Hearty, Aine P; Gibney, Michael J
2009-02-01
The aims of the present study were to examine and compare dietary patterns in adults using cluster and factor analyses and to examine the format of the dietary variables on the pattern solutions (i.e. expressed as grams/day (g/d) of each food group or as the percentage contribution to total energy intake). Food intake data were derived from the North/South Ireland Food Consumption Survey 1997-9, which was a randomised cross-sectional study of 7 d recorded food and nutrient intakes of a representative sample of 1379 Irish adults aged 18-64 years. Cluster analysis was performed using the k-means algorithm and principal component analysis (PCA) was used to extract dietary factors. Food data were reduced to thirty-three food groups. For cluster analysis, the most suitable format of the food-group variable was found to be the percentage contribution to energy intake, which produced six clusters: 'Traditional Irish'; 'Continental'; 'Unhealthy foods'; 'Light-meal foods & low-fat milk'; 'Healthy foods'; 'Wholemeal bread & desserts'. For PCA, food groups in the format of g/d were found to be the most suitable format, and this revealed four dietary patterns: 'Unhealthy foods & high alcohol'; 'Traditional Irish'; 'Healthy foods'; 'Sweet convenience foods & low alcohol'. In summary, cluster and PCA identified similar dietary patterns when presented with the same dataset. However, the two dietary pattern methods required a different format of the food-group variable, and the most appropriate format of the input variable should be considered in future studies.
Investigation of inversion polymorphisms in the human genome using principal components analysis.
Ma, Jianzhong; Amos, Christopher I
2012-01-01
Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.
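The idea of running PCA locally inside an inversion region can be illustrated with simulated genotypes. This is a hedged sketch, not the authors' pipeline: the allele frequencies, sample size and the assumption that SNPs are independent given haplotype orientation are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ind, n_snp = 200, 100

# Hypothetical allele frequencies on standard vs inverted haplotypes.
freq_std = rng.uniform(0.05, 0.3, n_snp)
freq_inv = rng.uniform(0.7, 0.95, n_snp)

# Each individual carries two haplotypes; True marks an inverted one.
is_inv = rng.random((n_ind, 2)) < 0.4
freq = np.where(is_inv[:, :, None], freq_inv, freq_std)   # (n_ind, 2, n_snp)
haplotypes = rng.random((n_ind, 2, n_snp)) < freq

# Unphased genotype calls (0, 1 or 2 copies of the allele) per SNP.
genotypes = haplotypes.sum(axis=1).astype(float)
inv_copies = is_inv.sum(axis=1)        # true inversion genotype: 0, 1 or 2

# Local PCA: centre the SNP columns and take the first principal component.
centered = genotypes - genotypes.mean(axis=0)
U, S, _ = np.linalg.svd(centered, full_matrices=False)
pc1 = U[:, 0] * S[0]

# Mean PC1 score per true inversion genotype.
means = [pc1[inv_copies == k].mean() for k in range(3)]
```

In this toy setting the heterozygotes fall between the two homozygote groups along PC1, which mirrors the "two populations and their 1:1 admixture" picture in the abstract.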
International Nuclear Information System (INIS)
Constantinou, M.A.; Papakonstantinou, E.; Benaki, D.; Spraul, M.; Shulpis, K.; Koupparis, M.A.; Mikros, E.
2004-01-01
NMR spectra of extracted blood spots were used to investigate the possibility of developing a new mass-screening method for the diagnosis of inborn errors of metabolism (IEM). Blood spots were collected on filter papers from normal, phenylketonuric (PKU) and maple syrup urine disease (MSUD) subjects and their Carr-Purcell-Meiboom-Gill (CPMG) 1H NMR spectra were acquired. The spectra were reduced to a number of spectral descriptors and principal component analysis (PCA) was performed. The scores plot showed that PKU and MSUD samples were well discriminated from the main cluster of points
Guo, H; Wang, T; Louie, P K K
2004-06-01
Receptor-oriented source apportionment models are often used to identify sources of ambient air pollutants and to estimate source contributions to air pollutant concentrations. In this study, a PCA/APCS model was applied to the data on non-methane hydrocarbons (NMHCs) measured from January to December 2001 at two sampling sites: Tsuen Wan (TW) and Central & Western (CW) Toxic Air Pollutants Monitoring Stations in Hong Kong. This multivariate method enables the identification of major air pollution sources along with the quantitative apportionment of each source to pollutant species. The PCA analysis identified four major pollution sources at TW site and five major sources at CW site. The extracted pollution sources included vehicular internal engine combustion with unburned fuel emissions, use of solvent particularly paints, liquefied petroleum gas (LPG) or natural gas leakage, and industrial, commercial and domestic sources such as solvents, decoration, fuel combustion, chemical factories and power plants. The results of APCS receptor model indicated that 39% and 48% of the total NMHCs mass concentrations measured at CW and TW were originated from vehicle emissions, respectively. 32% and 36.4% of the total NMHCs were emitted from the use of solvent and 11% and 19.4% were apportioned to the LPG or natural gas leakage, respectively. 5.2% and 9% of the total NMHCs mass concentrations were attributed to other industrial, commercial and domestic sources, respectively. It was also found that vehicle emissions and LPG or natural gas leakage were the main sources of C(3)-C(5) alkanes and C(3)-C(5) alkenes while aromatics were predominantly released from paints. Comparison of source contributions to ambient NMHCs at the two sites indicated that the contribution of LPG or natural gas at CW site was almost twice that at TW site. 
High correlation coefficients (R(2) > 0.8) between the measured and predicted values suggested that the PCA/APCS model was applicable for estimation of sources of NMHCs in ambient air.
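A PCA/APCS-style apportionment of the kind described in this abstract can be sketched roughly as below. The data are synthetic (two hypothetical sources, four species), the components are left unrotated for brevity although a rotation is often applied in practice, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples = 120

# Two hypothetical sources with fixed species profiles (rows) and
# time-varying strengths; concentrations are their mixture plus noise.
profiles = np.array([[5.0, 3.0, 0.5, 0.2],
                     [0.3, 0.5, 4.0, 2.0]])
strength = rng.uniform(0.5, 3.0, size=(n_samples, 2))
conc = strength @ profiles + rng.normal(scale=0.05, size=(n_samples, 4))

# PCA on standardised concentrations, keeping two components.
mean, std = conc.mean(axis=0), conc.std(axis=0)
Z = (conc - mean) / std
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
axes = Vt[:2].T

# Score of an artificial sample with zero concentration for every species;
# subtracting it turns relative scores into absolute ones (APCS).
zero_score = ((0.0 - mean) / std) @ axes
apcs = Z @ axes - zero_score

# Regress the total concentration on the APCS. Each coefficient times its
# APCS column is that component's estimated mass contribution.
total = conc.sum(axis=1)
design = np.column_stack([np.ones(n_samples), apcs])
coef, *_ = np.linalg.lstsq(design, total, rcond=None)
predicted = design @ coef
r2 = 1.0 - np.sum((total - predicted) ** 2) / np.sum((total - total.mean()) ** 2)
```

Here `r2` plays the role of the measured-versus-predicted agreement that the abstract reports as R(2) > 0.8.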
International Nuclear Information System (INIS)
Feng Junting; Xu Mi; Wang Guizeng
2003-01-01
A fault diagnosis method based on principal component analysis is studied. A library of fault characteristic directions for abnormalities in fifteen parameters is built by simulation for the main coolant pump of a nuclear power station. The measured data are analyzed, and the results show that the method is feasible for the fault diagnosis system of the main coolant pump in a nuclear power station
Directory of Open Access Journals (Sweden)
Yunfeng Dong
2017-01-01
The weighted sum and genetic algorithm-based hybrid method (WSGA-based HM), which has been applied to multiobjective orbit optimizations, is negatively influenced by human factors through the artificial choice of the weight coefficients in the weighted sum method and by the slow convergence of the GA. To address these two problems, a cluster and principal component analysis-based optimization method (CPC-based OM) is proposed, in which many candidate orbits are gradually randomly generated until the optimal orbit is obtained using a data mining method, that is, cluster analysis based on principal components. Then, a second cluster analysis of the orbital elements is introduced into CPC-based OM to improve the convergence, developing a novel double cluster and principal component analysis-based optimization method (DCPC-based OM). In DCPC-based OM, the cluster analysis based on principal components has the advantage of reducing the human influences, and the cluster analysis based on six orbital elements can reduce the search space to effectively accelerate convergence. The test results from a multiobjective numerical benchmark function and the orbit design results of an Earth observation satellite show that DCPC-based OM converges more efficiently than WSGA-based HM. DCPC-based OM also, to some degree, reduces the influence of human factors present in WSGA-based HM.
CSIR Research Space (South Africa)
Nel, W
2009-10-01
... to estimate the 3-D position of scatterers as a by-product of the analysis. The technique is based on principal component analysis of accurate scatterer range histories and is shown only in simulation. Future research should focus on practical application.
Ueki, Kenta; Iwamori, Hikaru
2017-10-01
In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.
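The per-component and cumulative variance shares quoted in this abstract are the kind of quantity obtained from the singular values of the centred data matrix. The sketch below uses a simulated samples-by-elements table (an assumption; no data from the study are reproduced) to show the calculation. The reported shares of 59%, 20% and 6% sum to 85%, so the stated total of 86% presumably reflects rounding of the individual values.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Hypothetical stand-in for a geochemical table: three latent processes of
# decreasing strength mixed into twelve measured elements, plus noise.
rng = np.random.default_rng(3)
latent = rng.normal(size=(200, 3)) * np.array([3.0, 1.8, 1.0])
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + rng.normal(scale=0.3, size=(200, 12))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)     # cumulative[2] is the share of PC1 to PC3
```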
Model Reduction via Principal Component Analysis and Markov Chain Monte Carlo (MCMC) Methods
Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.
2011-12-01
Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on the parameterization and the problem at hand. This makes inverse estimation and uncertainty quantification very challenging, especially for problems in two- or three-dimensional spatial domains. Model reduction techniques have the potential to mitigate the curse of dimensionality by reducing the total number of unknowns while still describing the complex subsurface systems adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA methods with geometric sampling (referred to as 'Method 1'), (2) PCA methods with MCMC sampling (referred to as 'Method 2'), and (3) PCA methods with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters, as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with added noise and inverted them under two different situations: (1) the noisy data and the covariance matrix for PCA are consistent (referred to as the unbiased case), and (2) the noisy data and the covariance matrix are inconsistent (referred to as the biased case). In the unbiased case, comparison between the analytical solutions and the inversion results shows that all three methods provide good estimates of the true values and that Method 1 is computationally more efficient. In terms of uncertainty quantification, Method 1 performs poorly because of the relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to inclusion
Health status monitoring for ICU patients based on locally weighted principal component analysis.
Ding, Yangyang; Ma, Xin; Wang, Youqing
2018-03-01
Intelligent status monitoring for critically ill patients can help medical staff quickly discover and assess changes in disease and then choose an appropriate treatment strategy. However, the general-purpose monitoring models now widely used adapt poorly to changes in the status of intensive care unit (ICU) patients because of their fixed pattern, so a more robust, efficient and fast monitoring model should be developed for the individual patient. A data-driven learning approach combining locally weighted projection regression (LWPR) and principal component analysis (PCA) is proposed for the first time and applied to monitor the nonlinear process of patients' health status in the ICU. LWPR is used to approximate the complex nonlinear process with local linear models, to which PCA can then be applied for status monitoring, and finally a global weighted statistic is obtained for detecting possible abnormalities. Moreover, some improved versions, such as LWPR-MPCA and LWPR-JPCA, are developed, which also have superior performance. Eighteen subjects were selected from the Physiobank's Multi-parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, and two vital signs of each subject were chosen for online monitoring. The proposed method was compared with several existing methods, including traditional PCA, partial least squares (PLS), just-in-time learning combined with modified PCA (L-PCA), and kernel PCA (KPCA). The experimental results demonstrated that the mean fault detection rate (FDR) of PCA can be improved by 41.7% after adding LWPR. The mean FDR of LWPR-MPCA was 8.3% higher than that of the latest reported method, L-PCA. Meanwhile, LWPR required less training time than the other methods, especially KPCA. LWPR is introduced into ICU patient monitoring for the first time and achieves the best monitoring performance, including adaptability to changes in patient status, sensitivity for abnormality detection, fast learning speed and low computational complexity. The algorithm
Kernel Principal Component Analysis for dimensionality reduction in fMRI-based diagnosis of ADHD
Directory of Open Access Journals (Sweden)
Gagan S Sidhu
2012-11-01
This article explores various preprocessing tools that select/create features to help a learner produce a classifier that can use fMRI data to effectively discriminate Attention-Deficit Hyperactivity Disorder (ADHD) patients from healthy controls. We consider four different learning tasks: predicting either two (ADHD vs control) or three classes (ADHD-1 vs ADHD-3 vs control), where each uses either the imaging data only, or the phenotypic and imaging data. After averaging, BOLD-signal normalization, and masking of the fMRI images, we considered applying the Fast Fourier Transform (FFT), possibly followed by some Principal Component Analysis (PCA) variant (over time: PCA-t; over space and time: PCA-st) or the kernelized variant, kPCA-st, to produce inputs to a learner, to determine which learned classifier performs the best, or at least better than the baseline of 64.2%, which is the proportion of the majority class (here, controls). In the two-class setting, PCA-t and PCA-st did not perform statistically better than baseline, whereas FFT and kPCA-st did (FFT, 68.4%; kPCA-st, 70.3%); when combined with the phenotypic data, which by itself produces 72.9% accuracy, all methods performed statistically better than the baseline, but none did better than using the phenotypic data. In the three-class setting, neither the PCA variants nor the phenotypic data classifiers performed statistically better than the baseline. We next used the FFT output as input to the PCA variants. In the two-class setting, the PCA variants performed statistically better than the baseline using either the FFTed waveforms only (FFT+PCA-t, 69.6%; FFT+PCA-st, 69.3%; FFT+kPCA-st, 68.7%), or combining them with the phenotypic data (FFT+PCA-t, 70.6%; FFT+PCA-st, 70.6%; kPCA-st, 76%). In both settings, combining FFT+kPCA-st’s features with the phenotypic data was better than using only the phenotypic data, with the result in the two-class setting being statistically better.
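The "FFT followed by PCA" preprocessing step can be illustrated on toy signals. The sketch assumes one time series per subject and ordinary linear PCA over the magnitude spectrum; the real study works on masked fMRI volumes and also uses spatial and kernelized variants, none of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n_subjects, n_timepoints = 40, 128
t = np.arange(n_timepoints)

# Hypothetical signals: the first 20 subjects carry a slow oscillation and
# the last 20 a faster one, each with a random phase, plus noise.
cycles = np.where(np.arange(n_subjects) < 20, 4, 12)
phase = rng.uniform(0, 2 * np.pi, n_subjects)
signals = np.sin(2 * np.pi * cycles[:, None] * t / n_timepoints + phase[:, None])
signals += rng.normal(scale=0.5, size=signals.shape)

# Step 1: FFT magnitude per subject, which discards the arbitrary phase.
spectra = np.abs(np.fft.rfft(signals, axis=1))

# Step 2: PCA over the spectra; keep two component scores as features.
centered = spectra - spectra.mean(axis=0)
U, S, _ = np.linalg.svd(centered, full_matrices=False)
features = U[:, :2] * S[:2]
```

The two-column `features` array is what would be handed to a classifier, optionally concatenated with phenotypic covariates.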
Richman, Michael B.; Gong, Xiaofeng
1999-06-01
When applying eigenanalysis, one decision analysts make is the determination of what magnitude an eigenvector coefficient (e.g., principal component (PC) loading) must achieve to be considered as physically important. Such coefficients can be displayed on maps or in a time series or tables to gain a fuller understanding of a large array of multivariate data. Previously, such a decision on what value of loading designates a useful signal (hereafter called the loading 'cutoff') for each eigenvector has been purely subjective. The importance of selecting such a cutoff is apparent since those loading elements in the range of zero to the cutoff are ignored in the interpretation and naming of PCs since only the absolute values of loadings greater than the cutoff are physically analyzed. This research sets out to objectify the problem of best identifying the cutoff by application of matching between known correlation/covariance structures and their corresponding eigenpatterns, as this cutoff point (known as the hyperplane width) is varied. A Monte Carlo framework is used to resample at five sample sizes. Fourteen different hyperplane cutoff widths are tested, bootstrap resampled 50 times to obtain stable results. The key findings are that the location of an optimal hyperplane cutoff width (one which maximized the information content match between the eigenvector and the parent dispersion matrix from which it was derived) is a well-behaved unimodal function. On an individual eigenvector, this enables the unique determination of a hyperplane cutoff value to be used to separate those loadings that best reflect the relationships from those that do not. The effects of sample size on the matching accuracy are dramatic as the values for all solutions (i.e., unrotated, rotated) rose steadily from 25 through 250 observations and then weakly thereafter. The specific matching coefficients are useful to assess the penalties incurred when one analyzes eigenvector coefficients of a
Improving Cross-Day EEG-Based Emotion Classification Using Robust Principal Component Analysis
Directory of Open Access Journals (Sweden)
Yuan-Pin Lin
2017-07-01
Constructing a robust emotion-aware analytical framework using non-invasively recorded electroencephalogram (EEG) signals has attracted intensive attention. However, in moving from laboratory-oriented proof-of-concept studies toward real-world applications, researchers now face an ecological challenge: the EEG patterns recorded in real life change substantially across days (i.e., day-to-day variability), arguably making a pre-defined predictive model vulnerable to the EEG signals of a separate day. The present work addressed how to mitigate the inter-day EEG variability of emotional responses in an attempt to facilitate cross-day emotion classification, which has received less attention in the literature. This study proposed a robust principal component analysis (RPCA)-based signal filtering strategy and validated its neurophysiological validity and machine-learning practicability on a binary emotion classification task (happiness vs. sadness) using a five-day EEG dataset of 12 subjects recorded while they participated in a music-listening task. The empirical results showed that the RPCA-decomposed sparse signals (RPCA-S) enabled filtering off the background EEG activity that contributed more to the inter-day variability, and predominantly captured the EEG oscillations of emotional responses that behaved relatively consistently across days. Through applying a realistic add-day-in classification validation scheme, the RPCA-S progressively exploited more informative features (from 12.67 ± 5.99 to 20.83 ± 7.18) and improved the cross-day binary emotion-classification accuracy (from 58.31 ± 12.33% to 64.03 ± 8.40%) as the model was trained on the EEG signals from one to four recording days and tested against one unseen subsequent day. The original EEG features (prior to RPCA processing) neither achieved the cross-day classification (the accuracy was around chance level) nor replicated the encouraging improvement, owing to the inter-day EEG variability. This result
Directory of Open Access Journals (Sweden)
Shuai Sun
2014-06-01
Due to the scarcity of resources of Ziziphi spinosae semen (ZSS), many inferior goods and even adulterants are commonly found in medicine markets. To strengthen quality control, the HPLC fingerprint common pattern established in this paper shows three main bioactive compounds simultaneously in one chromatogram. Principal component analysis based on DAD signals could discriminate adulterants and inferior samples. Principal component analysis indicated that all samples could be regrouped into two main clusters according to the first principal component (PC1, redefined as Vicenin II) and the second principal component (PC2, redefined as zizyphusine). PC1 and PC2 together explained 91.42% of the variance. The content of zizyphusine fluctuated more than that of spinosin, and this was also confirmed by the HPTLC result. Samples with a low content of jujubosides and two common adulterants could not be used interchangeably with authenticated ones in the clinic, while one reference standard extract could substitute for the crude drug in pharmaceutical production. Giving special consideration to the well-known bioactive saponins, which give a low response by end absorption, a fast and cheap HPTLC method for quality control of ZSS was developed, and the result obtained agreed well with that of the HPLC analysis. Samples having fingerprints similar to the HPTLC common pattern targeting saponins could be regarded as authenticated ones. This work provides a faster and cheaper way for quality control of ZSS and lays a foundation for establishing a more effective quality control method for ZSS. Keywords: Adulterant, Common pattern, Principal component analysis, Quality control, Ziziphi spinosae semen
Batis, Carolina; Mendez, Michelle A; Gordon-Larsen, Penny; Sotres-Alvarez, Daniela; Adair, Linda; Popkin, Barry
2016-02-01
We examined the association between dietary patterns and diabetes using the strengths of two methods: principal component analysis (PCA) to identify the eating patterns of the population and reduced rank regression (RRR) to derive a pattern that explains the variation in glycated Hb (HbA1c), homeostasis model assessment of insulin resistance (HOMA-IR) and fasting glucose. We measured diet over a 3 d period with 24 h recalls and a household food inventory in 2006 and used it to derive PCA and RRR dietary patterns. The outcomes were measured in 2009. Adults (n 4316) from the China Health and Nutrition Survey. The adjusted odds ratio for diabetes prevalence (HbA1c≥6·5 %), comparing the highest dietary pattern score quartile with the lowest, was 1·26 (95 % CI 0·76, 2·08) for a modern high-wheat pattern (PCA; wheat products, fruits, eggs, milk, instant noodles and frozen dumplings), 0·76 (95 % CI 0·49, 1·17) for a traditional southern pattern (PCA; rice, meat, poultry and fish) and 2·37 (95 % CI 1·56, 3·60) for the pattern derived with RRR. By comparing the dietary pattern structures of RRR and PCA, we found that the RRR pattern was also behaviourally meaningful. It combined the deleterious effects of the modern high-wheat pattern (high intakes of wheat buns and breads, deep-fried wheat and soya milk) with the deleterious effects of consuming the opposite of the traditional southern pattern (low intakes of rice, poultry and game, fish and seafood). Our findings suggest that using both PCA and RRR provided useful insights when studying the association of dietary patterns with diabetes.
Osis, Sean T; Hettinga, Blayne A; Leitch, Jessica; Ferber, Reed
2014-08-22
As 3-dimensional (3D) motion-capture for clinical gait analysis continues to evolve, new methods must be developed to improve the detection of gait cycle events based on kinematic data. Recently, the application of principal component analysis (PCA) to gait data has shown promise in detecting important biomechanical features. Therefore, the purpose of this study was to define a new foot strike detection method for a continuum of striking techniques, by applying PCA to joint angle waveforms. In accordance with Newtonian mechanics, it was hypothesized that transient features in the sagittal-plane accelerations of the lower extremity would be linked with the impulsive application of force to the foot at foot strike. Kinematic and kinetic data from treadmill running were selected for 154 subjects, from a database of gait biomechanics. Ankle, knee and hip sagittal plane angular acceleration kinematic curves were chained together to form a row input to a PCA matrix. A linear polynomial was calculated based on PCA scores, and a 10-fold cross-validation was performed to evaluate prediction accuracy against gold-standard foot strike as determined by a 10 N rise in the vertical ground reaction force. Results show 89-94% of all predicted foot strikes were within 4 frames (20 ms) of the gold standard with the largest error being 28 ms. It is concluded that this new foot strike detection is an improvement on existing methods and can be applied regardless of whether the runner exhibits a rearfoot, midfoot, or forefoot strike pattern.
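The pipeline outlined in this abstract (chain the joint curves into one row per trial, compute PCA scores, fit a linear model, evaluate with 10-fold cross-validation) might look roughly like the sketch below. Everything about the data is an assumption: the curves are synthetic Gaussian transients, the target is a simulated timing offset in frames, and three components are kept arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(5)
n_trials, n_frames = 150, 100
frames = np.arange(n_frames)
true_offset = rng.uniform(-4, 4, n_trials)   # simulated event timing (frames)

def joint_curve(centre, offset):
    # Hypothetical transient whose timing moves with the event.
    return np.exp(-((frames - centre - offset[:, None]) ** 2) / 200.0)

# Chain three joint curves (stand-ins for ankle, knee, hip) into one row.
rows = np.hstack([joint_curve(c, true_offset) for c in (40, 50, 60)])
rows += rng.normal(scale=0.02, size=rows.shape)

def fit_predict(train_X, train_y, test_X, n_pc=3):
    """Fit PCA and a linear model on the training fold, predict the test fold."""
    mean = train_X.mean(axis=0)
    _, _, Vt = np.linalg.svd(train_X - mean, full_matrices=False)
    axes = Vt[:n_pc].T
    A = np.column_stack([np.ones(len(train_X)), (train_X - mean) @ axes])
    coef, *_ = np.linalg.lstsq(A, train_y, rcond=None)
    B = np.column_stack([np.ones(len(test_X)), (test_X - mean) @ axes])
    return B @ coef

# 10-fold cross-validation over shuffled trial indices.
folds = np.array_split(rng.permutation(n_trials), 10)
errors = []
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n_trials), test_idx)
    pred = fit_predict(rows[train_idx], true_offset[train_idx], rows[test_idx])
    errors.append(np.abs(pred - true_offset[test_idx]))
mean_abs_error = np.concatenate(errors).mean()
```

Fitting the PCA inside each fold, rather than once on all trials, keeps the held-out trials from influencing the components used to predict them.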
Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi
2012-07-01
The present study deals with daily total ozone concentration time series over four metro cities of India, namely Kolkata, Mumbai, Chennai, and New Delhi, in the multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration is suitable for principal component analysis. Subsequently, by introducing a rotated component matrix for the principal components, the predictors suitable for generating an artificial neural network (ANN) for daily total ozone prediction are identified. The multicollinearity is removed in this way. Models of ANN in the form of multilayer perceptron trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics like Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.
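For readers unfamiliar with the Kaiser-Meyer-Olkin (KMO) measure mentioned here, the overall statistic compares squared correlations with squared partial correlations: KMO = Σ r_ij² / (Σ r_ij² + Σ p_ij²), with both sums taken over i ≠ j. Values closer to 1 are usually read as indicating data better suited to PCA. The sketch below computes it on synthetic data; the function name `kmo` and the data are assumptions for illustration, not the study's ozone series.

```python
import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure for an (observations x variables) array."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    p2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + p2)

# Synthetic data (assumption): four variables sharing one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
data = common + rng.normal(size=(500, 4))
value = kmo(data)
print("KMO:", value)
```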
Grimbergen, M C M; van Swol, C F P; Kendall, C; Verdaasdonk, R M; Stone, N; Bosch, J L H R
2010-01-01
The overall quality of Raman spectra in the near-infrared region, where biological samples are often studied, has benefited from various improvements to optical instrumentation over the past decade. However, obtaining ample spectral quality for analysis is still challenging due to device requirements and short integration times required for (in vivo) clinical applications of Raman spectroscopy. Multivariate analytical methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are routinely applied to Raman spectral datasets to develop classification models. Data compression is necessary prior to discriminant analysis to prevent or decrease the degree of over-fitting. The logical threshold for the selection of principal components (PCs) to be used in discriminant analysis is likely to be at a point before the PCs begin to introduce equivalent signal and noise and, hence, include no additional value. Assessment of the signal-to-noise ratio (SNR) at a certain peak or over a specific spectral region will depend on the sample measured. Therefore, the mean SNR over the whole spectral region (SNR(msr)) is determined in the original spectrum as well as for spectra reconstructed from an increasing number of principal components. This paper introduces a method of assessing the influence of signal and noise from individual PC loads and indicates a method of selection of PCs for LDA. To evaluate this method, two data sets with different SNRs were used. The sets were obtained with the same Raman system and the same measurement parameters on bladder tissue collected during white light cystoscopy (set A) and fluorescence-guided cystoscopy (set B). This method shows that the mean SNR over the spectral range in the original Raman spectra of these two data sets is related to the signal and noise contribution of principal component loads. The difference in mean SNR over the spectral range can also be appreciated since fewer principal components can
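The idea of tracking spectral quality as spectra are reconstructed from an increasing number of PCs can be illustrated with a small sketch. It uses synthetic two-band "spectra" whose noise-free form is known, and a simple SNR proxy (RMS of the clean signal over RMS of the deviation from it). Both are assumptions made for the example; the paper's SNR(msr) estimator works on measured spectra and is not reproduced here.

```python
import numpy as np

def reconstruct(X, k):
    """Rebuild X from its mean and its first k principal components."""
    mean = X.mean(axis=0)
    u, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean + (u[:, :k] * s[:k]) @ vt[:k]

def mean_snr(estimate, reference):
    """Assumed proxy: RMS of the reference over RMS of the deviation from it."""
    return np.sqrt(np.mean(reference ** 2) / np.mean((estimate - reference) ** 2))

# Synthetic "spectra": two Gaussian bands with varying intensities plus noise.
rng = np.random.default_rng(0)
channels = np.arange(200)
band_a = np.exp(-0.5 * ((channels - 60) / 8.0) ** 2)
band_b = np.exp(-0.5 * ((channels - 140) / 8.0) ** 2)
clean = (rng.uniform(1.0, 3.0, size=(60, 1)) * band_a
         + rng.uniform(0.5, 2.0, size=(60, 1)) * band_b)
noisy = clean + 0.05 * rng.normal(size=clean.shape)

# SNR proxy of the reconstruction as more PCs are included.
snr_by_k = {k: mean_snr(reconstruct(noisy, k), clean) for k in range(1, 11)}
best_k = max(snr_by_k, key=snr_by_k.get)
print("number of PCs with the highest SNR proxy:", best_k)
```

In this toy setting the later components carry mostly noise, so adding them lowers the proxy again; that is the kind of turning point the abstract describes as a logical threshold for PC selection.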
Segil, Jacob L; Weir, Richard F ff
2014-03-01
An ideal myoelectric prosthetic hand should have the ability to continuously morph between any posture like an anatomical hand. This paper describes the design and validation of a morphing myoelectric hand controller based on principal component analysis of human grasping. The controller commands continuously morphing hand postures including functional grasps using between two and four surface electromyography (EMG) electrode pairs. Four unique maps were developed to transform the EMG control signals in the principal component domain. A preliminary validation experiment was performed by 10 nonamputee subjects to determine the map with highest performance. The subjects used the myoelectric controller to morph a virtual hand between functional grasps in a series of randomized trials. The number of joints controlled accurately was evaluated to characterize the performance of each map. Additional metrics were studied including completion rate, time to completion, and path efficiency. The highest performing map controlled over 13 out of 15 joints accurately.
Penttilä, Antti; Martikainen, Julia; Gritsevich, Maria; Muinonen, Karri
2018-02-01
Meteorite samples are measured with the University of Helsinki integrating-sphere UV-vis-NIR spectrometer. The resulting spectra of 30 meteorites are compared with selected spectra from the NASA Planetary Data System meteorite spectra database. The spectral measurements are transformed with the principal component analysis, and it is shown that different meteorite types can be distinguished from the transformed data. The motivation is to improve the link between asteroid spectral observations and meteorite spectral measurements.
Czech Academy of Sciences Publication Activity Database
Alán, Lukáš; Špaček, Tomáš; Ježek, Petr
2016-01-01
Vol. 45, No. 5 (2016), pp. 443-461 ISSN 0175-7571 R&D Projects: GA ČR(CZ) GA13-02033S; GA MŠk(CZ) ED1.1.00/02.0109 Institutional support: RVO:67985823 Keywords: 3D object segmentation * Delaunay algorithm * principal component analysis * 3D super-resolution microscopy * nucleoids * mitochondrial DNA replication Subject RIV: BO - Biophysics Impact factor: 1.472, year: 2016
Byrne, Patrick; Runkel, Robert L.; Walton-Day, Katherine
2017-01-01
Combining the synoptic mass balance approach with principal components analysis (PCA) can be an effective method for discretising the chemistry of inflows and source areas in watersheds where contamination is diffuse in nature and/or complicated by groundwater interactions. This paper presents a field-scale study in which synoptic sampling and PCA are employed in a mineralized watershed (Lion Creek, Colorado, USA) under low flow conditions to (i) quantify the impacts of mining activity on str...
Directory of Open Access Journals (Sweden)
S. Saravanan
2012-07-01
Power system planning starts with electric load (demand) forecasting. Accurate electricity load forecasting is one of the most important challenges in managing electricity supply and demand, since electricity demand is volatile in nature; electricity cannot be stored and has to be consumed instantly. This study deals with electricity consumption in India and forecasts demand for a period of 19 years, from 2012 to 2030. The eleven input variables used are amount of CO2 emission, population, per capita GDP, per capita gross national income, gross domestic savings, industry, consumer price index, wholesale price index, imports, exports and per capita power consumption. A new methodology based on artificial neural networks (ANNs) using principal components is also used. Data from 29 years were used for training and data from 10 years for testing the ANNs. Comparisons were made with multiple linear regression (based on the original data and on the principal components) and with ANNs using the original data as input variables. The results show that the use of ANNs with principal components (PC) is more effective.
International Nuclear Information System (INIS)
Spurr, R.; Natraj, V.; Lerot, C.; Van Roozendael, M.; Loyola, D.
2013-01-01
Principal Component Analysis (PCA) is a promising tool for enhancing radiative transfer (RT) performance. When applied to binned optical property data sets, PCA exploits redundancy in the optical data, and restricts the number of full multiple-scatter calculations to those optical states corresponding to the most important principal components, yet still maintaining high accuracy in the radiance approximations. We show that the entire PCA RT enhancement process is analytically differentiable with respect to any atmospheric or surface parameter, thus allowing for accurate and fast approximations of Jacobian matrices, in addition to radiances. This linearization greatly extends the power and scope of the PCA method to many remote sensing retrieval applications and sensitivity studies. In the first example, we examine accuracy for PCA-derived UV-backscatter radiance and Jacobian fields over a 290–340 nm window. In a second application, we show that performance for UV-based total ozone column retrieval is considerably improved without compromising the accuracy. -- Highlights: •Principal Component Analysis (PCA) of spectrally-binned atmospheric optical properties. •PCA-based accelerated radiative transfer with 2-stream model for fast multiple-scatter. •Atmospheric and surface property linearization of this PCA performance enhancement. •Accuracy of PCA enhancement for radiances and bulk-property Jacobians, 290–340 nm. •Application of PCA speed enhancement to UV backscatter total ozone retrievals
Matsen IV, Frederick A.; Evans, Steven N.
2013-01-01
Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415
DEFF Research Database (Denmark)
Giese, E.B.; Ding, M.; Dalstra, M.
2003-01-01
embalmed mandibular condyles; the angle of the first principal direction and the axis of the specimen, expressing the orientation of the trabeculae, ranged from 10 degrees to 87 degrees. Morphological parameters were determined by a method based on Archimedes' principle and by micro-CT scanning......-like trabeculae, and not with more or thicker trabeculae. The trabecular orientation was most determinative (about 50%) in explaining stiffness, strength, and failure energy. The amount of bone was second most determinative and increased the explained variance to about 72%. These results suggest that trabecular...
International Nuclear Information System (INIS)
Li, Yanfu; Liu, Hongli; Ma, Ziji
2016-01-01
Rail corrugation dynamic measurement techniques are critical to guarantee transport security and guide rail maintenance. During the inspection process, low-frequency trends caused by rail fluctuation are usually superimposed on rail corrugation and seriously affect the assessment of rail maintenance quality. In order to extract and remove the nonlinear and non-stationary trends from original mixed signals, a hybrid model based on ensemble empirical mode decomposition (EEMD) and modified principal component analysis (MPCA) is proposed in this paper. Compared with the existing de-trending methods based on EMD, this method first considers that the low-frequency intrinsic mode functions (IMFs) thought to be underlying trend components may contain some unrelated components, such as white noise and the low-frequency signal itself, and proposes to use PCA to accurately extract the pure trends from the IMFs containing multiple components. In addition, because the energy contribution ratio between trends and mixed signals is not known a priori, and the principal components (PCs) decomposed by PCA are arranged in order of decreasing energy without considering frequency distribution, the proposed method modifies traditional PCA and selects only the relevant low-frequency PCs to reconstruct the trends, based on the zero-crossing numbers (ZCN) of each PC. Extensive tests are presented to illustrate the effectiveness of the proposed method. The results show the proposed EEMD-PCA-ZCN is an effective tool for trend extraction of rail corrugation measured dynamically.
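The selection rule described in this abstract, keeping only PCs whose zero-crossing number (ZCN) is low, can be sketched as below. The EEMD step is omitted: the input here is a synthetic two-channel mixture of a slow and a fast sinusoid rather than real IMFs, and the threshold `max_crossings` and the function names are assumptions chosen for the illustration, not values or names from the paper.

```python
import numpy as np

def zero_crossings(series):
    """Count sign changes between consecutive samples."""
    negative = np.signbit(series)
    return int(np.count_nonzero(negative[1:] != negative[:-1]))

def low_frequency_trend(X, max_crossings):
    """PCA on X (samples x channels); rebuild X from low-ZCN components only."""
    mean = X.mean(axis=0)
    u, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = u * s  # each column is one PC expressed as a time series
    counts = [zero_crossings(scores[:, j]) for j in range(scores.shape[1])]
    keep = [j for j, c in enumerate(counts) if c <= max_crossings]
    return mean + scores[:, keep] @ vt[keep], counts

# Synthetic stand-in for the IMF matrix: a slow and a fast oscillation,
# mixed into two channels by a rotation.
t = np.arange(1000) / 1000.0
slow = 3.0 * np.sin(2 * np.pi * 2 * t + 0.3)
fast = 1.0 * np.sin(2 * np.pi * 40 * t + 0.3)
angle = 0.6
rotation = np.array([[np.cos(angle), np.sin(angle)],
                     [-np.sin(angle), np.cos(angle)]])
X = np.column_stack([slow, fast]) @ rotation

trend, counts = low_frequency_trend(X, max_crossings=10)
print("zero crossings per PC:", counts)
```

Selecting by ZCN rather than by explained energy is the point of the modification: a PC can carry a large share of the energy and still be a high-frequency component, so the energy ordering alone does not identify the trend.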
International Nuclear Information System (INIS)
Gu Haiwei; Pan Zhengzheng; Xi Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel
2011-01-01
Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, 1H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology.
Gu, Haiwei; Pan, Zhengzheng; Xi, Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel
2011-02-07
Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, (1)H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology. Copyright © 2010 Elsevier B.V. All rights reserved.
Takegami, Shigehiko; Ueyama, Keita; Konishi, Atsuko; Kitade, Tatsuya
2018-06-06
The lipid fluidity of various lipid nanoemulsions (LNEs) without and with flutamide (FT) and containing one of two neutral lipids, one of four phosphatidylcholines as a surfactant, and sodium palmitate as a cosurfactant was investigated by the combination of 1H nuclear magnetic resonance (NMR) spectroscopy and principal component analysis (PCA). In the 1H NMR spectra, the peaks from the methylene groups of the neutral lipids and surfactants for all LNE preparations showed downfield shifts with increasing temperature from 20 to 60 °C. PCA was applied to the 1H NMR spectral data obtained for the LNEs. The PCA resulted in a model in which the first two principal components (PCs) extracted 88% of the total spectral variation; the first PC (PC-1) axis and second PC (PC-2) axis accounted for 73 and 15%, respectively, of the total spectral variation. The Score-1 values for PC-1 plotted against temperature revealed the existence of two clusters, which were defined by the neutral lipid of the LNE preparations. Meanwhile, the Score-2 values decreased with rising temperature and reflected the increase in lipid fluidity of each LNE preparation, consistent with fluorescence anisotropy measurements. In addition, the changes of Score-2 values with temperature for LNE preparations with FT were smaller than those for LNE preparations without FT. This indicates that FT encapsulated in LNE particles markedly suppressed the increase in lipid fluidity of LNE particles with rising temperature. Thus, PCA of 1H NMR spectra will become a powerful tool to analyze the lipid fluidity of lipid nanoparticles.
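Figures such as "PC-1 and PC-2 accounted for 73 and 15% of the total spectral variation" are explained-variance ratios: each squared singular value of the mean-centered data matrix divided by the sum of all squared singular values. A minimal sketch on synthetic data follows; the matrix below is an assumption for illustration and has nothing to do with the NMR spectra of the study.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Synthetic data (assumption): six variables with very unequal spreads.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(40, 6)) * np.array([5.0, 2.0, 1.0, 0.5, 0.5, 0.5])

ratios = explained_variance_ratio(spectra)
cumulative = np.cumsum(ratios)
print("share of PC-1 and PC-2 together:", cumulative[1])
```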
Directory of Open Access Journals (Sweden)
Keqin Xu
2018-05-01
There are three key medicinal components (phellodendrine, berberine and palmatine) in the extracts of Phellodendron bark, one of the fundamental herbs of traditional Chinese medicine. Different extraction methods and solvent combinations were investigated to identify the optimal technique for highly efficient extraction of these medicinal components. Results: The results showed that combined solvents extract phellodendrine, berberine and palmatine more effectively than a single solvent, and that ultrasonic extraction is distinctly more effective than distillation and Soxhlet extraction. Conclusion: Hydrochloric acid/methanol ultrasonic extraction was the most effective for the three medicinal components of fresh Phellodendron bark, providing extraction yields of 103.12 mg/g berberine, 24.41 mg/g phellodendrine and 1.25 mg/g palmatine. Keywords: Phellodendron, Cortex phellodendri, Extraction methods, Medicinal components
Schwedhelm, Carolina; Iqbal, Khalid; Knüppel, Sven; Schwingshackl, Lukas; Boeing, Heiner
2018-02-01
Principal component analysis (PCA) is a widely used exploratory method in epidemiology to derive dietary patterns from habitual diet. Such dietary patterns seem to originate from intakes on multiple days and eating occasions. Therefore, analyzing food intake of study populations with different levels of food consumption can provide additional insights as to how habitual dietary patterns are formed. We analyzed the food intake data of German adults in terms of the relations among food groups from three 24-h dietary recalls (24hDRs) on the habitual, single-day, and main-meal levels, and investigated the contribution of each level to the formation of PCA-derived habitual dietary patterns. Three 24hDRs were collected in 2010-2012 from 816 adults for a European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam subcohort study. We identified PCA-derived habitual dietary patterns and compared cross-sectional food consumption data in terms of correlation (Spearman), consistency (intraclass correlation coefficient), and frequency of consumption across all days and main meals. Contribution to the formation of the dietary patterns was obtained through Spearman correlation of the dietary pattern scores. Among the meals, breakfast appeared to be the most consistent eating occasion within individuals. Dinner showed the strongest correlations with "Prudent" (Spearman correlation = 0.60), "Western" (Spearman correlation = 0.59), and "Traditional" (Spearman correlation = 0.60) dietary patterns identified on the habitual level, and lunch showed the strongest correlations with the "Cereals and legumes" (Spearman correlation = 0.60) habitual dietary pattern. Higher meal consistency was related to lower contributions to the formation of PCA-derived habitual dietary patterns. Absolute amounts of food consumption did not strongly conform to the habitual dietary patterns by meals, suggesting that these patterns are formed by complex combinations of variable food
Shaffer, John R; Polk, Deborah E; Feingold, Eleanor; Wang, Xiaojing; Cuenco, Karen T; Weeks, Daniel E; DeSensi, Rebecca S; Weyant, Robert J; Crout, Richard; McNeil, Daniel W; Marazita, Mary L
2013-08-01
Dental caries of the permanent dentition is a multifactorial disease resulting from the complex interplay of endogenous and environmental risk factors. The disease is not easily quantitated due to the innumerable possible combinations of carious lesions across individual tooth surfaces of the permanent dentition. Global measures of decay, such as the DMFS index (which was developed for surveillance applications), may not be optimal for studying the epidemiology of dental caries because they ignore the distinct patterns of decay across the dentition. We hypothesize that specific risk factors may manifest their effects on specific tooth surfaces leading to patterns of decay that can be identified and studied. In this study, we utilized two statistical methods of extracting patterns of decay from surface-level caries data to create novel phenotypes with which to study the risk factors affecting dental caries. Intra-oral dental examinations were performed on 1068 participants aged 18-75 years to assess dental caries. The 128 tooth surfaces of the permanent dentition were scored as carious or not and used as input for principal components analysis (PCA) and factor analysis (FA), two methods of identifying underlying patterns without a priori knowledge of the patterns. Demographic (age, sex, birth year, race/ethnicity, and educational attainment), anthropometric (height, body mass index, waist circumference), endogenous (saliva flow), and environmental (tooth brushing frequency, home water source, and home water fluoride) risk factors were tested for association with the caries patterns identified by PCA and FA, as well as DMFS, for comparison. The ten strongest patterns (i.e. those that explain the most variation in the data set) extracted by PCA and FA were considered. The three strongest patterns identified by PCA reflected (i) global extent of decay (i.e. comparable to DMFS index), (ii) pit and fissure surface caries and (iii) smooth surface caries, respectively. The
Directory of Open Access Journals (Sweden)
Hesse Morten
2005-05-01
Background: Personality disorders are common in substance abusers. Self-report questionnaires that can aid in the assessment of personality disorders are commonly used in assessment, but are rarely validated. Methods: The Danish DIP-Q as a measure of co-morbid personality disorders in substance abusers was validated through principal components factor analysis and canonical correlation analysis. A 4-component structure was constructed based on 238 protocols, representing antagonism, neuroticism, introversion and conscientiousness. The structure was compared with (a) a 4-factor solution from the DIP-Q in a sample of Swedish drug and alcohol abusers (N = 133), and (b) a consensus 4-component solution based on a meta-analysis of published correlation matrices of dimensional personality disorder scales. Results: It was found that the 4-factor model of personality was congruent across the Danish and Swedish samples, and showed good congruence with the consensus model. A canonical correlation analysis was conducted on a subset of the Danish sample with staff ratings of pathology. Three factors that correlated highly between the two variable sets were found. These variables were highly similar to the first three factors from the principal components analysis: antagonism, neuroticism and introversion. Conclusion: The findings support the validity of the DIP-Q as a measure of DSM-IV personality disorders in substance abusers.
Directory of Open Access Journals (Sweden)
Selvin J. PITCHAIKANI
2017-06-01
Principal component analysis (PCA) is a technique used to emphasize variation and bring out strong patterns in a dataset. It is often used to make data easy to explore and visualize. The primary objective of the present study was to record information on zooplankton diversity in a systematic way and to study the variability and relationships among the seasons prevailing in the Gulf of Mannar. PCA of the zooplankton seasonal diversity was carried out using the four seasonal datasets to assess the statistical significance of differences among the four seasons. Two different principal components (PC) were segregated in all the seasons homogeneously. The PCA revealed that Temora turbinata is an opportunistic species, that zooplankton diversity differed significantly from season to season and that, principally, zooplankton abundance and its dynamics in the Gulf of Mannar are structured by seasonal current patterns. The factor loadings of zooplankton for different seasons in Tiruchendur coastal water (GOM) differ from those of the Southwest coast of India; in particular, routine and opportunistic species were found within the positive and negative factors. The copepods Acrocalanus gracilis and Acartia erythrea were dominant in summer and the Southwest monsoon due to the rainfall and freshwater discharge during the summer season; however, these species were replaced by Temora turbinata during the Northeast monsoon season.
Xu, Keqin; He, Gongxiu; Qin, Jieming; Cheng, Xuexiang; He, Hanjie; Zhang, Dangquan; Peng, Wanxi
2018-05-01
There are three key medicinal components (phellodendrine, berberine and palmatine) in the extracts of Phellodendron bark, one of the fundamental herbs of traditional Chinese medicine. Different extraction methods and solvent combinations were investigated to identify the optimal technique for highly efficient extraction of these medicinal components. The results showed that combined solvents extract phellodendrine, berberine and palmatine more effectively than a single solvent, and that ultrasonic extraction is distinctly more effective than distillation and Soxhlet extraction. Hydrochloric acid/methanol ultrasonic extraction was the most effective for the three medicinal components of fresh Phellodendron bark, providing extraction yields of 103.12 mg/g berberine, 24.41 mg/g phellodendrine and 1.25 mg/g palmatine.
DEFF Research Database (Denmark)
Schreiber, Norman; Garcia, Emanuel; Kroon, Aart
2014-01-01
Principal Component Analysis (PCA) was performed on chemical data of two sediment cores from an urban fresh-water lake in Copenhagen, Denmark. X-ray fluorescence (XRF) core scanning provided the underlying datasets on 13 variables (Si, K, Ca, Ti, Cr, Mn, Fe, Ni, Cu, Zn, Rb, Cd, Pb). Principal......, Fe, Rb) and characterized the content of minerogenic material in the sediment. In the case of both cores, PC2 was a good descriptor emphasized as the contamination component. It showed strong linkages with heavy metals (Cu, Zn, Pb), disclosing changing heavy-metal contamination trends across different...
Gál, Lukáš; Oravec, Michal; Gemeiner, Pavol; Čeppan, Michal
2015-12-01
Nineteen black inkjet inks of six different brands were examined by fibre optics reflection spectroscopy in the visible and near-infrared region (Vis-NIR FORS) directly on paper with a view to achieving good resolution between them. These inks were tested on nineteen different inkjet printers from three brands. Samples were obtained from prints by a reflection probe. Processed reflection spectra in the range 500-1000 nm were used as samples in principal component analysis. Variability between spectra of the same ink obtained from different prints, as well as between spectra of square areas and lines, was examined. Reference principal component analysis (PCA) models were created for spectra obtained from both square areas and lines. According to these models, the inkjet inks were divided into clusters. The PCA method is able to separate inks containing carbon black as the main colorant from inks using other colorants. Some spectra were recorded from another piece of printer and used as validation samples. Spectra of validation samples were projected onto the reference PCA models. According to the position of the validation samples in the score plots, it can be concluded that PCA based on Vis-NIR FORS can reliably differentiate inkjet inks which are included in the reference database. The presented method appears to be a suitable tool for forensic examination of questioned documents containing inkjet inks. Inkjet ink spectra were obtained without extraction or cutting of the sample, with the possibility of measuring outside the laboratory. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
International Nuclear Information System (INIS)
Li, Boyan; Calvet, Amandine; Casamayou-Boucau, Yannick; Ryder, Alan G.
2016-01-01
A new, fully automated, rapid method, referred to as kernel principal component analysis residual diagnosis (KPCARD), is proposed for removing cosmic ray artifacts (CRAs) in Raman spectra, and in particular for large Raman imaging datasets. KPCARD identifies CRAs via a statistical analysis of the residuals obtained at each wavenumber in the spectra. The method utilizes the stochastic nature of CRAs; therefore, the most significant components in principal component analysis (PCA) of large numbers of Raman spectra should not contain any CRAs. The process worked by first implementing kernel PCA (kPCA) on all the Raman mapping data and second accurately estimating the inter- and intra-spectrum noise to generate two threshold values. CRA identification was then achieved by using the threshold values to evaluate the residuals for each spectrum and assess if a CRA was present. CRA correction was achieved by spectral replacement where, the nearest neighbor (NN) spectrum, most spectroscopically similar to the CRA contaminated spectrum and principal components (PCs) obtained by kPCA were both used to generate a robust, best curve fit to the CRA contaminated spectrum. This best fit spectrum then replaced the CRA contaminated spectrum in the dataset. KPCARD efficacy was demonstrated by using simulated data and real Raman spectra collected from solid-state materials. The results showed that KPCARD was fast ( 1 million) Raman datasets. - Highlights: • New rapid, automatable method for cosmic ray artifact correction of Raman spectra. • Uses combination of kernel PCA and noise estimation for artifact identification. • Implements a best fit spectrum replacement correction approach.
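The residual-based detection idea can be illustrated with a deliberately simplified sketch. Note the substitutions: ordinary linear PCA stands in for the kernel PCA used in KPCARD, a single MAD-based threshold replaces the paper's two noise-derived thresholds, and flagged points are simply overwritten with the PCA model value instead of the nearest-neighbour best-fit replacement. The function name `flag_spikes`, the factor `k` and the synthetic spectra are all assumptions for the example.

```python
import numpy as np

def flag_spikes(spectra, n_components=1, k=8.0):
    """Flag entries whose PCA residual exceeds k robust standard deviations."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    model = mean + (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    residual = spectra - model
    # Robust noise scale from the median absolute deviation (MAD).
    sigma = 1.4826 * np.median(np.abs(residual - np.median(residual)))
    # Only positive excursions are flagged, since the artifacts are spikes.
    return residual > k * sigma, model

# Synthetic map (assumption): 200 spectra of one band with varying intensity.
rng = np.random.default_rng(0)
shift = np.arange(300)
band = np.exp(-0.5 * ((shift - 150) / 10.0) ** 2)
spectra = rng.uniform(1.0, 2.0, size=(200, 1)) * band
spectra += 0.02 * rng.normal(size=spectra.shape)
spectra[17, 123] += 5.0  # injected artificial spike

mask, model = flag_spikes(spectra)
cleaned = np.where(mask, model, spectra)  # crude replacement by the model value
hits = list(zip(*np.nonzero(mask)))
print("flagged (spectrum, channel) pairs:", hits)
```

The sketch relies on the same observation as the abstract: a spike that occurs in one spectrum at one wavenumber is not captured by the leading components of a large dataset, so it shows up in the residual.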
Directory of Open Access Journals (Sweden)
Alaba Boluwade
2016-09-01
Accurate characterization of soil properties such as soil water content (SWC) and bulk density (BD) is vital for hydrologic processes; it is therefore important to estimate θ (water content) and ρ (soil bulk density), among other soil surface parameters involved in water retention and infiltration, runoff generation, water erosion, etc. The spatial estimation of these soil properties is important in guiding agricultural management decisions. These soil properties vary in both space and time and are correlated. Therefore, it is important to find an efficient and robust technique to simulate spatially correlated variables. Methods such as principal component analysis (PCA) and independent component analysis (ICA) can be used for the joint simulation of spatially correlated variables, but they are not without their flaws. This study applied a variant of PCA called independent principal component analysis (IPCA), which combines the strengths of both PCA and ICA, for spatial simulation of SWC and BD using the soil data set from an 11 km² Castor watershed in southern Quebec, Canada. Diagnostic checks using the histograms and cumulative distribution functions (cdf) of both the raw and back-transformed simulations show good agreement. Therefore, the results from this study have potential for characterizing water content variability and bulk density variation for precision agriculture.
Azevedo, Mônia Stremel; Valentim-Neto, Pedro Alexandre; Seraglio, Siluana Katia Tischer; da Luz, Cynthia Fernandes Pinto; Arisi, Ana Carolina Maisonnave; Costa, Ana Carolina Oliveira
2017-10-01
Due to the increasing valuation and appreciation of honeydew honey in many European countries, and also to existing contamination among different types of honeys, authentication is an important aspect of quality control for guaranteeing the origin in terms of source (honeydew or floral). Furthermore, proteins are minor components of the honey, despite the importance of their physiological effects, and can differ according to the source of the honey. In this context, the aims of this study were to carry out protein extraction from honeydew and floral honeys and to discriminate these honeys from the same botanical species, Mimosa scabrella Bentham, through proteome comparison using two-dimensional gel electrophoresis and principal component analysis. The results showed that the proteome profile and principal component analysis can be a useful tool for discrimination between these types of honey using matched proteins (45 matched spots). Also, the proteome profile showed 160 protein spots in honeydew honey and 84 spots in the floral honey. The protein profile can be a differential characteristic of this type of honey, in view of the importance of proteins as bioactive compounds in honey. © 2017 Society of Chemical Industry.
International Nuclear Information System (INIS)
Keng, S.E.; Abbas Fadhl Mubarek Al-Karkhi; Mohd Khairuddin Mohd Talib; Azhar Mat Easa; Hoong, C.L.
2015-01-01
This study was triggered by the Malaysia Ministry of Health to monitor the quality of commercial orange juice products sold in the Malaysian market. A total of 19 orange juice samples from 14 different brands of packed orange juice products and 5 different brands of fresh orange fruit juices were analyzed for total soluble solids content, total titratable acidity, sugar composition and amino acid profiles. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) on amino acid composition alone allowed visual discrimination between fresh squeezed orange juices and commercial packed orange juices. Suspicion of mislabeling was raised in cases of misclassification. (author)
Directory of Open Access Journals (Sweden)
Alia Colniță
2017-09-01
Raman scattering and its particular effect, surface-enhanced Raman scattering (SERS), are whole-organism fingerprinting spectroscopic techniques that are gaining more and more popularity in bacterial detection. In this work, two relevant Gram-positive bacteria species, Lactobacillus casei (L. casei) and Listeria monocytogenes (L. monocytogenes), were characterized based on their Raman and SERS spectral fingerprints. The SERS spectra were used to identify the biochemical structures of the bacterial cell wall. Two synthesis methods of the SERS-active nanomaterials were used and the recorded spectra were analyzed. L. casei and L. monocytogenes were successfully discriminated by applying Principal Component Analysis (PCA) to their specific spectral data.
International Nuclear Information System (INIS)
Dragovic, Snezana; Onjia, Antonije
2006-01-01
A principal component analysis (PCA) was used for classification of soil samples from different locations in Serbia and Montenegro. Based on activities of radionuclides (226Ra, 238U, 235U, 40K, 134Cs, 137Cs, 232Th and 7Be) detected by gamma-ray spectrometry, the classification of soils according to their geographical origin was performed. Application of PCA to our experimental data resulted in a satisfactory classification rate (86.0% correctly classified samples). The obtained results indicate that gamma-ray spectrometry in conjunction with PCA is a viable tool for soil classification.
Trusiak, Maciej; Służewski, Łukasz; Patorski, Krzysztof
2016-02-22
Hybrid single shot algorithm for accurate phase demodulation of complex fringe patterns is proposed. It employs empirical mode decomposition based adaptive fringe pattern enhancement (i.e., denoising, background removal and amplitude normalization) and subsequent boosted phase demodulation using 2D Hilbert spiral transform aided by the Principal Component Analysis method for novel, correct and accurate local fringe direction map calculation. Robustness to fringe pattern significant noise, uneven background and amplitude modulation as well as local fringe period and shape variations is corroborated by numerical simulations and experiments. Proposed automatic, adaptive, fast and comprehensive fringe analysis solution compares favorably with other previously reported techniques.
Directory of Open Access Journals (Sweden)
Piotr CZECH
2007-01-01
This paper presents the results of an experimental application of an artificial neural network as a classifier of the degree of cracking of a tooth root in a gear wheel. The neural classifier was based on an artificial neural network of the Probabilistic Neural Network (PNN) type. The input data for the classifier was in the form of a matrix composed of statistical measures obtained from the fast Fourier transform (FFT) and principal component analysis (PCA). The identified model of a toothed gear transmission, operating in a circulating power system, served to generate the teaching and testing sets applied in the experiment.
Directory of Open Access Journals (Sweden)
Jyh-Woei Lin
2011-01-01
The goal of this study is to determine whether principal component analysis (PCA) can be used to process latitude-time ionospheric TEC data on a monthly basis to identify earthquake-associated TEC anomalies. PCA is applied to latitude-time (mean-of-a-month) ionospheric total electron content (TEC) records collected from the Japan GEONET network to detect TEC anomalies associated with 18 earthquakes in Japan (M≥6.0) from 2000 to 2005. According to the results, PCA was able to discriminate clear TEC anomalies in the months when all 18 earthquakes occurred. After reviewing months when no M≥6.0 earthquakes occurred but geomagnetic storm activity was present, it is possible that the maximal principal eigenvalues PCA returned for these 18 earthquakes indicate earthquake-associated TEC anomalies. Previously, PCA has been used to discriminate earthquake-associated TEC anomalies recognized by other researchers, who found that a statistical association between large earthquakes and TEC anomalies could be established in the 5 days before earthquake nucleation; however, since PCA uses the characteristics of principal eigenvalues to determine earthquake-related TEC anomalies, it is possible to show that such anomalies existed earlier than this 5-day statistical window.
Processing of spectral X-ray data with principal components analysis
Butler, A P H; Cook, N J; Butzer, J; Schleich, N; Tlustos, L; Scott, N; Grasset, R; de Ruiter, N; Anderson, N G
2011-01-01
The goal of the work was to develop a general method for processing spectral x-ray image data. Principal component analysis (PCA) is a well understood technique for multivariate data analysis and so was investigated. To assess this method, spectral (multi-energy) computed tomography (CT) data was obtained using a Medipix2 detector in a MARS-CT (Medipix All Resolution System). PCA was able to separate bone (calcium) from two elements with k-edges in the X-ray spectrum used (iodine and barium) within a mouse. This has potential clinical application in dual-energy CT systems and future Medipix3 based spectral imaging where up to eight energies can be recorded simultaneously with excellent energy resolution. (c) 2010 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Kawaguchi, Osamu; Kunieda, Etsuo; Nyui, Yoshiyuki
2009-01-01
One of the most important factors in stereotactic radiosurgery (SRS) for intracranial arteriovenous malformation (AVM) is to determine accurate target delineation of the nidus. However, since intracranial AVMs are complicated in structure, it is often difficult to clearly determine the target delineation. The purpose of this study was to investigate the usefulness of principal component analysis (PCA) on intra-arterial contrast enhanced dynamic CT (IADCT) images as a tool for delineating accurate target volumes for stereotactic radiosurgery of AVMs. IADCT and intravenous contrast-enhanced CT (IVCT) were used to examine 4 randomly selected cases of AVM. PCA images were generated from the IADCT data. The first component images were considered feeding artery predominant, the second component images were considered draining vein predominant, and the third component images were considered background. Target delineations were first carried out from IVCT, and then again while referring to the first and second components of the PCA images. Dose calculation simulations for radiosurgical treatment plans with IVCT and PCA images were performed. Dose volume histograms of the vein areas as well as the target volumes were compared. In all cases, the calculated target volumes based on IVCT images were larger than those based on PCA images, and the irradiation doses for the vein areas were reduced. In this study, we simulated radiosurgical treatment planning for intracranial AVM based on PCA images. By using PCA images, the irradiation doses for the vein areas were substantially reduced. (author)
Directory of Open Access Journals (Sweden)
Arthur Nanni
2008-12-01
Principal component analysis is applied to 309 groundwater chemical data records from wells in the Serra Geral Aquifer System. Correlations among seven hydrochemical parameters are statistically examined. A four-component model is suggested and explains 81% of total variance. Component 1 represents calcium-magnesium bicarbonated groundwaters with long residence time; Component 2 represents sulfated and chlorinated calcium and sodium groundwaters; Component 3 represents sodium bicarbonated groundwaters; and Component 4 is characterized by sodium sulfated waters with high-fluoride facies. The components' spatial distribution shows high fluoride concentration along the analyzed tectonic fault system and aligned in the northeast direction in other areas, suggesting other hydrogeological fault systems. Fluoride concentration increases with groundwater pumping depth. The principal component analysis reveals features of the groundwater mixture and individualizes water facies. In this setting, hydrogeological blocks associated with the tectonic fault system introduced here can be determined.
Directory of Open Access Journals (Sweden)
Das Lalita
2012-08-01
Background: The chemotherapeutic agent paclitaxel arrests cell division by binding to the hetero-dimeric protein tubulin. Subtle differences in tubulin sequences, across eukaryotes and among β-tubulin isotypes, can have profound impact on paclitaxel-tubulin binding. To capture the experimentally observed paclitaxel-resistance of human βIII tubulin isotype and yeast β-tubulin, within a common theoretical framework, we have performed structural principal component analyses of β-tubulin sequences across eukaryotes. Results: The paclitaxel-resistance of human βIII tubulin isotype and yeast β-tubulin uniquely mapped on to the lowest two principal components, defining the paclitaxel-binding site residues of β-tubulin. The molecular mechanisms behind paclitaxel-resistance, mediated through key residues, were identified from structural consequences of characteristic mutations that confer paclitaxel-resistance. Specifically, Ala277 in βIII isotype was shown to be crucial for paclitaxel-resistance. Conclusions: The present analysis captures the origin of two apparently unrelated events, paclitaxel-insensitivity of yeast tubulin and human βIII tubulin isotype, through two common collective sequence vectors.
Directory of Open Access Journals (Sweden)
M. Imran
2017-09-01
A blind adaptive color image watermarking scheme based on principal component analysis, singular value decomposition, and the human visual system is proposed. Using principal component analysis to decorrelate the three color channels of the host image improves the perceptual quality of the watermarked image, while the human visual system and a fuzzy inference system help to improve both imperceptibility and robustness by selecting an adaptive scaling factor, so that more information can be embedded in areas more prone to noise than in less prone areas. To achieve security, the location of the watermark embedding is kept secret and used as a key at the time of watermark extraction; for capacity, both singular values and singular vectors are involved in the watermark embedding process. As a result, the four conflicting requirements of imperceptibility, robustness, security and capacity are achieved, as suggested by the results. Both subjective and objective methods are used to examine the performance of the proposed schemes. For subjective analysis, the watermarked images and the watermarks extracted from attacked watermarked images are shown. For objective analysis of imperceptibility, peak signal-to-noise ratio, structural similarity index, visual information fidelity and normalized color difference are used, whereas for objective analysis of robustness, normalized correlation, bit error rate, normalized Hamming distance and global authentication rate are used. Security is checked by using different keys to extract the watermark. The proposed schemes are compared with state-of-the-art watermarking techniques and show better performance, as suggested by the results.
Ecological Safety Evaluation of Land Use in Ji’an City Based on the Principal Component Analysis
Institute of Scientific and Technical Information of China (English)
2010-01-01
According to the ecological safety evaluation index data of land-use change in Ji’an City from 1999 to 2008, positive treatment on selected reverse indices is conducted by Reciprocal Method. Meanwhile, Index Method is used to standardize the selected indices, and Principal Component Analysis is applied by using year as a unit. FB is obtained, which is related with the ecological safety of land-use change from 1999 to 2008. According to the scientific, integrative, hierarchical, practical and dynamic principles, ecological safety evaluation index system of land-use change in Ji’an City is established. Principal Component Analysis and evaluation model are used to calculate four parameters, including the natural resources safety index of land use, the socio-economic safety indicators of land use, the eco-environmental safety index of land use, and the ecological safety degree of land use in Ji’an City. Result indicates that the ecological safety degree of land use in Ji’an City shows a slow upward trend as a whole. At the same time, ecological safety degree of land-use change is relatively low in Ji’an City with the safety value of 0.645, which is at a weak safety zone and needs further monitoring and maintenance.
Directory of Open Access Journals (Sweden)
Yihang Yin
2015-08-01
Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
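The idea of PCA compression under an error bound can be sketched as follows. This is an illustrative reading of the abstract, not the authors' algorithm: the clustering and cluster-head selection steps are omitted, and the names `pca_compress`, `pca_decompress` and `max_mse` are hypothetical. The sketch keeps the fewest principal components for which the reconstruction mean squared error stays within the bound.

```python
import numpy as np

def pca_compress(X, max_mse):
    """Compress X (n_samples, n_sensors) with the fewest components meeting max_mse."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions; s holds the singular values.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    n, d = X.shape
    # tail[k] = reconstruction MSE when only the first k components are kept.
    tail = np.concatenate([np.cumsum((s ** 2)[::-1])[::-1], [0.0]]) / (n * d)
    k = int(np.argmax(tail <= max_mse))  # first k that satisfies the bound
    scores = Xc @ Vt[:k].T               # the data actually transmitted
    return mean, Vt[:k], scores

def pca_decompress(mean, components, scores):
    """Reconstruct an approximation of the original readings."""
    return scores @ components + mean
```

In this sketch a cluster head would send `scores` plus the small `mean` and `components` arrays instead of the raw readings, so the saving grows with the correlation between sensors.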
Feng, Ssj; Sechopoulos, I
2012-06-01
To develop an objective model of the shape of the compressed breast undergoing mammographic or tomosynthesis acquisition. Automated thresholding and edge detection was performed on 984 anonymized digital mammograms (492 craniocaudal (CC) view mammograms and 492 medial lateral oblique (MLO) view mammograms), to extract the edge of each breast. Principal Component Analysis (PCA) was performed on these edge vectors to identify a limited set of parameters and eigenvectors. These parameters and eigenvectors comprise a model that can be used to describe the breast shapes present in acquired mammograms and to generate realistic models of breasts undergoing acquisition. Sample breast shapes were then generated from this model and evaluated. The mammograms in the database were previously acquired for a separate study and authorized for use in further research. The PCA successfully identified two principal components and their corresponding eigenvectors, forming the basis for the breast shape model. The simulated breast shapes generated from the model are reasonable approximations of clinically acquired mammograms. Using PCA, we have obtained models of the compressed breast undergoing mammographic or tomosynthesis acquisition based on objective analysis of a large image database. Up to now, the breast in the CC view has been approximated as a semi-circular tube, while there has been no objectively-obtained model for the MLO view breast shape. Such models can be used for various breast imaging research applications, such as x-ray scatter estimation and correction, dosimetry estimates, and computer-aided detection and diagnosis. © 2012 American Association of Physicists in Medicine.
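A PCA shape model of this kind is commonly written as a mean shape plus a weighted sum of eigenvectors. The sketch below shows that general construction under stated assumptions: each edge is already resampled to a fixed-length vector, and the function names `fit_shape_model` and `generate_shape` are hypothetical rather than taken from the study.

```python
import numpy as np

def fit_shape_model(shapes, n_components=2):
    """Fit a PCA shape model to edge vectors of shape (n_shapes, n_points)."""
    mean = shapes.mean(axis=0)
    _, s, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    # Standard deviation of the training shapes along each retained mode.
    std = s[:n_components] / np.sqrt(len(shapes) - 1)
    return mean, Vt[:n_components], std

def generate_shape(mean, modes, std, b):
    """Generate a shape; b gives each mode's weight in standard deviations."""
    return mean + (np.asarray(b) * std) @ modes
```

Setting all weights to zero returns the mean shape, and weights of a few standard deviations in either direction produce plausible variations around it.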
Kholodov, V. A.; Yaroslavtseva, N. V.; Lazarev, V. I.; Frid, A. S.
2016-09-01
Cluster analysis and principal component analysis (PCA) have been used for the interpretation of dry sieving data. Chernozems from the treatments of long-term field experiments with different land-use patterns— annually mowed steppe, continuous potato culture, permanent black fallow, and untilled fallow since 1998 after permanent black fallow—have been used. Analysis of dry sieving data by PCA has shown that the treatments of untilled fallow after black fallow and annually mowed steppe differ most in the series considered; the content of dry aggregates of 10-7 mm makes the largest contribution to the distribution of objects along the first principal component. This fraction has been sieved in water and analyzed by PCA. In contrast to dry sieving data, the wet sieving data showed the closest mathematical distance between the treatment of untilled fallow after black fallow and the undisturbed treatment of annually mowed steppe, while the untilled fallow after black fallow and the permanent black fallow were the most distant treatments. Thus, it may be suggested that the water stability of structure is first restored after the removal of destructive anthropogenic load. However, the restoration of the distribution of structural separates to the parameters characteristic of native soils is a significantly longer process.
Gasson, Peter; Miller, Regis; Stekel, Dov J; Whinder, Frances; Zieminska, Kasia
2010-01-01
Dalbergia nigra is one of the most valuable timber species of its genus, having been traded for over 300 years. Due to over-exploitation it is facing extinction and trade has been banned under CITES Appendix I since 1992. Current methods, primarily comparative wood anatomy, are inadequate for conclusive species identification. This study aims to find a set of anatomical characters that distinguish the wood of D. nigra from other commercially important species of Dalbergia from Latin America. Qualitative and quantitative wood anatomy, principal components analysis and naïve Bayes classification were conducted on 43 specimens of Dalbergia, eight D. nigra and 35 from six other Latin American species. Dalbergia cearensis and D. miscolobium can be distinguished from D. nigra on the basis of vessel frequency for the former, and ray frequency for the latter. Principal components analysis was unable to provide any further basis for separating the species. Naïve Bayes classification using the four characters: minimum vessel diameter; frequency of solitary vessels; mean ray width; and frequency of axially fused rays, classified all eight D. nigra correctly with no false negatives, but there was a false positive rate of 36.36 %. Wood anatomy alone cannot distinguish D. nigra from all other commercially important Dalbergia species likely to be encountered by customs officials, but can be used to reduce the number of specimens that would need further study.
Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong
2015-08-07
Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
Directory of Open Access Journals (Sweden)
Benoit Parmentier
2014-12-01
Characterizing biophysical changes in land change areas over large regions with short and noisy multivariate time series and multiple temporal parameters remains a challenging task. Most studies focus on detection rather than characterization, i.e., the manner by which surface state variables are altered by the process of change. In this study, a procedure is presented to extract and characterize simultaneous temporal changes in MODIS multivariate time series from three surface state variables: the Normalized Difference Vegetation Index (NDVI), land surface temperature (LST) and albedo (ALB). The analysis involves conducting a seasonal trend analysis (STA) to extract three seasonal shape parameters (Amplitude 0, Amplitude 1 and Amplitude 2) and using principal component analysis (PCA) to contrast trends in change and no-change areas. We illustrate the method by characterizing trends in burned and unburned pixels in Alaska over the 2001–2009 time period. Findings show consistent and meaningful extraction of temporal patterns related to fire disturbances. The first principal component (PC1) is characterized by a decrease in mean NDVI (Amplitude 0) with a concurrent increase in albedo (the mean and the annual amplitude) and an increase in LST annual variability (Amplitude 1). These results provide systematic empirical evidence of surface changes associated with one type of land change, fire disturbances, and suggest that STA with PCA may be used to characterize many other types of land transitions over large landscape areas using multivariate Earth observation time series.
Zhang, Jian; Hou, Dibo; Wang, Ke; Huang, Pingjie; Zhang, Guangxin; Loáiciga, Hugo
2017-05-01
The detection of organic contaminants in water distribution systems is essential to protect public health from potential harmful compounds resulting from accidental spills or intentional releases. Existing methods for detecting organic contaminants are based on quantitative analyses such as chemical testing and gas/liquid chromatography, which are time- and reagent-consuming and involve costly maintenance. This study proposes a novel procedure based on discrete wavelet transform and principal component analysis for detecting organic contamination events from ultraviolet spectral data. Firstly, the spectrum of each observation is transformed using a discrete wavelet transform with a coiflet mother wavelet to capture abrupt changes along the wavelength. Principal component analysis is then employed to approximate the spectra based on capture and fusion features. The significance value of Hotelling's T² statistic is calculated and used to detect outliers. An alarm of a contamination event is triggered by sequential Bayesian analysis when outliers appear continuously in several observations. The effectiveness of the proposed procedure is tested on-line using a pilot-scale setup and experimental data.
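The outlier statistic in this procedure can be illustrated with a short sketch of Hotelling's T² computed on PCA scores. The wavelet transform and the sequential Bayesian alarm are deliberately left out, and the function names `fit_pca` and `hotelling_t2` are assumptions for illustration, not the authors' code.

```python
import numpy as np

def fit_pca(X, k):
    """Fit k principal components to baseline data X (n_samples, n_features)."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = s[:k] ** 2 / (len(X) - 1)  # variance of each retained score
    return mean, Vt[:k], var

def hotelling_t2(x, mean, components, var):
    """T^2 = sum of squared scores, each scaled by its baseline variance."""
    t = (x - mean) @ components.T
    return np.sum(t ** 2 / var, axis=-1)
```

An observation would be flagged when its T² exceeds a limit chosen from the baseline data, for example an upper quantile of the baseline T² values; the choice of limit is left open here.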
McNabola, Aonghus; Broderick, Brian M; Gill, Laurence W
2009-10-01
Principal component analysis was used to examine air pollution personal exposure data of four urban commuter transport modes for their interrelationships between pollutants and relationships with traffic and meteorological data. Air quality samples of PM2.5 and VOCs were recorded during peak traffic congestion for the car, bus, cyclist and pedestrian between January 2005 and June 2006 on a busy route in Dublin, Ireland. In total, 200 personal exposure samples were recorded each comprising 17 variables describing the personal exposure concentrations, meteorological conditions and traffic conditions. The data reduction technique, principal component analysis (PCA), was used to create weighted linear combinations of the data and these were subsequently examined for interrelationships between the many variables recorded. The results of the PCA found that personal exposure concentrations in non-motorised forms of transport were influenced to a higher degree by wind speed, whereas personal exposure concentrations in motorised forms of transport were influenced to a higher degree by traffic congestion. The findings of the investigation show that the most effective mechanisms of personal exposure reduction differ between motorised and non-motorised modes of commuter transport.
International Nuclear Information System (INIS)
Lee, Yun Hee; Im, Hee Jung; Song, Byung ChoI; Park, Yong Joon; Kim, Won Ho; Cho, Jung Hwan
2005-01-01
This work demonstrates a program developed to reduce the noise in a prompt gamma-ray spectrum measured by irradiating baggage with neutrons. The noise refers to random variations caused mainly by electrical fluctuations and also by the measurement time. In particular, since a short measurement time yields a spectrum so noisy that its characteristic peaks cannot be observed, it is necessary to extract the characteristic signals from the spectrum to identify an explosive hidden in luggage. Principal component analysis (PCA), a multivariate statistical technique, is closely related to singular value decomposition (SVD). The SVD-based PCA decreases the noise by reconstructing the spectrum after determining the number of principal components corresponding to important signals, based on history data that sufficiently describe the population. In this study, we present a visualized program of the above procedure using the MATLAB 7.04 programming language. When the program is started, it requires an arbitrary measured spectrum to be denoised and history spectra as input files. If the user selects the files from the menu, the program automatically carries out the PCA procedure and shows the noise-reduced spectrum plot as well as the original spectrum plot in an output window. In addition, the user can obtain the signal-to-noise ratio of a peak of interest by defining the peak and noise ranges from the menu.
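The reconstruction step can be sketched in a few lines. The original program was written in MATLAB; the Python version below is only an illustration of SVD-based denoising under the assumption that the number of components `k` has already been chosen, and the names `fit_basis` and `denoise` are hypothetical.

```python
import numpy as np

def fit_basis(history, k):
    """Learn the mean and k leading principal directions from history spectra."""
    mean = history.mean(axis=0)
    _, _, Vt = np.linalg.svd(history - mean, full_matrices=False)
    return mean, Vt[:k]

def denoise(spectrum, mean, basis):
    """Project a measured spectrum onto the retained components and rebuild it."""
    return mean + ((spectrum - mean) @ basis.T) @ basis
```

Noise that does not lie along the retained directions is discarded by the projection, which is why the history data must describe the population of spectra well.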
Towards the generation of a parametric foot model using principal component analysis: A pilot study.
Scarton, Alessandra; Sawacha, Zimi; Cobelli, Claudio; Li, Xinshan
2016-06-01
There have been many recent developments in patient-specific models with their potential to provide more information on the human pathophysiology and the increase in computational power. However, they are not yet successfully applied in a clinical setting. One of the main challenges is the time required for mesh creation, which is difficult to automate. The development of parametric models by means of Principal Component Analysis (PCA) represents an appealing solution. In this study PCA has been applied to the feet of a small cohort of diabetic and healthy subjects, in order to evaluate the possibility of developing parametric foot models, and to use them to identify variations and similarities between the two populations. Both the skin and the first metatarsal bones have been examined. Despite the small sample of subjects considered in the analysis, results demonstrated that the method adopted herein constitutes a first step towards the realization of parametric foot models for biomechanical analysis. Furthermore, the study showed that the methodology can successfully describe features in the foot, and evaluate differences in the shape of healthy and diabetic subjects. Copyright © 2016 IPEM. Published by Elsevier Ltd. All rights reserved.
A Fault Prognosis Strategy Based on Time-Delayed Digraph Model and Principal Component Analysis
Directory of Open Access Journals (Sweden)
Ningyun Lu
2012-01-01
Because of the interlinking of process equipment in the process industry, event information may propagate through the plant and affect many downstream process variables. Specifying the causality and estimating the time delays among process variables are critically important for data-driven fault prognosis. They are helpful not only for finding the root cause when a plant-wide disturbance occurs, but also for revealing the evolution of an abnormal event propagating through the plant. This paper is concerned with the information flow directionality and time-delay estimation problems in the process industry and presents an information synchronization technique to assist fault prognosis. Time-delayed mutual information (TDMI) is used for both causality analysis and time-delay estimation. To represent the causality structure of high-dimensional process variables, a time-delayed signed digraph (TD-SDG) model is developed. Then, a general fault prognosis strategy is developed based on the TD-SDG model and principal component analysis (PCA). The proposed method is applied to an air separation unit and has achieved satisfactory results in predicting the frequently occurring “nitrogen-block” fault.
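Time-delay estimation with mutual information can be sketched as a search over candidate lags. The version below uses a simple histogram estimate of mutual information, which is an assumption of this sketch and not necessarily the estimator used in the paper; the names `mutual_information`, `estimate_delay`, `bins` and `max_lag` are likewise illustrative.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information between x and y (in nats)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nz = pxy > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) at which x best predicts the later y."""
    n = len(x)
    # Pair x[t] with y[t + lag] for each candidate lag.
    scores = [mutual_information(x[: n - lag], y[lag:]) for lag in range(max_lag + 1)]
    return int(np.argmax(scores)), scores
```

Running the search in both directions (x leading y, then y leading x) and comparing the peak values is one way to read off a direction of information flow between two variables.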
Directory of Open Access Journals (Sweden)
Vincenti Matthew P
2005-11-01
Background: The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, where multiple transcription factors interact combinatorially with transcription-factor binding motifs (TFBMs). In order to select a critical set of TFBMs from genomic DNA information and array-derived data, an efficient algorithm to solve a combinatorial optimization problem is required. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful to predict TFBMs at nearly no computational cost and evaluate varying modelling conditions. Singular value decomposition (SVD) is a powerful method to derive primary components of a given matrix. Applying SVD to a promoter matrix defined from regulatory DNA sequences, we derived a novel method to predict the critical set of TFBMs. Results: The promoter matrix was defined to establish a quantitative relationship between the IL-1-driven mRNA alteration and genomic DNA sequences of the IL-1 responsive genes. The matrix was decomposed with SVD, and the effects of 8 potential TFBMs (5'-CAGGC-3', 5'-CGCCC-3', 5'-CCGCC-3', 5'-ATGGG-3', 5'-GGGAA-3', 5'-CGTCC-3', 5'-AAAGG-3', and 5'-ACCCA-3') were predicted from a pool of 512 random DNA sequences. The prediction included matches to the core binding motifs of biologically known TFBMs such as AP2, SP1, EGR1, KROX, GC-BOX, ABI4, ETF, E2F, SRF, STAT, IK-1, PPARγ, STAF, ROAZ, and NFκB, and their significance was evaluated numerically using Monte Carlo simulation and genetic algorithm. Conclusion: The described SVD-based prediction is an analytical method to provide a set of potential TFBMs involved in transcriptional regulation. The results would be useful to evaluate analytically the contribution of individual DNA sequences.
Spatial control of groundwater contamination, using principal
Indian Academy of Sciences (India)
Spatial control of groundwater contamination, using principal component analysis ... anthropogenic (agricultural activities and domestic wastewaters), and marine ... The PC scores reflect the change of groundwater quality of geogenic origin ...
Todhunter, Fern
2015-07-01
To report on the relationship between competence and confidence in nursing students as users of information and communication technologies, using principal components analysis. In nurse education, learning about and learning using information and communication technologies is well established. Nursing students are one of the undergraduate populations in higher education required to use these resources for academic work and practice learning. Previous studies showing mixed experiences influenced the choice of an exploratory study to find out about information and communication technologies competence and confidence. A 48-item survey questionnaire was administered to a volunteer sample of first- and second-year nursing students between July 2008 and April 2009. The cohort (N = 375) represented 18.75% of first- and second-year undergraduates. A comparison between this work and subsequent studies reveals some similar ongoing issues and ways to address them. A principal components analysis (PCA) was carried out to determine the strength of the correlation between information and communication technologies competence and confidence. The aim was to show the presence of any underlying dimensions in the transformed data that would explain any variations in information and communication technologies competence and confidence. Cronbach's alpha values showed fair to good internal consistency. The five-component structure gave medium to high results and explained 44.7% of the variance in the original data. Confidence had a high representation. The findings emphasized the shift towards social learning approaches for information and communication technologies. Informal social collaboration found favour with nursing students. Learning through talking, watching and listening all play a crucial role in the development of computing skills.
Ou, Hua-Se; Wei, Chao-Hai; Deng, Yang; Gao, Nai-Yun; Ren, Yuan; Hu, Yun
2014-02-01
A novel dual coagulant system of polyaluminum chloride sulfate (PACS) and polydiallyldimethylammonium chloride (PDADMAC) was used to treat natural algae-laden water from Meiliang Gulf, Lake Taihu. PACS (Al_n(OH)_mCl_(3n-m-2k)(SO4)_k) has a mass ratio of 10%, a SO4^2-/Al^3+ mole ratio of 0.0664, and an OH/Al mole ratio of 2. The PDADMAC ([C8H16NCl]_m) has a MW which ranges from 5 × 10^5 to 20 × 10^5 Da. The variations of contaminants in water samples during treatments were estimated in the form of principal component analysis (PCA) factor scores and conventional variables (turbidity, DOC, etc.). Parallel factor analysis determined four chromophoric dissolved organic matter (CDOM) components, and PCA identified four integrated principal factors. PCA factor 1 had significant correlations with chlorophyll-a (r = 0.718), protein-like CDOM C1 (0.689), and C2 (0.756). Factor 2 correlated with UV254 (0.672), humic-like CDOM component C3 (0.716), and C4 (0.758). Factors 3 and 4 had correlations with NH3-N (0.748) and T-P (0.769), respectively. The variations of PCA factor scores revealed that PACS contributed less aluminum dissolution than PAC to obtain equivalent removal efficiency of contaminants. This might be due to the high cationic charge and pre-hydrolyzation of PACS. Compared with PACS coagulation (20 mg L^-1), the removal of PCA factors 1, 2, and 4 increased by 45, 33, and 12%, respectively, in combined PACS-PDADMAC treatment (0.8 mg L^-1 + 20 mg L^-1). Since PAC contained more Al (0.053 g/1 g) than PACS (0.028 g/1 g), the results indicated that PACS contributed less Al dissolution into the water to obtain equivalent removal efficiency.
International Nuclear Information System (INIS)
Schulz, H.; Miller, A.
1995-01-01
The programme of the OECD-NEA Principal Working Group No.3 on reactor component integrity is described including the following issues: regular Committee meetings; non-destructive testing; fracture analysis; aging; related activities
Directory of Open Access Journals (Sweden)
István P. Sugár
2010-01-01
Lipid lateral organization in binary-constituent monolayers consisting of fluorescent and nonfluorescent lipids has been investigated by acquiring multiple emission spectra during measurement of each force-area isotherm. The emission spectra reflect BODIPY-labeled lipid surface concentration and lateral mixing with different nonfluorescent lipid species. Using principal component analysis (PCA), each spectrum could be approximated as the linear combination of only two principal vectors. One point on a plane could be associated with each spectrum, where the coordinates of the point are the coefficients of the linear combination. Points belonging to the same lipid constituents and experimental conditions form a curve on the plane, where each point belongs to a different mole fraction. The location and shape of the curve reflect the lateral organization of the fluorescent lipid mixed with a specific nonfluorescent lipid. The method provides massive data compression that preserves and emphasizes key information pertaining to lipid distribution in different lipid monolayer phases. Collectively, the capacity of PCA for handling large spectral data sets, the nanoscale resolution afforded by the fluorescence signal, and the inherent versatility of monolayers for characterization of lipid lateral interactions enable significantly enhanced resolution of lipid lateral organizational changes induced by different lipid compositions.
Energy Technology Data Exchange (ETDEWEB)
Igwenagu, C.M. [Enugu State University of Science and Technology (Nigeria). Dept. of Industrial Mathematics, Applied Statistics and Demography
2011-07-01
This study examined the position of Nigeria in relation to carbon dioxide (CO2) emission in readiness for emission trading, as proposed in the Kyoto protocol as a measure for reducing global warming. It was found that Nigeria emits only 0.4% of the world's total CO2 emission, indicating that it will be a possible seller of emissions as contained in the Kyoto protocol. Fifty countries were selected for the analysis and some possible correlates of CO2 were considered. Correlation analysis and principal component analysis revealed that gross domestic product and industrial output accounted for 93% of the total variation. On this basis, the country is experiencing very low economic activity.
Silveira, Landulfo, Jr.; Silveira, Fabrício L.; Bodanese, Benito; Pacheco, Marcos Tadeu T.; Zângaro, Renato A.
2012-02-01
This work demonstrated the discrimination between basal cell carcinoma (BCC) and normal human skin in vivo using near-infrared Raman spectroscopy. Spectra were obtained from the suspected lesion prior to resectional surgery. After tissue removal, biopsy fragments were submitted to histopathology. Spectra were also obtained from the adjacent, clinically normal skin. Raman spectra were measured using a Raman spectrometer (830 nm) with a fiber Raman probe. Comparing the mean spectra of BCC with those of normal skin revealed important differences in the 800-1000 cm^-1 and 1250-1350 cm^-1 regions (vibrations of C-C and amide III, respectively, from lipids and proteins). A discrimination algorithm based on Principal Components Analysis and Mahalanobis distance (PCA/MD) could discriminate the spectra of both tissues with high sensitivity and specificity.
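A PCA/Mahalanobis-distance discrimination of the kind described above might be sketched as follows. The spectra are synthetic stand-ins (two Gaussian bands plus noise), and the number of retained components and all function names are assumptions made for illustration, not details from the study.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the mean spectrum and the leading principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_scores(X, mean, components):
    """Project spectra onto the retained principal directions."""
    return (X - mean) @ components.T


def mahalanobis_sq(z, class_scores):
    """Squared Mahalanobis distance of each row of z to one class in score space."""
    mu = class_scores.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(class_scores, rowvar=False))
    d = z - mu
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)


# Synthetic spectra (assumption): the two classes differ in the height of one band.
rng = np.random.default_rng(1)
shift = np.linspace(0.0, 1.0, 200)
band_common = np.exp(-((shift - 0.3) ** 2) / 0.01)
band_marker = np.exp(-((shift - 0.7) ** 2) / 0.005)


def make_spectra(n, marker_height):
    noise = 0.05 * rng.normal(size=(n, shift.size))
    return band_common + marker_height * band_marker + noise


normal_train, lesion_train = make_spectra(40, 0.2), make_spectra(40, 0.6)
mean, components = fit_pca(np.vstack([normal_train, lesion_train]), n_components=3)
z_normal = pca_scores(normal_train, mean, components)
z_lesion = pca_scores(lesion_train, mean, components)

test_spectra = np.vstack([make_spectra(20, 0.2), make_spectra(20, 0.6)])
test_labels = np.array([0] * 20 + [1] * 20)
z_test = pca_scores(test_spectra, mean, components)

# Assign each test spectrum to the class with the smaller Mahalanobis distance.
predicted = (mahalanobis_sq(z_test, z_lesion) < mahalanobis_sq(z_test, z_normal)).astype(int)
accuracy = float((predicted == test_labels).mean())
print(accuracy)
```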
Directory of Open Access Journals (Sweden)
Renfu Jia
2016-01-01
This paper introduces an integrated approach to identify the major factors influencing the efficiency of irrigation water use in China. It combines multiple stepwise regression (MSR) and principal component analysis (PCA) to obtain more realistic results. In real-world case studies, the classical linear regression model often involves too many explanatory variables, and the linear correlation among variables cannot be eliminated. Linearly correlated variables invalidate the results of the factor analysis. To overcome this issue and reduce the number of variables, the PCA technique has been used in combination with MSR. The irrigation water use status in China was analyzed in this way to find the five major factors that have significant impacts on irrigation water use efficiency. To illustrate the performance of the proposed approach, a calculation based on real data was conducted and the results are shown in this paper.
International Nuclear Information System (INIS)
Yang Hong-Xing; Fu Hong-Bo; Wang Hua-Dong; Jia Jun-Wei; Dong Feng-Zhong; Sigrist, Markus W
2016-01-01
Laser-induced breakdown spectroscopy (LIBS) is a versatile tool for both qualitative and quantitative analysis. In this paper, LIBS combined with principal component analysis (PCA) and support vector machine (SVM) is applied to rock analysis. Fourteen emission lines including Fe, Mg, Ca, Al, Si, and Ti are selected as analysis lines. A good accuracy (91.38% for the real rock) is achieved by using SVM to analyze the spectroscopic peak area data processed by PCA. Combining PCA and SVM not only reduces the noise and dimensionality, which improves the efficiency of the program, but also solves the problem of linear inseparability. By this method, the ability of LIBS to classify rock is validated.
Directory of Open Access Journals (Sweden)
Septa Firmansyah Putra
2017-01-01
This study aims to determine which attributes to use for clustering the provinces of Indonesia according to their disaster preparedness. The data comprise three groups: the number of disaster events (19 sub-attributes), the number of health facilities (14 sub-attributes), and the number of health workers (11 sub-attributes). The study can serve as an illustration of how to clean and select data before they are used in a clustering process. Data cleaning and selection were carried out with the aid of PCA (Principal Component Analysis), after an initial manual cleaning. The study was divided into three experiments. The first experiment yielded 31 sub-attributes ready for use, the second 29, and the third 24.
Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue
2017-08-01
Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples using simple sequence repeat (SSR) markers. Cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) of 10 individuals; Cluster I, the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of the cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.
Directory of Open Access Journals (Sweden)
Othman Nasri
2015-01-01
This paper presents a fault detection and isolation (FDI) approach to detect and isolate actuator (thruster and reaction wheel) faults of an autonomous spacecraft involved in the rendez-vous phase of the Mars Sample Return (MSR) mission. Principal component analysis (PCA) has been adopted to estimate the relationships between the various variables of the process. To ensure the feasibility of the proposed FDI approach, a set of data provided by the industrial “high-fidelity” simulator of the MSR and representing the opening (resp., the rotation rates) of the spacecraft thrusters (resp., reaction wheels) has been considered. The test results demonstrate that fault detection and isolation are successfully accomplished.
Directory of Open Access Journals (Sweden)
Shaukat S. Shahid
2016-06-01
In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix of water quality variables (p = 22) from a small data set comprising 55 samples (stations from which water samples were collected). Because data sets in ecology and the environmental sciences are invariably small owing to the high cost of collecting and analysing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared by agglomerative cluster analysis using Ward’s method, which does not require any stringent distributional assumptions.
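The resampling design described above (100 bootstrap samples at each of four sample sizes from a 55 × 22 matrix) can be sketched roughly as follows. The data matrix here is synthetic, and running PCA on the correlation matrix is an assumption; the abstract does not say whether the correlation or covariance matrix was used.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, sorted in descending order."""
    corr = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def bootstrap_eigenvalues(X, sample_size, n_boot, rng):
    """Eigenvalue spectra of n_boot bootstrap samples, each with sample_size rows."""
    spectra = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        rows = rng.integers(0, X.shape[0], size=sample_size)  # sample with replacement
        spectra[b] = pca_eigenvalues(X[rows])
    return spectra


# Synthetic stand-in (assumption) for the 55-station, 22-variable data matrix.
rng = np.random.default_rng(2)
latent = rng.normal(size=(55, 3))
loadings = rng.normal(size=(3, 22))
data = latent @ loadings + rng.normal(size=(55, 22))

results = {n: bootstrap_eigenvalues(data, n, n_boot=100, rng=rng) for n in (20, 30, 40, 50)}
for n, spectra in results.items():
    # Mean and spread of the first eigenvalue at each sample size.
    print(n, spectra[:, 0].mean().round(2), spectra[:, 0].std().round(2))
```

Because the eigenvalues of a correlation matrix always sum to the number of variables, the spread across bootstrap samples is what carries the information about sample-size sensitivity.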
Liu, Ming; Zhao, Jing; Lu, XiaoZuo; Li, Gang; Wu, Taixia; Zhang, LiFu
2018-05-10
Noninvasive in vivo determination of blood hyperviscosity by spectral methods has great potential and is meaningful for clinical diagnosis. In this study, 67 male subjects (41 healthy and 26 with hyperviscosity according to blood sample analysis results) participated. Reflectance spectra of the subjects' tongue tips were measured, and a classification method based on principal component analysis combined with an artificial neural network model was built to identify hyperviscosity. Hold-out and leave-one-out methods, which are widely accepted in model validation, were used to avoid significant bias and lessen the overfitting problem. To measure the performance of the classification, sensitivity, specificity, accuracy and F-measure were calculated. The accuracies with 100 repetitions of the hold-out method and 67 repetitions of the leave-one-out method were 88.05% and 97.01%, respectively. Experimental results indicate that the classification model has practical value and demonstrate the feasibility of using spectroscopy to identify hyperviscosity noninvasively.
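The four performance measures and the leave-one-out scheme mentioned above can be illustrated with a small sketch. A nearest-centroid rule stands in for the PCA and neural network model, and the feature data are synthetic; both are assumptions made only to keep the example self-contained.

```python
import numpy as np


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and F-measure for binary labels (1 = positive)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, accuracy, f_measure


def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier (assumption): assign each sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]


def leave_one_out(X, y, fit_predict):
    """Predict each sample from a model trained on all the other samples."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = fit_predict(X[keep], y[keep], X[i : i + 1])[0]
    return preds


# Hand-checkable example of the metrics: TP = 3, FN = 1, TN = 3, FP = 1.
toy_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
toy_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1])
toy_metrics = classification_metrics(toy_true, toy_pred)

# Synthetic cohort (assumption) with the same class sizes as the study: 41 vs 26.
rng = np.random.default_rng(3)
y = np.array([0] * 41 + [1] * 26)
X = rng.normal(size=(67, 5))
X[y == 1] += 3.0
loo_pred = leave_one_out(X, y, nearest_centroid)
loo_metrics = classification_metrics(y, loo_pred)
print(toy_metrics, loo_metrics)
```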
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with macular edema and age-related macular degeneration), which demonstrated its effectiveness.
International Nuclear Information System (INIS)
Critto, Andrea; Carlon, Claudio; Marcomini, Antonio
2003-01-01
Information on soil and groundwater contamination was used to develop a site conceptual model and to identify exposure scenarios. - The characterization of a hydrologically complex contaminated site bordering the lagoon of Venice (Italy) was undertaken by investigating soils and groundwaters affected by the chemical contaminants originating from the wastes dumped into an illegal landfill. Statistical tools such as principal components analysis and geostatistical techniques were applied to obtain the spatial distribution of chemical contaminants. Dissolved organic carbon (DOC), SO4^2- and Cl^- were used to trace the migration of the contaminants from the top soil to the underlying groundwaters. The available chemical and hydrogeological information was assembled to obtain a schematic conceptual model of the contaminated site capable of supporting the formulation of major exposure scenarios, which are also provided.
International Nuclear Information System (INIS)
Burns, W.A.; Mankiewicz, P.J.; Bence, A.E.; Page, D.S.; Parker, K.R.
1997-01-01
A method was developed to allocate polycyclic aromatic hydrocarbons (PAHs) in sediment samples to the PAH sources from which they came. The method uses principal-component analysis to identify possible sources and a least-squares model to find the source mix that gives the best fit of 36 PAH analytes in each sample. The method identified 18 possible PAH sources in a large set of field data collected in Prince William Sound, Alaska, USA, after the 1989 Exxon Valdez oil spill, including diesel oil, diesel soot, spilled crude oil in various weathering states, natural background, creosote, and combustion products from human activities and forest fires. Spill oil was generally found to be a small increment of the natural background in subtidal sediments, whereas combustion products were often the predominant sources for subtidal PAHs near sites of past or present human activity. The method appears to be applicable to other situations, including other spills
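The least-squares allocation step described above can be sketched as a small linear problem: given source profiles over 36 analytes, find the source mix that best reproduces a sample. The profiles and the mix below are synthetic, and the sketch uses unconstrained least squares; any non-negativity or other constraints used in the actual method are not stated in the abstract and are not reproduced here.

```python
import numpy as np


def allocate_sources(sample, profiles):
    """Least-squares source contributions for one sample.

    sample:   (n_analytes,) measured analyte concentrations
    profiles: (n_analytes, n_sources) analyte fingerprint of each candidate source
    """
    contributions, *_ = np.linalg.lstsq(profiles, sample, rcond=None)
    return contributions


# Synthetic example (assumption): 36 analytes, 3 hypothetical sources.
rng = np.random.default_rng(4)
profiles = rng.random((36, 3))
profiles /= profiles.sum(axis=0)  # normalise each source fingerprint to unit sum

true_mix = np.array([0.6, 0.3, 0.1])
sample = profiles @ true_mix      # noise-free mixture of the three sources

estimated_mix = allocate_sources(sample, profiles)
residual = np.linalg.norm(profiles @ estimated_mix - sample)
print(estimated_mix.round(3), residual)
```

With real, noisy data an unconstrained fit can return small negative contributions, which is one reason a constrained solver is often preferred in practice.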
Directory of Open Access Journals (Sweden)
Marcelo A Lima
The year 2007 was marked by widespread adverse clinical responses to heparin use, leading to a global recall of potentially affected heparin batches in 2008. Several analytical methods have since been developed to detect impurities in heparin preparations; however, many are costly and dependent on instrumentation with only limited accessibility. A method based on a simple UV-scanning assay, combined with principal component analysis (PCA), was developed to detect impurities, such as glycosaminoglycans, other complex polysaccharides and aromatic compounds, in heparin preparations. Results were confirmed by NMR spectroscopy. This approach provides an additional, sensitive tool to determine heparin purity and safety, even when NMR spectroscopy failed, requiring only standard laboratory equipment and computing facilities.
International Nuclear Information System (INIS)
Gudiksen, P.H.; Walton, J.J.; Alpert, D.J.; Johnson, J.D.
1982-01-01
This work explores the use of principal components analysis coupled to three-dimensional atmospheric transport and dispersion models for evaluating the environmental consequences of reactor accidents. This permits the inclusion of meteorological data from multiple sites and the effects of topography in the consequence evaluation; features not normally included in such analyses. The technique identifies prevailing regional wind patterns and their frequencies for use in the transport and dispersion calculations. Analysis of a hypothetical accident scenario involving a release of radioactivity from a reactor situated in a river valley indicated the technique is quite useful whenever recurring wind patterns exist, as is often the case in complex terrain situations. Considerable differences were revealed in a comparison with results obtained from a more conventional Gaussian plume model using only the reactor site meteorology and no topographic effects
International Nuclear Information System (INIS)
Reid, M.K.; Spencer, K.L.
2009-01-01
Principal components analysis (PCA) is a multivariate statistical technique capable of discerning patterns in large environmental datasets. Although widely used, there is disparity in the literature with respect to data pre-treatment prior to PCA. This research examines the influence of commonly reported data pre-treatment methods on PCA outputs, and hence data interpretation, using a typical environmental dataset comprising sediment geochemical data from an estuary in SE England. This study demonstrated that applying the routinely used log (x + 1) transformation skewed the data and masked important trends. Removing outlying samples and correcting for the influence of grain size had the most significant effect on PCA outputs and data interpretation. Reducing the influence of grain size using granulometric normalisation meant that other factors affecting metal variability, including mineralogy, anthropogenic sources and distance along the salinity transect could be identified and interpreted more clearly. - Data pre-treatment can have a significant influence on the outcome of PCA.
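How strongly pre-treatment can change PCA output is easy to see on a toy example. The sketch below compares the share of variance carried by the first component for raw, log(x + 1)-transformed and z-scored data. The data are synthetic independent variables on very different scales; the variable scales and the choice of pre-treatments are illustrative assumptions and do not reproduce the sediment data set or the granulometric normalisation of the study.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component of X."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values**2
    return variances / variances.sum()


# Synthetic "concentrations" (assumption): four independent variables whose
# magnitudes differ by orders of magnitude, as trace and major elements often do.
rng = np.random.default_rng(5)
scales = np.array([1.0, 10.0, 100.0, 1000.0])
raw = rng.lognormal(mean=0.0, sigma=0.5, size=(60, 4)) * scales

logged = np.log(raw + 1.0)                                          # log(x + 1) transform
standardized = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # z-scores

ratio_raw = explained_variance_ratio(raw)
ratio_logged = explained_variance_ratio(logged)
ratio_standardized = explained_variance_ratio(standardized)

# On raw data PC1 is dominated by the largest-scale variable; after z-scoring
# the variance is spread across components because the variables are independent.
print(ratio_raw.round(3), ratio_logged.round(3), ratio_standardized.round(3))
```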
DEFF Research Database (Denmark)
Hajati, S.; Walton, J.; Tougaard, S.
2013-01-01
In a previous article, we studied the influence of spectral noise on a new method for three-dimensional X-ray photoelectron spectroscopy (3D XPS) imaging, which is based on analysis of the XPS peak shape [Hajati, S., Tougaard, S., Walton, J. & Fairley, N. (2008). Surf Sci 602, 3064-3070]. Here, we...... study in more detail the influence of noise reduction by principal component analysis (PCA) on 3D XPS images of carbon contamination of a patterned oxidized silicon sample and on 3D XPS images of Ag covered by a nanoscale patterned octadiene layer. PCA is very efficient for noise reduction, and using...... acquisition time. A small additional amount of information is obtained by using up to five PCA factors, but due to the increased noise level, this information can only be extracted if the intensity of the start and end points for each spectrum are obtained as averages over several energy points....
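PCA noise reduction of a spectral image, as discussed above, amounts to reconstructing every spectrum from a small number of PCA factors. The following sketch uses synthetic two-component spectra; the peak shapes, noise level and the choice of two factors are assumptions for illustration and are unrelated to the XPS data of the study.

```python
import numpy as np


def pca_denoise(spectra, n_factors):
    """Reconstruct each spectrum (row) from the leading n_factors PCA factors."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean + (u[:, :n_factors] * s[:n_factors]) @ vt[:n_factors]


# Synthetic image (assumption): 400 pixels, each a mix of two peak shapes.
rng = np.random.default_rng(6)
energy = np.linspace(0.0, 1.0, 120)
peak_a = np.exp(-((energy - 0.35) ** 2) / 0.004)
peak_b = np.exp(-((energy - 0.65) ** 2) / 0.004)
weights = rng.random((400, 2))
clean = weights @ np.vstack([peak_a, peak_b])
noisy = clean + 0.2 * rng.normal(size=clean.shape)

denoised = pca_denoise(noisy, n_factors=2)

rmse_noisy = float(np.sqrt(np.mean((noisy - clean) ** 2)))
rmse_denoised = float(np.sqrt(np.mean((denoised - clean) ** 2)))
print(rmse_noisy, rmse_denoised)
```

Keeping more factors than the data support reintroduces noise into the reconstruction, which is the trade-off the abstract alludes to when it mentions using up to five factors.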
International Nuclear Information System (INIS)
Cossement, Damien; Renaux, Fabian; Thiry, Damien; Ligot, Sylvie; Francq, Rémy; Snyders, Rony
2015-01-01
Highlights: • Plasma polymer films have a chemical selectivity and a cross-linking degree which are known to vary in opposite trends. • Three plasma polymer families were used as model organic layers for cross-linking evaluation by ToF-SIMS and principal component analysis. • The data were cross-checked with related functional properties that are known to depend on the cross-linking degree (stability in solvent, mechanical properties, …). • The suggested cross-linking evaluation method was validated for different families of plasma polymers, demonstrating that it can be seen as a “general” method. Abstract: It is accepted that the macroscopic properties of functional plasma polymer films (PPF) are defined by their functional density and their cross-linking degree (χ), quantities that most of the time follow opposite trends. While the PPF chemistry is relatively easy to evaluate, χ is much more challenging. This paper reviews the recent work developed in our group on the application of principal component analysis (PCA) to time-of-flight secondary ion mass spectrometric (ToF-SIMS) positive spectra data in order to extract the relative cross-linking degree (χ) of PPF. NH2-, COOR- and SH-containing PPF synthesized in our group by plasma enhanced chemical vapor deposition (PECVD), varying the applied radiofrequency power (P_RF), have been used as model surfaces. For the three plasma polymer families, the scores of the first computed principal component (PC1) highlighted significant differences in the chemical composition supported by X-ray photoelectron spectroscopy (XPS) data. The most important fragments contributing to PC1 (loadings > 90%) were used to compute an average C/H ratio index for samples synthesized at low and high P_RF. This ratio being an evaluation of χ, these data, in accordance with the literature, indicate an increase of χ with P_RF except for the SH-PPF. These results have been cross
Directory of Open Access Journals (Sweden)
Mihail Lucian Birsa
2011-10-01
In this paper we present several expert systems that predict the class identity of the modeled compounds, based on a preprocessed spectral database. The expert systems were built using Artificial Neural Networks (ANN) and are designed to predict if an unknown compound has the toxicological activity of amphetamines (stimulant and hallucinogen), or whether it is a nonamphetamine. In attempts to circumvent the laws controlling drugs of abuse, new chemical structures are very frequently introduced on the black market. They are obtained by slightly modifying the controlled molecular structures by adding or changing substituents at various positions on the banned molecules. As a result, no substance similar to those forming a prohibited class may be used nowadays, even if it has not been specifically listed. Therefore, reliable, fast and accessible systems capable of modeling and then identifying similarities at the molecular level are highly needed for epidemiological, clinical, and forensic purposes. In order to obtain the expert systems, we have preprocessed a concatenated spectral database, representing the GC-FTIR (gas chromatography-Fourier transform infrared spectrometry) and GC-MS (gas chromatography-mass spectrometry) spectra of 103 forensic compounds. The database was used as input for a Principal Component Analysis (PCA). The scores of the forensic compounds on the main principal components (PCs) were then used as inputs for the ANN systems. We have built eight PC-ANN systems (principal component analysis coupled with artificial neural network) with a different number of input variables: 15 PCs, 16 PCs, 17 PCs, 18 PCs, 19 PCs, 20 PCs, 21 PCs and 22 PCs. The best expert system was found to be the ANN network built with 18 PCs, which accounts for an explained variance of 77%. This expert system has the best sensitivity (a rate of classification C = 100% and a rate of true positives TP = 100%), as well as a good selectivity (a rate of true negatives TN
Energy Technology Data Exchange (ETDEWEB)
Cossement, Damien, E-mail: damien.cossement@materianova.be [Materia Nova Research Center, Parc Initialis, 1, Avenue Nicolas Copernic, B-7000 Mons (Belgium); Renaux, Fabian [Materia Nova Research Center, Parc Initialis, 1, Avenue Nicolas Copernic, B-7000 Mons (Belgium); Thiry, Damien; Ligot, Sylvie [Chimie des Interactions Plasma-Surface (ChIPS), CIRMAP, Université de Mons, 23 Place du Parc, B-7000 Mons (Belgium); Francq, Rémy; Snyders, Rony [Materia Nova Research Center, Parc Initialis, 1, Avenue Nicolas Copernic, B-7000 Mons (Belgium); Chimie des Interactions Plasma-Surface (ChIPS), CIRMAP, Université de Mons, 23 Place du Parc, B-7000 Mons (Belgium)
2015-11-15
Highlights: • Plasma polymer films have a chemical selectivity and a cross-linking degree which are known to vary in opposite trends. • Three plasma polymer families were used as model organic layers for cross-linking evaluation by ToF-SIMS and principal component analysis. • The data were cross-checked with related functional properties that are known to depend on the cross-linking degree (stability in solvent, mechanical properties, …). • The suggested cross-linking evaluation method was validated for different families of plasma polymers, demonstrating that it can be seen as a “general” method. Abstract: It is accepted that the macroscopic properties of functional plasma polymer films (PPF) are defined by their functional density and their cross-linking degree (χ), quantities that most of the time follow opposite trends. While the PPF chemistry is relatively easy to evaluate, χ is much more challenging. This paper reviews the recent work developed in our group on the application of principal component analysis (PCA) to time-of-flight secondary ion mass spectrometric (ToF-SIMS) positive spectra data in order to extract the relative cross-linking degree (χ) of PPF. NH2-, COOR- and SH-containing PPF synthesized in our group by plasma enhanced chemical vapor deposition (PECVD), varying the applied radiofrequency power (P_RF), have been used as model surfaces. For the three plasma polymer families, the scores of the first computed principal component (PC1) highlighted significant differences in the chemical composition supported by X-ray photoelectron spectroscopy (XPS) data. The most important fragments contributing to PC1 (loadings > 90%) were used to compute an average C/H ratio index for samples synthesized at low and high P_RF. This ratio being an evaluation of χ, these data, in accordance with the literature, indicate an increase of χ with P_RF except for the SH-PPF. These results have
Cossement, Damien; Renaux, Fabian; Thiry, Damien; Ligot, Sylvie; Francq, Rémy; Snyders, Rony
2015-11-01
It is accepted that the macroscopic properties of functional plasma polymer films (PPF) are defined by their functional density and their cross-linking degree (χ), quantities that most of the time follow opposite trends. While the PPF chemistry is relatively easy to evaluate, χ is much more challenging. This paper reviews the recent work developed in our group on the application of principal component analysis (PCA) to time-of-flight secondary ion mass spectrometric (ToF-SIMS) positive spectra data in order to extract the relative cross-linking degree (χ) of PPF. NH2-, COOR- and SH-containing PPF synthesized in our group by plasma enhanced chemical vapor deposition (PECVD), varying the applied radiofrequency power (P_RF), have been used as model surfaces. For the three plasma polymer families, the scores of the first computed principal component (PC1) highlighted significant differences in the chemical composition supported by X-ray photoelectron spectroscopy (XPS) data. The most important fragments contributing to PC1 (loadings > 90%) were used to compute an average C/H ratio index for samples synthesized at low and high P_RF. This ratio being an evaluation of χ, these data, in accordance with the literature, indicate an increase of χ with P_RF except for the SH-PPF. These results have been cross-checked by evaluating functional properties of the plasma polymers, namely a linear correlation with the stability of NH2-PPF in ethanol and a correlation with the mechanical properties of the COOR-PPF. For the SH-PPF family, the peculiar evolution of χ is supported by the understanding of the growth mechanism of the PPF from plasma diagnostics. The whole set of data clearly demonstrates the potential of the PCA method for extracting information on the microstructure of plasma polymers from ToF-SIMS measurements.
Iqbal, Abdullah; Valous, Nektarios A; Sun, Da-Wen; Allen, Paul
2011-02-01
Lacunarity quantifies the degree of spatial heterogeneity in the visual texture of imagery by identifying the relationships between patterns and their spatial configurations in a two-dimensional setting. The computed lacunarity data can designate a mathematical index of spatial heterogeneity; the corresponding feature vectors should therefore possess the inter-class statistical properties needed for pattern recognition purposes. The objective of this study is to construct a supervised parsimonious classification model of binary lacunarity data (computed by Valous et al., 2009) from pork ham slice surface images, with the aid of kernel principal component analysis (KPCA) and artificial neural networks (ANNs), using a portion of informative salient features. At first, the dimension of the initial space (510 features) was reduced by 90% in order to avoid any noise effects in the subsequent classification. Then, using KPCA, the first nineteen kernel principal components (99.04% of total variance) were extracted from the reduced feature space and used as input to the ANN. An adaptive feedforward multilayer perceptron (MLP) classifier was employed to obtain a suitable mapping from the input dataset. The correct classification percentages for the training, test and validation sets were 86.7%, 86.7%, and 85.0%, respectively. The results confirm that the classification performance was satisfactory. The binary lacunarity spatial metric captured relevant information that provided a good level of differentiation among pork ham slice images.
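The KPCA feature-extraction step mentioned above can be sketched in a few lines of NumPy. The sketch assumes a Gaussian (RBF) kernel, a hypothetical kernel width, and random stand-in features; the abstract does not state which kernel or parameters were used, so all of these are illustrative.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_pca(X, n_components, gamma):
    """Project the training samples onto the leading kernel principal components."""
    K = rbf_kernel(X, gamma)
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones  # centre in feature space
    eigvals, eigvecs = np.linalg.eigh(K_centered)            # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]       # make it descending
    total = np.sum(np.maximum(eigvals, 0.0))
    top = np.maximum(eigvals[:n_components], 0.0)
    projections = eigvecs[:, :n_components] * np.sqrt(top)   # training-sample scores
    return projections, top, top / total


# Random stand-in features (assumption); the study used lacunarity features.
rng = np.random.default_rng(7)
features = rng.normal(size=(50, 6))
projections, eigenvalues, variance_share = kernel_pca(features, n_components=5, gamma=0.1)
print(projections.shape, variance_share.sum().round(3))
```

The resulting score matrix is what would be passed on to a classifier such as the MLP described in the abstract.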
Directory of Open Access Journals (Sweden)
Miaomiao Jiang
Botanical primary metabolites are widespread in herbal medicine injections (HMIs), but their control has often been ignored. Given its limitation with hydrophilic substances, the routinely applied reversed-phase chromatographic fingerprint technology usually has difficulty detecting primary metabolites with strong polarity, such as saccharides, amino acids and organic acids. In this study, a proton nuclear magnetic resonance (1H NMR) profiling method was developed for efficient identification and quantification of small polar molecules, mostly primary metabolites, in HMIs. A commonly used medicine, Danhong injection (DHI), was employed as a model. With the developed method, 23 primary metabolites together with 7 polyphenolic acids were simultaneously identified, of which 13 metabolites with fully separated proton signals were quantified and employed for a further multivariate quality control assay. The quantitative 1H NMR method was validated with good linearity, precision, repeatability, stability and accuracy. Based on independence principal component analysis (IPCA), the contents of 13 metabolites were characterized and dimensionally reduced into the first two independence principal components (IPCs). IPC1 and IPC2 were then used to calculate the upper control limits (with 99% confidence ellipsoids) of χ2 and Hotelling T2 control charts. Through the constructed upper control limits, the proposed method was successfully applied to 36 batches of DHI to detect the out-of-control sample with perturbed levels of succinate, malonate, glucose, fructose, salvianic acid and protocatechuic aldehyde. The integrated strategy has provided a reliable approach to identify and quantify multiple polar metabolites of DHI in one fingerprinting spectrum, and it has also assisted in the establishment of IPCA models for the multivariate statistical evaluation of HMIs.
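The control-limit step in the abstract above can be sketched for the χ2 chart, where two retained components give a closed-form 99% limit. This is a minimal sketch assuming uncorrelated, zero-mean component scores with known variances; the batch names, scores and variances are invented, and the Hotelling T2 limit (which needs an F quantile) is not shown.

```python
import math

def chi2_ucl_2dof(confidence=0.99):
    """Upper control limit of a chi-square chart with 2 degrees of freedom.

    For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
    so the quantile has the closed form -2 * ln(1 - confidence).
    """
    return -2.0 * math.log(1.0 - confidence)

def chi2_statistic(score1, score2, var1, var2):
    """Scaled squared distance of one batch from the centre of the two-component plane."""
    return score1 ** 2 / var1 + score2 ** 2 / var2

ucl = chi2_ucl_2dof(0.99)

# Hypothetical (IPC1, IPC2) scores for three batches; unit variances assumed
batches = {"batch_01": (0.4, -0.9), "batch_02": (1.1, 0.3), "batch_03": (3.5, 2.8)}
var1, var2 = 1.0, 1.0
flagged = [name for name, (s1, s2) in batches.items()
           if chi2_statistic(s1, s2, var1, var2) > ucl]
```

Here `ucl` is about 9.21, and only the third hypothetical batch falls outside the 99% ellipse.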
Directory of Open Access Journals (Sweden)
Tomáš Fekete
2016-05-01
The subject of this work was to examine differences in the chemical composition of sliced and whole stick Nitran salamis purchased from various manufacturers. Nitran salamis are traditional dry fermented meat products of Slovak origin. Taking into account variations in raw materials, production process and potential adulteration, differences in chemical composition within one brand of salami from different manufacturers might be expected. Ten salamis were analysed for basic chemical composition attributes, and Principal Component Analysis was applied to the data matrix to identify anomalous ones. It has been shown that six attributes, namely protein without collagen of total protein, total protein, total meat, total fat, collagen of total protein and NaCl, were the most important for the salamis, as the first two Principal Components together explained 70.16% of the variance among them. Nitran D was found to be the most anomalous salami, as it had the lowest values of protein without collagen of total protein (14.14% ±0.26%), total protein (17.42% ±0.44%) and total meat (120.29% ±0.98%), and the highest values of total fat (50.85% ±0.95%), collagen of total protein (18.83% ±0.50%) and NaCl (9.55% ±1.93%), when compared to its whole stick variant Nitran C and other samples. In addition, with regard to collagen of total protein content, Nitran D together with Nitran A, F and H did not satisfy the legislatively determined criterion, which is ≤16%. This suggested that extra connective tissues were added to intermediate products, which resulted in high variability and inferior quality of final products. It is a common practice in the meat industry to increase the protein content or water binding properties of meat products.
Directory of Open Access Journals (Sweden)
Delong Feng
2016-05-01
Remaining useful life estimation, part of prognostics and health management, is a complicated and difficult research question for maintenance. In this article, we consider the problem of prognostics modeling and estimation of the turbofan engine under complicated circumstances and propose a kernel principal component analysis–based degradation model and remaining useful life estimation method for such aircraft engines. We first use the kernel principal component analysis method to analyze the output data created by the turbofan engine thermodynamic simulation and then distinguish the qualitative and quantitative relationships between the key factors. Next, we build a degradation model for the engine fault based on the following assumptions: the engine has only constant failure (i.e., no sudden failure is included), and the engine follows a Wiener process with a covariate that stands for the engine system drift. To predict the remaining useful life of the turbofan engine, we built a health index based on the degradation model and used the method of maximum likelihood and the data from the thermodynamic simulation model to estimate the parameters of this degradation model. Through the data analysis, we obtained a trend model of the regression curve that fits the actual statistical data. Based on the predicted health index model and the data trend model, we estimate the remaining useful life of the aircraft engine as the time at which the index reaches zero. Finally, a case study involving engine simulation data demonstrates the precision and performance advantages of the proposed method: the precision can reach 98.9% and the average precision is 95.8%.
International Nuclear Information System (INIS)
Soehn, Matthias; Alber, Markus; Yan Di
2007-01-01
Purpose: The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. Methods and Materials: PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as 'eigenmodes,' which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Results: Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe ∼94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses (∼40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. Conclusions: PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches
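To make the eigenmode idea in the abstract above concrete, the sketch below runs PCA on a set of synthetic cumulative DVHs using an SVD of the mean-centred data. The dose grid, cohort size and sigmoid-shaped DVHs are assumptions for illustration and do not reproduce the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
dose_bins = np.linspace(0.0, 80.0, 81)   # Gy, hypothetical dose grid
n_patients = 50                          # hypothetical cohort size

# Synthetic cumulative DVHs: sigmoid fall-off with patient-specific midpoint and slope
midpoint = rng.normal(45.0, 8.0, size=n_patients)
slope = rng.normal(0.15, 0.03, size=n_patients)
dvh = 1.0 / (1.0 + np.exp(slope[:, None] * (dose_bins[None, :] - midpoint[:, None])))

# PCA via SVD of the mean-centred DVH matrix (rows = patients, columns = dose bins)
mean_dvh = dvh.mean(axis=0)
centred = dvh - mean_dvh
U, S, Vt = np.linalg.svd(centred, full_matrices=False)

eigenmodes = Vt                          # each row is one eigenmode over the dose grid
pcs = centred @ Vt.T                     # principal components (scores) per patient
explained = S ** 2 / np.sum(S ** 2)      # fraction of shape variability per mode

# Approximate every DVH from the mean DVH plus its first three eigenmodes
approx = mean_dvh + pcs[:, :3] @ eigenmodes[:3]
```

The leading columns of `pcs` are the per-patient parameters that could then enter a logistic regression against a toxicity endpoint.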
Directory of Open Access Journals (Sweden)
Birifdzi Zimisuhara
2015-06-01
Genetic structure and biodiversity of the medicinal plant Ficus deltoidea have rarely been scrutinized. To fill these lacunae, five varieties, consisting of 30 F. deltoidea accessions, were collected across the country and studied on the basis of molecular and morphological data. Molecular analysis of the accessions was performed using nine Inter Simple Sequence Repeat (ISSR) markers, seven of which were detected as polymorphic markers. ISSR-based clustering generated four clusters supporting the geographical distribution of the accessions to some extent. The Jaccard's similarity coefficient implied the existence of low diversity (0.50–0.75) in the studied population. STRUCTURE analysis showed a low differentiation among the sampling sites, while a moderate varietal differentiation was unveiled with two main populations of F. deltoidea. Our observations confirmed the occurrence of gene flow among the accessions; however, the highest degree of this genetic interference was related to the three accessions FDDJ10, FDTT16 and FDKT25. These three accessions may be the genetic intervarietal fusion points of the plant's population. Principal Components Analysis (PCA) relying on quantitative morphological characteristics resulted in two principal components with eigenvalue >1, which made up 89.96% of the total variation. The cluster analysis performed on the eight quantitative characteristics grouped the accessions into four clusters with a Euclidean distance ranging between 0.06 and 1.10. Similarly, a four-cluster dendrogram was generated using qualitative traits. The qualitative characteristics were found to be more discriminating in the cluster and PCA analyses, while ISSRs were more informative on the evolution and genetic structure of the population.
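The Jaccard similarity coefficient mentioned in the abstract above can be computed from binary band profiles as follows. This is a minimal sketch: the accession labels and band scores are invented for illustration and are not data from the study.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient between two binary band profiles (1 = band present)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return shared / either if either else 0.0

# Hypothetical presence/absence scores of eight marker bands in three accessions
bands = {
    "acc_A": [1, 1, 0, 1, 0, 1, 1, 0],
    "acc_B": [1, 0, 0, 1, 0, 1, 1, 1],
    "acc_C": [1, 1, 0, 1, 1, 1, 0, 0],
}
names = sorted(bands)
# Pairwise similarities, the usual input to a clustering dendrogram
similarity = {(p, q): jaccard_similarity(bands[p], bands[q])
              for i, p in enumerate(names) for q in names[i + 1:]}
```

Shared absences are ignored by this coefficient, which is why it is commonly preferred for dominant markers scored as present or absent.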
Generalized principal component analysis
Vidal, René; Sastry, S S
2016-01-01
This book provides a comprehensive introduction to the latest advances in the mathematical theory and computational tools for modeling high-dimensional data drawn from one or multiple low-dimensional subspaces (or manifolds) and potentially corrupted by noise, gross errors, or outliers. This challenging task requires the development of new algebraic, geometric, statistical, and computational methods for efficient and robust estimation and segmentation of one or multiple subspaces. The book also presents interesting real-world applications of these new methods in image processing, image and video segmentation, face recognition and clustering, and hybrid system identification etc. This book is intended to serve as a textbook for graduate students and beginning researchers in data science, machine learning, computer vision, image and signal processing, and systems theory. It contains ample illustrations, examples, and exercises and is made largely self-contained with three Appendices which survey basic concepts ...
Paolini, Enrico; Moretti, Patrizia; Compton, Michael T
2016-09-30
Although delusions represent one of the core symptoms of psychotic disorders, it is remarkable that few studies have investigated distinct delusional themes. We analyzed data from a large sample of first-episode psychosis patients (n=245) to understand relations between delusion types and demographic and clinical correlates. First, we conducted a principal component analysis (PCA) of the 12 delusion items within the Scale for the Assessment of Positive Symptoms (SAPS). Then, using the domains derived via PCA, we tested a priori hypotheses and answered exploratory research questions related to delusional content. PCA revealed five distinct components: Delusions of Influence, Grandiose/Religious Delusions, Paranoid Delusions, Negative Affect Delusions (jealousy, and sin or guilt), and Somatic Delusions. The most prevalent type of delusion was Paranoid Delusions, and such delusions were more common at older ages at onset of psychosis. The level of Delusions of Influence was correlated with the severity of hallucinations and negative symptoms. We ascertained a general relationship between different childhood adversities and delusional themes, and a specific relationship between Somatic Delusions and childhood neglect. Moreover, we found higher scores on Delusions of Influence and Negative Affect Delusions among cannabis and stimulant users. Our results support considering delusions as varied experiences with varying prevalences and correlates. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
DEFF Research Database (Denmark)
Aukland, Preben; Lando, Martin; Vilholm, Ole
2016-01-01
BACKGROUND: The "Status Epilepticus Severity Score" (STESS) is the most important clinical score to predict in-hospital mortality of patients with status epilepticus (SE), but its prognostic relevance for long-term survival is unknown. This study therefore examined if STESS and its components...
Otero, Federico; Norte, Federico; Araneo, Diego
2018-01-01
The aim of this work is to obtain an index for predicting the probability of occurrence of zonda events at surface level from sounding data at Mendoza city, Argentina. To accomplish this goal, surface zonda wind events were first identified with an objective classification method (OCM) considering only the surface station values. Once the dates and the onset time of each event were obtained, the closest prior sounding for each event was used to perform a principal component analysis (PCA) that identifies the leading patterns of the vertical structure of the atmosphere prior to a zonda wind event. These components were used to construct the index model. For the PCA, an input matrix of temperature (T) and dew point temperature (Td) anomalies for the standard levels between 850 and 300 hPa was built. The analysis yielded six significant components explaining 94 % of the variance, and the leading patterns of favorable weather conditions for the development of the phenomenon were obtained. A zonda/non-zonda indicator c can be estimated by a multiple logistic regression on the PCA component loadings, which determines a zonda probability index ĉ that is calculable from T and Td profiles and depends on the climatological features of the region. The index showed 74.7 % efficiency. The same analysis was performed adding surface values of T and Td from Mendoza Aero station, increasing the index efficiency to 87.8 %. The results revealed four significantly correlated PCs with a major improvement in differentiating zonda cases and a reduction of the uncertainty interval.
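The idea of turning principal components into a probability index through logistic regression, as described in the abstract above, can be sketched as below. The PC scores and event labels are synthetic, the regression is fitted on scores by plain gradient descent, and none of the numbers relate to the study's soundings; all of these are assumptions for illustration.

```python
import numpy as np

def fit_logistic(scores, labels, lr=0.5, n_iter=500):
    """Fit logistic-regression weights by gradient descent on the mean log-loss."""
    X = np.column_stack([np.ones(len(scores)), scores])   # prepend intercept column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - labels) / len(labels)
    return w

def event_probability(scores, w):
    """Probability index in [0, 1] computed from PC scores and fitted weights."""
    X = np.column_stack([np.ones(len(scores)), scores])
    return 1.0 / (1.0 + np.exp(-X @ w))

rng = np.random.default_rng(2)
pc_scores = rng.normal(size=(200, 2))      # hypothetical PC scores of 200 soundings
# Synthetic event labels generated from a known linear rule
is_event = (2.0 * pc_scores[:, 0] - pc_scores[:, 1] > 0).astype(float)

weights = fit_logistic(pc_scores, is_event)
index = event_probability(pc_scores, weights)
accuracy = np.mean((index > 0.5) == (is_event == 1.0))
```

In practice the index would be evaluated on soundings not used for fitting; the in-sample `accuracy` here only checks that the sketch recovers the synthetic rule.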
International Nuclear Information System (INIS)
Villalobos Chaves, Alberto E.
2006-01-01
Principal Components Analysis (PCA) was applied to determine the origin of vodka samples. The analytical parameters used were: the alcoholic strength, the difference between the experimental alcoholic strength and that declared on the label, the dry extract, the relative atomic emission intensities of calcium (peak area at 422.67 nm), sodium (sum of peak areas at 588.99 and 589.59 nm) and potassium (sum of peak areas at 766.49 nm and 769.89 nm), and finally the ultraviolet absorbance at 200 nm. K-means clustering was used. The starting hypothesis was that the sample consisted, approximately, of two large natural groupings, namely national vodkas and foreign vodkas. Applying this procedure showed that the components of the sample were indeed distinguishable by national or foreign origin into two groups whose 95 % confidence ellipses did not overlap, even when the variables of alcoholic strength and difference in alcoholic strength were eliminated. (author)
Directory of Open Access Journals (Sweden)
Golzarian Mahmood R
2011-09-01
Wheat is one of the most important crops in Australia, and the identification of young plants is an important step towards developing an automated system for monitoring crop establishment and also for differentiating crop from weeds. In this paper, a framework to differentiate early narrow-leaf wheat from two common weeds from their digital images is developed. A combination of colour, texture and shape features is used. These features are reduced to three descriptors using Principal Component Analysis. The three components provide an effective and significant means for distinguishing the three grasses. Further analysis enables threshold levels to be set for the discrimination of the plant species. The PCA model was evaluated on an independent data set of plants and the results show accuracy of 88% and 85% in the differentiation of ryegrass and brome grass from wheat, respectively. The outcomes of this study can be integrated into new knowledge in developing computer vision systems used in automated weed management.
International Nuclear Information System (INIS)
Araujo, Janeo Severino C. de; Dantas, Carlos Costa; Santos, Valdemir A. dos; Souza, Jose Edson G. de; Luna-Finkler, Christine L.
2009-01-01
The fluid dynamic behavior of the riser of a cold flow model of a Fluid Catalytic Cracking Unit (FCCU) was investigated. The experimental data were obtained by the nuclear technique of gamma transmission. A gamma source was placed diametrically opposite to a detector in any straight section of the riser. The gas-solid flow through the riser was monitored with an Americium-241 source, which provided information on the axial solid concentration without disturbing the flow and also identified the dependence of this concentration profile on several independent variables. The MATLAB® and Statistica® software packages were used. The statistical tool employed was Principal Components Analysis (PCA), which organized the data into two-dimensional matrices in order to extract relevant information about the importance of the independent variables for the axial solid concentration in a cold flow riser. The variables investigated were mass flow rate of solid, mass flow rate of gas, pressure at the riser base and the relative height in the riser. The first two components reached about 98 % of accumulated percentage of explained variance. (author)
International Nuclear Information System (INIS)
Na, Man Gyun; Oh, Seungrohk
2002-01-01
A neuro-fuzzy inference system combined with the wavelet denoising, principal component analysis (PCA), and sequential probability ratio test (SPRT) methods has been developed to monitor the relevant sensor using the information of other sensors. The parameters of the neuro-fuzzy inference system that estimates the relevant sensor signal are optimized by a genetic algorithm and a least-squares algorithm. The wavelet denoising technique was applied to remove noise components in input signals into the neuro-fuzzy system. By reducing the dimension of an input space into the neuro-fuzzy system without losing a significant amount of information, the PCA was used to reduce the time necessary to train the neuro-fuzzy system, simplify the structure of the neuro-fuzzy inference system, and also, make easy the selection of the input signals into the neuro-fuzzy system. By using the residual signals between the estimated signals and the measured signals, the SPRT is applied to detect whether the sensors are degraded or not. The proposed sensor-monitoring algorithm was verified through applications to the pressurizer water level, the pressurizer pressure, and the hot-leg temperature sensors in pressurized water reactors
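The SPRT stage in the abstract above, which decides from residuals whether a sensor is degraded, can be sketched with Wald's test for a mean shift in Gaussian residuals. The error rates `alpha` and `beta`, the noise level, the shift size and the residual values are all assumptions for illustration; the abstract does not give the settings used.

```python
import math

def sprt(residuals, sigma, shift, alpha=0.01, beta=0.1):
    """Wald's SPRT for a mean shift in Gaussian residuals.

    H0: residual mean is 0 (healthy sensor); H1: residual mean is `shift` (degraded).
    Returns (decision, index of the sample at which the test stopped).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this bound
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this bound
    llr = 0.0
    for i, r in enumerate(residuals):
        # Log-likelihood ratio increment for N(shift, sigma^2) versus N(0, sigma^2)
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            return "degraded", i
        if llr <= lower:
            return "normal", i
    return "undecided", len(residuals) - 1

# Hypothetical residuals (measured minus estimated signal), arbitrary units
healthy = [0.05, -0.10, 0.02, -0.03, 0.08, -0.06, 0.01, -0.02]
drifting = [0.45, 0.55, 0.60, 0.48, 0.52, 0.58, 0.50, 0.47]

healthy_decision = sprt(healthy, sigma=0.2, shift=0.5)
drifting_decision = sprt(drifting, sigma=0.2, shift=0.5)
```

A monitoring loop would normally restart the test after each decision so that later degradation is still caught.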
International Nuclear Information System (INIS)
Saveleva, E. I.; Koryagina, N. L.; Radilov, A. S.; Khlebnikova, N. S.; Khrustaleva, V. S.
2007-01-01
A package of chemical analytical procedures was developed for the detection of products indicative of the presence of dumped chemical weapons in the Baltic Sea. The principal requirements imposed upon the procedures were the following: high sensitivity, reliable identification of target compounds, wide range of components covered by survey analysis, and lack of interferences from sea salts. Thiodiglycol, a product of hydrolysis of sulfur mustard reportedly always detected at the sites of dumped chemical weapons in the Baltic Sea, was considered the principal marker. We developed a high-sensitivity procedure for the determination of thiodiglycol in sea water, involving evaporation of samples to dryness in a vacuum concentrator, followed by tert-butyldimethylsilylation of the residue and GCMS analysis in the SIM mode with meta-fluorobenzoic acid as internal reference. The detection limit of thiodiglycol was 0.001 mg/l, and the procedure throughput was up to 30 samples per day. The same procedure, but with BSTFA as derivatizing agent instead of MTBSTFA, was used for preparing samples for survey analysis of nonvolatile components. In this case, full mass spectra were measured in the GCMS analysis. The use of BSTFA was motivated by the fact that trimethylsilyl derivatives are much more widely represented in electronic mass spectral databases. The identification of sulfur mustard, volatile transformation products of sulfur mustard and lewisite, as well as chloroacetophenone in sea water was performed by means of GCMS in combination with SPME. The survey GC-MS analysis was focused on the identification of volatile and nonvolatile toxic chemicals whose mass spectra are included in the OPCW database (3219 toxic chemicals, precursors, and transformation products) with the use of AMDIS software (version 2.62). Using 2 GC-MS instruments, we could perform the survey analysis for volatile and nonvolatile components of up to 20 samples per day. Thus, the package of three procedures
International Nuclear Information System (INIS)
Razifar, Pasha; Hennings, Joakim; Monazzam, Azita; Hellman, Per; Långström, Bengt; Sundin, Anders
2009-01-01
In previous clinical Positron Emission Tomography (PET) studies, novel approaches for the application of Principal Component Analysis (PCA) to dynamic PET images, such as Masked Volume Wise PCA (MVW-PCA), have been introduced. MVW-PCA was shown to be a feasible multivariate analysis technique which, without modeling assumptions, could extract and separate organs and tissues with different kinetic behaviors into different principal components (MVW-PCs) and improve the image quality. In this study, MVW-PCA was applied to 14 dynamic 11C-metomidate-PET (MTO-PET) examinations of 7 patients with small adrenocortical tumours. MTO-PET was performed before and 3 days after starting peroral cortisone treatment. The whole dataset, reconstructed by filtered back projection (FBP) 0–45 minutes after the tracer injection, was used to study the tracer pharmacokinetics. Early, intermediate and late pharmacokinetic phases could be isolated in this manner. The MVW-PC1 images correlated well with the conventionally summed image data (15–45 minutes), but the image noise in the former was considerably lower. PET measurements performed by defining 'hot spot' regions of interest (ROIs) comprising 4 contiguous pixels with the highest radioactivity concentration showed a trend towards higher SUVs when the ROIs were outlined in the MVW-PC1 component than in the summed images. Time activity curves derived from '50% cut-off' ROIs, based on an isocontour function whereby the pixels with SUVs between 50 and 100% of the highest radioactivity concentration were delineated, showed a significant decrease of the SUVs in normal adrenal glands and in adrenocortical adenomas after cortisone treatment. In addition to the clear decrease in image noise and the improved contrast between different structures with MVW-PCA, the results indicate that the definition of ROIs may be more accurate and precise in MVW-PC1 images than in conventional summed images. This might improve the precision of PET
Directory of Open Access Journals (Sweden)
Mohammad Kashkoei Jahroomi
2016-07-01
Introduction In remote sensing studies, especially those in which multi-spectral image data are used (i.e., Landsat-7 Enhanced Thematic Mapper), various statistical methods are often applied for image enhancement and feature extraction (Reddy, 2008). Principal component analysis is a multivariate statistical technique which is frequently used in multidimensional data analysis. This method attempts to extract and place the spectral information into a smaller set of new components that are more interpretable. However, the results obtained from this method are not so straightforward and require somewhat sophisticated techniques to interpret (Drury, 2001). In this paper we present a new approach for mapping of hydrothermal alteration by analyzing and selecting the principal components extracted through processing of Landsat ETM+ images. The study area is located in a mountainous region of southern Kerman. Geologically, it lies in the volcanic belt of central Iran adjacent to the Gogher-Baft ophiolite zone. The region is highly altered with sericitic, propylitic and argillic alteration well developed, and argillic alteration is limited (Jafari, 2009; Masumi and Ranjbar, 2011). Multispectral data of Landsat ETM+ was acquired (path 181, row 34) in this study. In these images the color composites of Band 7, Band 4 and Band 1 in RGB indicate the lithology outcropping in the study area. The principal component analysis (PCA) of image data is often implemented computationally using three steps: (1) Calculation of the variance-covariance matrix or correlation matrix of the satellite sensor data. (2) Computation of the eigenvalues and eigenvectors of the variance-covariance matrix or correlation matrix, and (3) Linear transformation of the image data using the coefficients of the eigenvector matrix. Results By applying PCA to the spectral data, according to the eigenvectors obtained, 6 principal components were extracted from the data set. In the PCA matrix, the eigen
Zhang, Zhiming; Ouyang, Zhiyun; Xiao, Yi; Xiao, Yang; Xu, Weihua
2017-06-01
Increasing exploitation of karst resources is causing severe environmental degradation because of the fragility and vulnerability of karst areas. By integrating principal component analysis (PCA) with annual seasonal trend analysis (ASTA), this study assessed karst rocky desertification (KRD) within a spatial context. We first produced fractional vegetation cover (FVC) data from a moderate-resolution imaging spectroradiometer normalized difference vegetation index using a dimidiate pixel model. Then, we generated three main components of the annual FVC data using PCA. Subsequently, we generated the slope image of the annual seasonal trends of FVC using median trend analysis. Finally, we combined the three PCA components and annual seasonal trends of FVC with the incidence of KRD for each type of carbonate rock to classify KRD into one of four categories based on K-means cluster analysis: high, moderate, low, and none. The results of accuracy assessments indicated that this combination approach produced greater accuracy and more reasonable KRD mapping than the average FVC based on the vegetation coverage standard. The KRD map for 2010 indicated that the total area of KRD was 78.76 × 10³ km², which constitutes about 4.06% of the eight southwest provinces of China. The largest KRD areas were found in Yunnan province. The combined PCA and ASTA approach was demonstrated to be an easily implemented, robust, and flexible method for the mapping and assessment of KRD, which can be used to enhance regional KRD management schemes or to address assessment of other environmental issues.
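The first processing step in the abstract above, deriving FVC from NDVI with a dimidiate pixel model, follows the standard linear-mixing form FVC = (NDVI − NDVI_soil) / (NDVI_veg − NDVI_soil). The sketch below assumes illustrative endmember values and NDVI pixels; the study's actual endmembers are not given in the abstract.

```python
def fractional_vegetation_cover(ndvi, ndvi_soil=0.05, ndvi_veg=0.85):
    """Dimidiate pixel model: each pixel is a linear mix of bare soil and full vegetation."""
    if ndvi_veg <= ndvi_soil:
        raise ValueError("ndvi_veg must exceed ndvi_soil")
    fvc = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return min(1.0, max(0.0, fvc))   # clip to the physically meaningful range [0, 1]

pixels = [0.02, 0.25, 0.45, 0.90]    # hypothetical NDVI values
fvc_values = [fractional_vegetation_cover(v) for v in pixels]
```

Pixels below the soil endmember map to 0 and pixels above the vegetation endmember map to 1; the resulting annual FVC layers would be the input to the PCA and trend analysis.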
Rodrigue, Christine M.
2011-01-01
This paper presents a laboratory exercise used to teach principal components analysis (PCA) as a means of surface zonation. The lab was built around abundance data for 16 oxides and elements collected by the Mars Exploration Rover Spirit in Gusev Crater between Sol 14 and Sol 470. Students used PCA to reduce 15 of these into 3 components, which,…
Directory of Open Access Journals (Sweden)
Chen Shih-Wei
2011-11-01
Background The computer-aided identification of specific gait patterns is an important issue in the assessment of Parkinson's disease (PD). In this study, a computer vision-based gait analysis approach is developed to assist the clinical assessments of PD with kernel-based principal component analysis (KPCA). Method Twelve PD patients and twelve healthy adults with no neurological history or motor disorders within the past six months were recruited and separated according to their "Non-PD", "Drug-On", and "Drug-Off" states. The participants were asked to wear light-colored clothing and perform three walking trials through a corridor decorated with a navy curtain at their natural pace. The participants' gait performance during the steady-state walking period was captured by a digital camera for gait analysis. The collected walking image frames were then transformed into binary silhouettes for noise reduction and compression. Using the developed KPCA-based method, the features within the binary silhouettes can be extracted to quantitatively determine the gait cycle time, stride length, walking velocity, and cadence. Results and Discussion The KPCA-based method uses a feature-extraction approach, which was verified to be more effective than traditional image area and principal component analysis (PCA) approaches in classifying "Non-PD" controls and "Drug-Off/On" PD patients. Encouragingly, this method has a high accuracy rate, 80.51%, for recognizing different gaits. Quantitative gait parameters are obtained, and the power spectrums of the patients' gaits are analyzed. We show that the slow and irregular actions of PD patients during walking tend to transfer some of the power from the main lobe frequency to a lower frequency band. Our results indicate the feasibility of using gait performance to evaluate the motor function of patients with PD. Conclusion This KPCA-based method requires only a digital camera and a decorated corridor setup
International Nuclear Information System (INIS)
Pagani, M.M.E.; Sanchez-Creaspo, A.; Jonsson, C.; Engelin, L.; Danielsson, A.M.; Larsson, S.A.; Jacobsson, H.; Waegner, A.; Salmaso, D.
2002-01-01
Aim: In its development into severe Alzheimer's Disease (AD), early Alzheimer's Disease (eAD) involves progressively larger regions of the brain. These regions share close anatomo-functional relationships. The aim of this study was to investigate the rCBF changes occurring in eAD and AD as compared to a group of normal individuals by means of SPECT and Principal Component Analysis. Materials and Methods. Thirty eAD, 17 AD and 66 normal controls (CTR) were included in the study. 99mTc-HMPAO SPECT, using a three-headed gamma camera, was performed at rest and the uptake in 27 functional bilateral sub-volumes of the brain was assessed by a standardised digitalised brain atlas. Data were grouped into anatomo-functionally connected regions by means of PCA performed on all 113 individuals. Analysis of variance (ANOVA) was used to test the significance of the differences in flow in such functional regions, and age differences were included as a covariate. Results. In the global analysis, rCBF significantly differed between groups (p<0.001) with a progressive reduction of flow from CTR to AD. PCA reduced the 54 variables to 11 anatomo-functional regions that interacted with groups (p<0.001) and gender (p<0.001). In the overall analysis the three groups differed significantly in all functional regions except for bilateral occipital cortex, anterior cingulate cortex, thalamus and putamen. In both CTR/eAD and CTR/AD comparisons the largest rCBF reductions were found in functional regions including left (p<0.0001) and right (p<0.0001) temporo-parietal cortex and associative parietal cortex (p<0.0001). When eAD was compared to AD, the latter showed the largest reductions in right temporo-parietal cortex (p<0.0001) and in right prefrontal cortex (p<0.005). Conclusions. In this study the rCBF was investigated in early and severe Alzheimer's Disease taking into account the functional connectivity among brain regions. Our results confirm previous findings on the progression of
Sung, Yun Ju; Di, Yanming; Fu, Audrey Q; Rothstein, Joseph H; Sieh, Weiva; Tong, Liping; Thompson, Elizabeth A; Wijsman, Ellen M
2007-01-01
We performed multipoint linkage analyses with multiple programs and models for several gene expression traits in the Centre d'Etude du Polymorphisme Humain families. All analyses provided consistent results for both peak location and shape. Variance-components (VC) analysis gave wider peaks and Bayes factors gave fewer peaks. Among programs from the MORGAN package, lm_multiple performed better than lm_markers, resulting in less Markov-chain Monte Carlo (MCMC) variability between runs, and the program lm_twoqtl provided higher LOD scores by also including either a polygenic component or an additional quantitative trait locus.
Pes, Giovanni Mario; Delitala, Alessandro Palmerio; Errigo, Alessandra; Delitala, Giuseppe; Dore, Maria Pina
2016-06-01
Latent autoimmune diabetes in adults (LADA) which accounts for more than 10 % of all cases of diabetes is characterized by onset after age 30, absence of ketoacidosis, insulin independence for at least 6 months, and presence of circulating islet-cell antibodies. Its marked heterogeneity in clinical features and immunological markers suggests the existence of multiple mechanisms underlying its pathogenesis. The principal component (PC) analysis is a statistical approach used for finding patterns in data of high dimension. In this study the PC analysis was applied to a set of variables from a cohort of Sardinian LADA patients to identify a smaller number of latent patterns. A list of 11 variables including clinical (gender, BMI, lipid profile, systolic and diastolic blood pressure and insulin-free time period), immunological (anti-GAD65, anti-IA-2 and anti-TPO antibody titers) and genetic features (predisposing gene variants previously identified as risk factors for autoimmune diabetes) retrieved from clinical records of 238 LADA patients referred to the Internal Medicine Unit of University of Sassari, Italy, were analyzed by PC analysis. The predictive value of each PC on the further development of insulin dependence was evaluated using Kaplan-Meier curves. Overall 4 clusters were identified by PC analysis. In component PC-1, the dominant variables were: BMI, triglycerides, systolic and diastolic blood pressure and duration of insulin-free time period; in PC-2: genetic variables such as Class II HLA, CTLA-4 as well as anti-GAD65, anti-IA-2 and anti-TPO antibody titers, and the insulin-free time period predominated; in PC-3: gender and triglycerides; and in PC-4: total cholesterol. These components explained 18, 15, 12, and 12 %, respectively, of the total variance in the LADA cohort. The predictive power of insulin dependence of the four components was different. PC-2 (characterized mostly by high antibody titers and presence of predisposing genetic markers
Energy Technology Data Exchange (ETDEWEB)
Reddy, T.A. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States)); Claridge, D.E. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States))
1994-01-01
Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected, and synthetic data sequences representative of daily energy use in large institutional buildings were generated for each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified, and preliminary recommendations on the use of either modeling approach were formulated. (orig.)
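The comparison described in this abstract can be illustrated with a minimal sketch, assuming NumPy is available. The synthetic data, coefficient values and the helper names `pcr_fit`, `pcr_predict` and `ols_predict` are hypothetical and are not taken from the study; the sketch only shows how regressing on principal components of correlated regressors relates to ordinary multiple regression.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal components of standardized X."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sigma                       # standardize the regressors
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T                    # (p, k) loadings
    # The PC scores Z @ V are mutually uncorrelated, so this fit is well conditioned.
    gamma, *_ = np.linalg.lstsq(Z @ V, y - y.mean(), rcond=None)
    return {"beta": V @ gamma, "intercept": y.mean(), "mu": mu, "sigma": sigma}

def pcr_predict(model, X):
    Z = (X - model["mu"]) / model["sigma"]
    return model["intercept"] + Z @ model["beta"]

def ols_predict(X, y):
    """Ordinary multiple regression with an intercept, for comparison."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# Hypothetical synthetic daily data: humidity and solar radiation both track temperature.
rng = np.random.default_rng(0)
n = 365
temp = 20 + 8 * np.sin(2 * np.pi * np.arange(n) / 365) + rng.normal(0, 1, n)
humid = 0.01 + 0.0004 * temp + rng.normal(0, 0.0005, n)
solar = 150 + 6 * temp + rng.normal(0, 15, n)
X = np.column_stack([temp, humid, solar])
energy = 50 + 2.0 * temp + 900 * humid + 0.05 * solar + rng.normal(0, 2, n)

pred_ols = ols_predict(X, energy)
pred_full = pcr_predict(pcr_fit(X, energy, 3), X)   # all components kept
pred_two = pcr_predict(pcr_fit(X, energy, 2), X)    # smallest-variance component dropped

print("regressor correlations:\n", np.corrcoef(X, rowvar=False).round(2))
print("RMSE, 3 components:", np.sqrt(np.mean((energy - pred_full) ** 2)))
print("RMSE, 2 components:", np.sqrt(np.mean((energy - pred_two) ** 2)))
```

Keeping all components reproduces the ordinary regression fit; dropping low-variance components trades a little in-sample fit for coefficients that are less sensitive to the collinearity.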
International Nuclear Information System (INIS)
Zhao, Fengjun; Liu, Junting; Qu, Xiaochao; Xu, Xianhui; Chen, Xueli; Yang, Xiang; Liang, Jimin; Tian, Jie; Cao, Feng
2014-01-01
To solve the multicollinearity issue and unequal contribution of vascular parameters for the quantification of angiogenesis, we developed a quantitative evaluation method of vascular parameters for angiogenesis based on in vivo micro-CT imaging of hindlimb ischemic model mice. Taking vascular volume as the ground truth parameter, nine vascular parameters were first assembled into sparse principal components (PCs) to reduce the multicollinearity issue. Aggregated boosted trees (ABTs) were then employed to analyze the importance of vascular parameters for the quantification of angiogenesis via the loadings of sparse PCs. The results demonstrated that vascular volume was mainly characterized by vascular area, vascular junction, connectivity density, segment number and vascular length, which indicated they were the key vascular parameters for the quantification of angiogenesis. The proposed quantitative evaluation method was compared with both the ABTs directly using the nine vascular parameters and Pearson correlation, which were consistent. In contrast to the ABTs directly using the vascular parameters, the proposed method can select all the key vascular parameters simultaneously, because all the key vascular parameters were assembled into the sparse PCs with the highest relative importance. (paper)
Ping-Keng Jao; Yuan-Pin Lin; Yi-Hsuan Yang; Tzyy-Ping Jung
2015-08-01
An emerging challenge for emotion classification using electroencephalography (EEG) is how to effectively alleviate day-to-day variability in raw data. This study employed the robust principal component analysis (RPCA) to address the problem with a posed hypothesis that background or emotion-irrelevant EEG perturbations lead to certain variability across days and somehow submerge emotion-related EEG dynamics. The empirical results of this study evidently validated our hypothesis and demonstrated the RPCA's feasibility through the analysis of a five-day dataset of 12 subjects. The RPCA allowed tackling the sparse emotion-relevant EEG dynamics from the accompanied background perturbations across days. Sequentially, leveraging the RPCA-purified EEG trials from more days appeared to improve the emotion-classification performance steadily, which was not found in the case using the raw EEG features. Therefore, incorporating the RPCA with existing emotion-aware machine-learning frameworks on a longitudinal dataset of each individual may shed light on the development of a robust affective brain-computer interface (ABCI) that can alleviate ecological inter-day variability.
Directory of Open Access Journals (Sweden)
Chang-Qing Duan
2008-11-01
Full Text Available Color is one of the key characteristics used to evaluate the sensory quality of red wine, and anthocyanins are the main contributors to color. Monomeric anthocyanins and CIELAB color values were investigated by HPLC-MS and spectrophotometry during fermentation of Cabernet Sauvignon red wine, and principal component regression (PCR), a statistical tool, was used to establish a linkage between the detected anthocyanins and wine coloring. The results showed that 14 monomeric anthocyanins could be identified in wine samples, and all of these anthocyanins were negatively correlated with the L*, b* and H*ab values, but positively correlated with a* and C*ab values. On an equal concentration basis for each detected anthocyanin, cyanidin-3-O-glucoside (Cy3-glu) had the most influence on CIELAB color value, while malvidin 3-O-glucoside (Mv3-glu) had the least. The color values of various monomeric anthocyanins were influenced by their structures, substituents on the B-ring, acyl groups on the glucoside and the molecular steric structure. This work develops a statistical method for evaluating correlation between wine color and monomeric anthocyanins, and also provides a basis for elucidating the effect of intramolecular copigmentation on wine coloring.
Szabo, J.K.; Fedriani, E.M.; Segovia-Gonzalez, M. M.; Astheimer, L.B.; Hooper, M.J.
2010-01-01
This paper introduces a new technique in ecology to analyze spatial and temporal variability in environmental variables. By using simple statistics, we explore the relations between abiotic and biotic variables that influence animal distributions. However, spatial and temporal variability in rainfall, a key variable in ecological studies, can cause difficulties to any basic model including time evolution. The study was of a landscape scale (three million square kilometers in eastern Australia), mainly over the period of 1998–2004. We simultaneously considered qualitative spatial (soil and habitat types) and quantitative temporal (rainfall) variables in a Geographical Information System environment. In addition to some techniques commonly used in ecology, we applied a new method, Functional Principal Component Analysis, which proved to be very suitable for this case, as it explained more than 97% of the total variance of the rainfall data, providing us with substitute variables that are easier to manage and are even able to explain rainfall patterns. The main variable came from a habitat classification that showed strong correlations with rainfall values and soil types. © 2010 World Scientific Publishing Company.
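The "share of total variance explained" reported in this abstract can be illustrated with a simplified sketch, assuming NumPy is available. It treats each rainfall curve as a vector sampled on a common monthly grid and applies ordinary PCA to it; a full functional PCA would normally smooth the curves with a basis expansion first. The data, the number of sites and the helper name `curve_pca` are hypothetical.

```python
import numpy as np

def curve_pca(curves):
    """PCA of curves sampled on a common grid (rows = sites, columns = time points)."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)      # fraction of total variance per component
    scores = U * s                       # per-site coordinates ("substitute variables")
    return mean_curve, Vt, scores, explained

# Hypothetical monthly rainfall at 40 sites over 7 years (84 months):
# a seasonal cycle and a slow trend with site-specific amplitudes, plus small noise.
rng = np.random.default_rng(1)
months = np.arange(84)
seasonal = np.sin(2 * np.pi * months / 12)
trend = np.linspace(-1, 1, months.size)
n_sites = 40
rain = (60
        + rng.normal(0, 25, (n_sites, 1)) * seasonal
        + rng.normal(0, 10, (n_sites, 1)) * trend
        + rng.normal(0, 1, (n_sites, months.size)))

mean_curve, components, scores, explained = curve_pca(rain)
print("variance explained by the first two components:", explained[:2].sum())
```

Each site is then summarized by its first few scores, which is what makes the components usable as easier-to-manage substitute variables.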
Plazas-Nossa, Leonardo; Hofer, Thomas; Gruber, Günter; Torres, Andres
2017-02-01
This work proposes a methodology for the forecasting of online water quality data provided by UV-Vis spectrometry. To this end, principal component analysis (PCA), to reduce the dimensionality of the data set, was combined with artificial neural networks (ANNs) for forecasting. The results obtained were compared with those obtained by using discrete Fourier transform (DFT). The proposed methodology was applied to four absorbance time series data sets comprising a total of 5705 UV-Vis spectra. Absolute percentage errors obtained by applying the proposed PCA/ANN methodology vary between 10% and 13% for all four study sites. In general terms, the results obtained were hardly generalizable, as they appeared to be highly dependent on specific dynamics of the water system; however, some trends can be outlined. The PCA/ANN methodology gives better results than the PCA/DFT forecasting procedure when using a specific spectral range under the following conditions: (i) for Salitre wastewater treatment plant (WWTP) (first hour) and Graz West R05 (first 18 min), from the last part of the UV range to the whole visible range; (ii) for Gibraltar pumping station (first 6 min), for all UV-Vis absorbance spectra; and (iii) for San Fernando WWTP (first 24 min), from the whole UV range to the middle part of the visible range.
Choudhary, Shagun; Singh, Manisha; Sharma, Deepak; Attri, Sampan; Sharma, Kavita; Goel, Gunjan
2018-06-02
The present study aimed to screen the indigenous probiotic cultures for their effect on total phenolic contents (TPC) and associated antioxidant activities in synbiotic fermented soymilk during storage. Among 16 cultures, subtractive screening was conducted based on different tests such as acidification rate and proliferation of lactic acid bacteria (LAB) on supplementation of inulin (0-20 mM) and fructooligosaccharide (0-0.45 mM). Lactobacillus paracasei CD4 was selected as potential strain after principal component analysis (PCA) of different strains with prebiotic substrates at different concentrations. The strain was used for production of synbiotic soymilk product containing 10 mM inulin. The storage study was conducted at 4 °C for 21 days. During storage, the pH, titratable acidity, TPC, antioxidant activities, and viable cell counts (VCC) were determined. The fermentation of soymilk supplemented with 10 mM inulin did not alter the VCC; however, a decrease in pH and TPC and an increase in acidity and antioxidant activity were observed (p inulin in soymilk enhanced the viability of Lactobacillus paracasei CD4 and antioxidant activity during storage under refrigeration conditions.
Directory of Open Access Journals (Sweden)
Néstor Rodríguez-Padial
2017-01-01
Full Text Available The uncertainty of demand has led production systems to become increasingly complex; this can affect the availability of the machines and thus their maintenance. Therefore, it is necessary to adequately manage the information that facilitates decision-making. This paper presents a system for making decisions related to the design of customized maintenance plans in a production plant. This paper addresses this tactical goal and aims to provide greater knowledge and better predictions by projecting reliable behavior in the medium-term, integrating this new functionality into classic Balance Scorecards, and making it possible to extend their current measuring function to a new aptitude: predicting evolution based on historical data. In the proposed Custom Balance Scorecard design, an exploratory data phase is integrated with another analysis and prediction phase using Principal Component Analysis algorithms and Machine Learning that uses Artificial Neural Network algorithms. This new extension allows better control over the maintenance function of an industrial plant in the medium-term with a yearly horizon taken over monthly intervals which allows the measurement of the indicators of strategic productive areas and the discovery of hidden behavior patterns in work orders. In addition, this extension enables the prediction of indicator outcomes such as overall equipment efficiency and mean time to failure.
Prata, Paloma S; Alexandrino, Guilherme L; Mogollón, Noroska Gabriela S; Augusto, Fabio
2016-11-11
The geochemical characterization of petroleum is an essential task to develop new strategies and technologies when analyzing the commercial potential of crude oils for exploitation. Due to the chemical complexity of these samples, the use of modern analytical techniques along with multivariate exploratory data analysis approaches is an interesting strategy to extract relevant geochemical characteristics about the oils. In this work, important geochemical information on crude oils from different production basins was obtained by analyzing the maltene fraction of the oils by comprehensive two-dimensional gas chromatography coupled to quadrupole mass spectrometry (GC×GC-QMS), and performing multiway principal component analysis (MPCA) of the chromatographic data. The results showed that four MPCs explained 93.57% of the data variance, expressing mainly the differences in the profiles of the saturated hydrocarbon fraction of the oils (C13-C18 and C19-C30 n-alkanes and the pristane/phytane ratio). MPC1 grouped the samples of severely biodegraded oils, while the type of depositional paleoenvironment of the oils and its oxidation conditions (as well as their thermal maturity) could be inferred by analysing other relevant MPCs. Additionally, considerations about the source of the oil samples were also possible based on the overall distribution of relevant biomarkers such as the phenanthrene derivatives and tri-, tetra- and pentacyclic terpanes. Copyright © 2016 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Kejian Chu
2018-01-01
Full Text Available Identification of the key environmental indicators (KEIs) from a large number of environmental variables is important for environmental management in tidal flat reclamation areas. In this study, a modified principal component analysis approach (MPCA) has been developed for determining the KEIs. The MPCA accounts for the two important attributes of the environmental variables: pollution status and temporal variation, in addition to the commonly considered numerical divergence attribute. It also incorporates the distance correlation (dCor) to replace Pearson’s correlation to measure the nonlinear interrelationship between the variables. The proposed method was applied to the Tiaozini sand shoal, a large-scale tidal flat reclamation region in China. Five KEIs were identified as dissolved inorganic nitrogen, Cd, petroleum in the water column, Hg, and total organic carbon in the sediment. The identified KEIs were shown to respond well to the biodiversity of phytoplankton. This demonstrated that the identified KEIs adequately represent the environmental condition in the coastal marine system. Therefore, the MPCA is a practicable method for extracting effective indicators that have key roles in the coastal and marine environment.
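Why distance correlation can stand in for Pearson's correlation when relationships are nonlinear can be shown with a small sketch. It uses the standard sample distance correlation (double-centered pairwise distance matrices) for two univariate variables in plain Python; the function names and the test data are hypothetical, and nothing here reproduces the MPCA procedure itself.

```python
import math

def _centered_distances(values):
    """Pairwise |a - b| distances, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length sequences (value in [0, 1])."""
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a * a for row in A for a in row) / n**2
    dvar2_y = sum(b * b for row in B for b in row) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# A purely quadratic dependence on a symmetric grid: y is fully determined by x,
# yet the linear (Pearson) correlation vanishes.
x = [i / 50 - 1 for i in range(101)]
y = [v * v for v in x]
print("Pearson:", pearson(x, y))
print("dCor:   ", distance_correlation(x, y))
```

In a PCA-style analysis, a matrix of pairwise dCor values could be substituted for the Pearson correlation matrix before the eigendecomposition, which is the general idea the abstract describes.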
Energy Technology Data Exchange (ETDEWEB)
Pineda-Martinez, L.F.; Carbajal, N.; Medina-Roldan, E. [Instituto Potosino de Investigacion Cientifica y Tecnologica, A. C., San Luis Potosi (Mexico)]. E-mail: lpineda@ipicyt.edu.mx
2007-04-15
Applying principal component analysis (PCA), we determined climate zones in a topographic gradient in the central-northeastern part of Mexico. We employed nearly 30 years of monthly temperature and precipitation data at 173 meteorological stations. The climate classification was carried out applying the Koeppen system modified for the conditions of Mexico. PCA indicates a regionalization in agreement with topographic characteristics and vegetation. We describe the different bioclimatic zones, associated with typical vegetation, for each climate using geographical information systems (GIS).
Lu, Mingyu; Qu, Yongwei; Lu, Ye; Ye, Lin; Zhou, Limin; Su, Zhongqing
2012-04-01
An experimental study is reported in this paper demonstrating monitoring of surface-fatigue crack propagation in a welded steel angle structure using Lamb waves generated by an active piezoceramic transducer (PZT) network, which was freely surface-mounted so that each PZT transducer could serve as either actuator or sensor. The fatigue crack was initiated and propagated in the welding zone of a steel angle structure by three-point bending fatigue tests. Instead of directly comparing changes between a series of specific signal segments, such as S0 and A0 wave modes scattered from fatigue crack tips, a variety of signal statistical parameters representing five different structural statuses were obtained from the marginal spectrum of the Hilbert-Huang transform (HHT), which indicates the progressive distribution of energy over time in the frequency domain and includes all wave modes of a wave signal. These parameters were used in combination with principal component analysis (PCA) to classify and distinguish different structural conditions due to fatigue crack initiation and propagation. Results show that PCA based on the marginal spectrum is effective and sensitive for monitoring the growth of the fatigue crack, although the received signals are extremely complicated due to waves scattered from the weld, multiple boundaries, the notch and the fatigue crack. More importantly, this method shows good potential for identifying the integrity status of complicated structures, which cause uncertain wave patterns and ambiguous sensor network arrangements.
Paolucci, Enrico; Lunedei, Enrico; Albarello, Dario
2017-10-01
In this work, we propose a procedure based on principal component analysis of data sets consisting of many horizontal to vertical spectral ratio (HVSR or H/V) curves obtained by single-station ambient vibration acquisitions. This kind of analysis is aimed at the seismic characterization of the investigated area by identifying sites characterized by similar HVSR curves. It also allows the typical HVSR patterns of the explored area to be extracted and their relative importance to be established, providing an estimate of the level of heterogeneity from the seismic point of view. In this way, an automatic explorative seismic characterization of the area becomes possible by only considering ambient vibration data. This also implies that the relevant outcomes can be safely compared with other available information (geological data, borehole measurements, etc.) without any conceptual trade-off. The whole algorithm is remarkably fast: on a common personal computer, processing takes a few seconds for a data set including 100-200 HVSR measurements. The procedure has been tested in three study areas in Central-Northern Italy characterized by different geological settings. Outcomes demonstrate that this technique is effective and correlates well with the most significant seismostratigraphical heterogeneities present in each of the study areas.
Byrne, Patrick; Runkel, Robert L; Walton-Day, Katherine
2017-07-01
Combining the synoptic mass balance approach with principal components analysis (PCA) can be an effective method for discretising the chemistry of inflows and source areas in watersheds where contamination is diffuse in nature and/or complicated by groundwater interactions. This paper presents a field-scale study in which synoptic sampling and PCA are employed in a mineralized watershed (Lion Creek, Colorado, USA) under low flow conditions to (i) quantify the impacts of mining activity on stream water quality; (ii) quantify the spatial pattern of constituent loading; and (iii) identify inflow sources most responsible for observed changes in stream chemistry and constituent loading. Several of the constituents investigated (Al, Cd, Cu, Fe, Mn, Zn) fail to meet chronic aquatic life standards along most of the study reach. The spatial pattern of constituent loading suggests four primary sources of contamination under low flow conditions. Three of these sources are associated with acidic (pH mine water in the Minnesota Mine shaft located to the north-east of the river channel. In addition, water chemistry data during a rainfall-runoff event suggests the spatial pattern of constituent loading may be modified during rainfall due to dissolution of efflorescent salts or erosion of streamside tailings. These data point to the complexity of contaminant mobilisation processes and constituent loading in mining-affected watersheds but the combined synoptic sampling and PCA approach enables a conceptual model of contaminant dynamics to be developed to inform remediation.
Energy Technology Data Exchange (ETDEWEB)
Liu Yihua [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China); Jin Maojun [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China); Gui Wenjun [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China); Cheng Jingli [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China); Guo Yirong [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China); Zhu Guonian [Institute of Pesticide and Environmental Toxicology, Zhejiang University, Hangzhou 310029 (China)]. E-mail: zhugn@zju.edu.cn
2007-05-22
A novel procedure for parathion hapten design is described. The optimal antigen for parathion was selected after molecular modeling studies of six types of potentially immunizing haptens with the aim to identify the best mimicking target analyte. Heterologous competitive indirect enzyme-linked immunosorbent assay (ELISA) was developed after screening a battery of competitors as coating antigens. The relationship between the heterology degree of the competitor and the resulting immunoassay detectability was investigated according to the electronic similarities of the competitor haptens and the target analyte. Molecular modeling and principal component analysis were performed to understand the electronic distribution and steric parameters of the haptens at their minimum energetic levels. The results suggested that the competitors should have a high heterology to produce assays with good detectability values. An indirect competitive ELISA was finally selected for further investigation. The immunoassay had an IC50 value of 4.79 ng/mL and a limit of detection of 0.31 ng/mL. There was little or no cross-reactivity to similar compounds tested except for the insecticide parathion-methyl, which showed a cross-reactivity of 7.8%.