WorldWideScience

Sample records for svm classification accuracy

  1. Customer and performance rating in QFD using SVM classification

    Science.gov (United States)

    Dzulkifli, Syarizul Amri; Salleh, Mohd Najib Mohd; Leman, A. M.

    2017-09-01

    In a classification problem, each input is associated with one output, and training data are used to create a model that approximates the true function. SVM is a popular method for binary classification due to its theoretical foundation and good generalization performance. However, when trained with noisy data, the decision hyperplane might deviate from the optimal position because of the sum of misclassification errors in the objective function. In this paper, we introduce a fuzzy weighted learning approach for improving the accuracy of Support Vector Machine (SVM) classification. The main aim of this work is to determine appropriate weights for SVM to adjust the parameters of the learning method from a given set of noisy input-output data. Performance and customer rating in Quality Function Deployment (QFD) is used as our case study to determine whether implementing fuzzy SVM is highly scalable for very large data sets and generates high classification accuracy.
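
    One common way to make an SVM less sensitive to noisy samples is to give each training point a membership weight that scales its misclassification penalty. The sketch below is illustrative only and is not the authors' method: it trains a linear SVM by subgradient descent on a weighted hinge loss. The function names, the learning rate, the toy data and the hand-picked weights are all assumptions made for this example.

```python
def train_weighted_svm(X, y, weights, C=1.0, lr=0.01, epochs=200):
    """Linear SVM trained by subgradient descent on
    0.5 * ||w||^2 + C * sum_i s_i * max(0, 1 - y_i * (w . x_i + b)),
    where s_i is the (hypothetical) membership weight of sample i."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5 * ||w||^2
        grad_b = 0.0
        for xi, yi, si in zip(X, y, weights):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: add weighted hinge subgradient
                for j in range(n_features):
                    grad_w[j] -= C * si * yi * xi[j]
                grad_b -= C * si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: the last point sits in the positive cluster but carries a noisy
# negative label, so it receives a small weight.
X = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0],
     [-2.0, -2.0], [-3.0, -2.5], [-2.5, -3.0],
     [2.8, 2.6]]
y = [1, 1, 1, -1, -1, -1, -1]
weights = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.05]

w, b = train_weighted_svm(X, y, weights)
```

    In a fuzzy SVM the weights would come from a membership function rather than being set by hand; how the paper derives them is not stated in the abstract.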

  2. Adaptive SVM for Data Stream Classification

    Directory of Open Access Journals (Sweden)

    Isah A. Lawal

    2017-07-01

    In this paper, we address the problem of learning an adaptive classifier for the classification of continuous streams of data. We present a solution based on incremental extensions of the Support Vector Machine (SVM) learning paradigm that updates an existing SVM whenever new training data are acquired. To ensure that the SVM remains effective while exploiting the newly gathered data, we introduce an on-line model selection approach in the incremental learning process. We evaluated the proposed method on real-world applications including on-line spam email filtering and human action classification from videos. Experimental results show the effectiveness and the potential of the proposed approach.

  3. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers the material quality character, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and the classification decision. We use an improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for the various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and a BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  4. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification.

    Science.gov (United States)

    She, Qingshan; Ma, Yuliang; Meng, Ming; Luo, Zhizeng

    2015-01-01

    Motor imagery electroencephalography is widely used in brain-computer interface systems. Due to the inherent characteristics of electroencephalography signals, accurate and real-time multiclass classification is always challenging. To solve this problem, this paper proposes a multiclass posterior probability solution for twin SVM based on ranking continuous output and pairwise coupling. First, a two-class posterior probability model is constructed to approximate the posterior probability using ranking continuous output techniques and Platt's estimation method. Second, multiclass probabilistic outputs for twin SVM are obtained by combining every pair of class probabilities according to the method of pairwise coupling. Finally, the proposed method is compared with multiclass SVM and twin SVM via voting, and with multiclass posterior probability SVM using different coupling approaches. The classification accuracy and time complexity of the proposed method are demonstrated on both UCI benchmark datasets and real-world EEG data from BCI Competition IV Dataset 2a.
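
    The two building blocks named in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the paper's twin-SVM procedure: `platt_probability` applies a sigmoid of the form 1 / (1 + exp(A f + B)) to a decision value (the parameters A and B would normally be fitted on held-out data and are simply assumed here), and `couple_pairwise` combines pairwise probabilities with one simple closed-form coupling rule, p_i proportional to 1 / (sum over j != i of 1/r_ij - (k - 2)). Other coupling rules exist, and which one the authors use is not stated in the abstract.

```python
import math


def platt_probability(decision_value, A=-1.0, B=0.0):
    """Map an SVM decision value to a probability with a sigmoid.
    A and B are assumed values here; in practice they are fitted."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))


def couple_pairwise(r):
    """Combine pairwise estimates r[i][j] ~ P(class i | class i or j, x)
    into one posterior per class. Assumes every r[i][j] is strictly
    positive and r[j][i] == 1 - r[i][j]."""
    k = len(r)
    raw = []
    for i in range(k):
        denom = sum(1.0 / r[i][j] for j in range(k) if j != i) - (k - 2)
        raw.append(1.0 / denom)
    total = sum(raw)
    return [p / total for p in raw]


# Consistency check on synthetic data: build exact pairwise probabilities
# from a known posterior and recover it.
p_true = [0.6, 0.3, 0.1]
r = [[p_true[i] / (p_true[i] + p_true[j]) if i != j else 0.5
      for j in range(3)] for i in range(3)]
posterior = couple_pairwise(r)
```

    With noisy, mutually inconsistent pairwise estimates the denominator can become small or negative, which is one reason iterative coupling methods are often preferred.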

  5. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification

    Directory of Open Access Journals (Sweden)

    Qingshan She

    2015-01-01

    Motor imagery electroencephalography is widely used in brain-computer interface systems. Due to the inherent characteristics of electroencephalography signals, accurate and real-time multiclass classification is always challenging. To solve this problem, this paper proposes a multiclass posterior probability solution for twin SVM based on ranking continuous output and pairwise coupling. First, a two-class posterior probability model is constructed to approximate the posterior probability using ranking continuous output techniques and Platt’s estimation method. Second, multiclass probabilistic outputs for twin SVM are obtained by combining every pair of class probabilities according to the method of pairwise coupling. Finally, the proposed method is compared with multiclass SVM and twin SVM via voting, and with multiclass posterior probability SVM using different coupling approaches. The classification accuracy and time complexity of the proposed method are demonstrated on both UCI benchmark datasets and real-world EEG data from BCI Competition IV Dataset 2a.

  6. Melancholia EEG classification based on CSSD and SVM

    Science.gov (United States)

    Shi, Jian-Jun; Yuan, Qing-Wu; Zhou, La-Wu

    2011-10-01

    Extracting disease information from the electroencephalograph (EEG) of melancholia patients plays an important role. First, a common spatial subspace decomposition (CSSD) method was used to extract features from the 16-channel EEG of melancholia patients and healthy persons. Then, based on support vector machines (SVM), a classifier was designed, trained and tested on its ability to distinguish melancholia patients from healthy persons. The results indicated that the proposed method can reach an accuracy as high as 95% in EEG classification, while the accuracy of the wavelet-based method is only 88%. That is, the proposed method is feasible for melancholia diagnosis and research.

  7. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data and uses the support vector machine method to classify Chinese text, with the aim of combining academic research and practical application.

  8. [Application of SVM and wavelet analysis in EEG classification].

    Science.gov (United States)

    Zhao, Jianlin; Zhou, Weidong; Liu, Kai; Cai, Dongmei

    2011-04-01

    We employed two methods based on support vector machines (SVM) combined with two kinds of wavelet analysis to classify EEG signals, on the basis of the different profiles, energy, and frequency characteristics of the EEG during seizures. One method classified the signals using waveform characteristics of the EEG signal. The other classified the signals based on the fluctuation index and variation coefficient of the EEG signal. We compared the classification accuracies of these two methods on interictal EEG and epileptic EEG. The experimental results showed that both methods can effectively distinguish epileptic EEG from interictal EEG. It was also confirmed that the latter method, based on the fluctuation index and variation coefficient, classifies better.

  9. Linear regression-based efficient SVM learning for large-scale classification.

    Science.gov (United States)

    Wu, Jianxin; Yang, Hao

    2015-10-01

    For large-scale classification tasks, especially in the classification of images, additive kernels have shown a state-of-the-art accuracy. However, even with the recent development of fast algorithms, learning speed and the ability to handle large-scale tasks are still open problems. This paper proposes algorithms for large-scale support vector machines (SVM) classification and other tasks using additive kernels. First, a linear regression SVM framework for general nonlinear kernel is proposed using linear regression to approximate gradient computations in the learning process. Second, we propose a power mean SVM (PmSVM) algorithm for all additive kernels using nonsymmetric explanatory variable functions. This nonsymmetric kernel approximation has advantages over the existing methods: 1) it does not require closed-form Fourier transforms and 2) it does not require extra training for the approximation either. Compared on benchmark large-scale classification data sets with millions of examples or millions of dense feature dimensions, PmSVM has achieved the highest learning speed and highest accuracy among recent algorithms in most cases.
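
    An additive kernel sums a one-dimensional kernel over feature dimensions, k(x, y) = sum_d k_d(x_d, y_d). The sketch below shows one such family, the power mean M_p(a, b) = ((a^p + b^p) / 2)^(1/p) for p <= 0 on non-negative features: p = -1 gives the harmonic mean 2ab / (a + b), which is the additive chi-squared kernel, and p tending to minus infinity gives min(a, b), the histogram intersection kernel. Treating this as the kernel family used by PmSVM is an assumption based on the method's name; the function names and sample histograms are hypothetical.

```python
import math


def power_mean(a, b, p):
    """Power mean of two non-negative numbers for p <= 0."""
    if p > 0:
        raise ValueError("this sketch only covers p <= 0")
    if a <= 0.0 or b <= 0.0:
        return 0.0  # limiting value for every p <= 0 when an input is zero
    if p == 0:
        return math.sqrt(a * b)  # geometric mean (limit as p -> 0)
    if p == float("-inf"):
        return min(a, b)  # histogram intersection
    return ((a ** p + b ** p) / 2.0) ** (1.0 / p)


def additive_kernel(x, y, p):
    """Additive kernel: sum of per-dimension power means."""
    return sum(power_mean(a, b, p) for a, b in zip(x, y))


# Two hypothetical normalized histograms.
x = [0.2, 0.5, 0.3]
y = [0.4, 0.1, 0.5]
chi2_value = additive_kernel(x, y, -1)
hik_value = additive_kernel(x, y, float("-inf"))
```

    Evaluating the kernel directly like this costs one pass over the features per pair of examples; the abstract's point is that approximating it allows training at much larger scale.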

  10. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists and lacks objective diagnostic methods, which leads to a higher rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and a mutation particle swarm optimization algorithm is proposed. To address the problem that the particle swarm optimization (PSO) algorithm is easily trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) to balance local search and global exploration ability, so that the parameters of the classification model are optimal. We compared the classification accuracy of different PSO mutation algorithms for depression and found that the support vector machine (SVM) classifier based on the feedback mutation PSO algorithm achieved the highest classification accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for the clinical recognition of depression.
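
    For readers unfamiliar with PSO-based parameter tuning, the sketch below shows a plain global-best PSO loop. It is not the FBPSO variant proposed in the paper (no mutation or feedback step is included), and the objective is a made-up quadratic standing in for the cross-validation error of an SVM as a function of (log C, log gamma). All names, bounds and coefficients are assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for SVM cross-validation error; minimum at (1, -2)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
```

    In a real setting the objective would train and validate an SVM for each candidate parameter pair, so each evaluation is expensive and the swarm size and iteration count matter.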

  11. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-01-01

    Fast and accurate fault classification is essential to power system operations. In this paper, a particle swarm optimization (PSO) based support vector machine (SVM) classifier is proposed to classify electrical faults in radial distribution systems. The proposed PSO-based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults, considering 12 given input features generated using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  12. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Directory of Open Access Journals (Sweden)

    Jie Zhang

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed. Among them, the support vector machine (SVM) might be the best classifier. However, SVM is a two-class classifier, and its computation is time-consuming when it is applied to the many classes in OPCCR. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computational cost of SVM. Experiments on NC-SVM classification for OPCCR show that the proposed NC-SVM can effectively reduce the computation time in OPCCR.

  13. APPLICATION OF FUSION WITH SAR AND OPTICAL IMAGES IN LAND USE CLASSIFICATION BASED ON SVM

    Directory of Open Access Journals (Sweden)

    C. Bao

    2012-07-01

    With the growth of multi-source remote sensing data at multiple spatial and spectral resolutions, data fusion technologies have been widely used in geological fields. Synthetic Aperture Radar (SAR) and optical cameras are currently the two most common sensors. Multi-spectral optical images express the spectral features of ground objects, while SAR images express backscatter information. The accuracy of image classification can be effectively improved by fusing the two kinds of images. In this paper, TerraSAR-X images and ALOS multi-spectral images were fused for land use classification. After preprocessing such as geometric rectification, radiometric rectification and noise suppression, the two kinds of images were fused, and an SVM model identification method was then used for land use classification. Two different fusion methods were used: one joins the SAR image to the multi-spectral images as an additional band, and the other fuses the two kinds of images directly. The former can raise the resolution and preserve the texture information, and the latter can preserve spectral feature information and improve the capability of identifying different features. The experimental results showed that classification accuracy using fused images is better than using multi-spectral images alone. The classification accuracy for roads, habitation and water bodies was significantly improved. Compared to traditional classification methods, the method of this paper for fused images with an SVM classifier could achieve better results in identifying complicated land use classes, especially for small ground features.

  14. Oil spill detection from SAR image using SVM based classification

    Directory of Open Access Journals (Sweden)

    A. A. Matkan

    2013-09-01

    In this paper, the potential of fully polarimetric L-band SAR data for detecting sea oil spills is investigated using polarimetric decompositions and texture analysis based on an SVM classifier. First, power and magnitude measurements of the HH and VV polarization modes and the Pauli, Freeman and Krogager decompositions are computed and used as input to the SVM classifier. Texture analysis is then used for identification with the SVM method: the texture features Mean, Variance, Contrast and Dissimilarity are extracted from these measurements. Experiments are conducted on fully polarimetric SAR data acquired by the PALSAR sensor of the ALOS satellite on August 25, 2006. An accuracy assessment indicated overall accuracies of 78.92% and 96.46% for the power measurement of the VV polarization and the Krogager decomposition, respectively, in the first step. With texture analysis, the results improved to 96.44% and 96.65% for the mean of the power and magnitude measurements of the HH and VV polarizations and for the Krogager decomposition, respectively. The results show that the Krogager polarimetric decomposition gives satisfactory results for detecting oil spills on the sea surface and that texture analysis performs well.
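
    Mean, Variance, Contrast and Dissimilarity are typical features of a gray-level co-occurrence matrix (GLCM). The abstract does not say how the features were computed, so the following is only a sketch under the assumption that a symmetric, normalized GLCM with a horizontal offset of one pixel is used; the function names and the 4x4 test image are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Mean, variance, contrast and dissimilarity of a normalized GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    mean = sum(i * v for i, _, v in cells)
    variance = sum(v * (i - mean) ** 2 for i, _, v in cells)
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    dissimilarity = sum(v * abs(i - j) for i, j, v in cells)
    return {"mean": mean, "variance": variance,
            "contrast": contrast, "dissimilarity": dissimilarity}


# Small quantized image with four gray levels (0..3).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
```

    In practice such features are computed in a sliding window over the SAR image, giving one feature vector per pixel for the classifier.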

  15. Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2017-12-01

    Accurate solar photovoltaic (PV) power forecasting is an essential tool for mitigating the negative effects caused by the uncertainty of PV output power in systems with high penetration levels of solar PV generation. Weather classification based modeling is an effective way to increase the accuracy of day-ahead short-term (DAST) solar PV power forecasting because PV output power is strongly dependent on the specific weather conditions in a given time period. However, the accuracy of daily weather classification relies on both the applied classifiers and the training data. This paper aims to reveal how these two factors impact the classification performance and to delineate the relation between classification accuracy and sample dataset scale. Two commonly used classification methods, K-nearest neighbors (KNN) and support vector machines (SVM), are applied to classify the daily local weather types for DAST solar PV power forecasting using the operation data from a grid-connected PV plant in Hohhot, Inner Mongolia, China. We assessed the performance of the SVM and KNN approaches, and then investigated the influences of sample scale, the number of categories, and the data distribution in different categories on the daily weather classification results. The simulation results illustrate that SVM performs well with a small sample scale, while KNN is more sensitive to the length of the training dataset and can achieve higher accuracy than SVM with sufficient samples.

  16. A hybrid particle swarm optimization-SVM classification for automatic cardiac auscultation

    Directory of Open Access Journals (Sweden)

    Prasertsak Charoen

    2017-04-01

    Cardiac auscultation is a method in which a doctor listens to heart sounds with a stethoscope to examine the condition of the heart. Automatic cardiac auscultation with machine learning is a promising technique for classifying heart conditions without the need for doctors or expertise. In this paper, we develop a classification model based on support vector machine (SVM) and particle swarm optimization (PSO) for an automatic cardiac auscultation system. The model consists of two parts: a heart sound signal processing part and a proposed PSO for weighted SVM (WSVM) classifier part. In this method, the PSO takes into account the degree of importance of each feature extracted from wavelet packet (WP) decomposition. Then, by using principal component analysis (PCA), the features can be selected. The PSO technique is used to assign diverse weights to different features for the WSVM classifier. Experimental results show that both continuous and binary PSO-WSVM models achieve better classification accuracy on the heart sound samples, by reducing system false negatives (FNs), compared to traditional SVM and genetic algorithm (GA) based SVM.

  17. Research on Chinese web page SVM classifier based on information gain

    Directory of Open Access Journals (Sweden)

    PAN Zhengcai

    2013-06-01

    In order to improve the efficiency and accuracy of text classification, the feature dimensionality reduction method and the traditional information gain method used in text classification of Chinese web pages are optimized to address their defects. First, part-of-speech filtering and synonym merging are applied as a first dimension reduction of the feature items. Then, an improved information gain method is proposed for computing the weights of feature items. Finally, the Support Vector Machine (SVM) classification algorithm is used for text classification of Chinese web pages. Both theoretical analysis and experimental results show that this method achieves better performance and classification results than the traditional method.
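
    As background, the traditional information gain of a term t with respect to the class variable C is IG(t) = H(C) - P(t) H(C | t present) - P(t absent) H(C | t absent). The sketch below computes this standard quantity; it does not implement the improved variant the paper proposes, whose details are not given in the abstract. Documents are represented as sets of tokens, and the tiny corpus and labels are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(documents, labels, term):
    """Traditional information gain of a term for the class labels."""
    with_term = [lab for doc, lab in zip(documents, labels) if term in doc]
    without_term = [lab for doc, lab in zip(documents, labels) if term not in doc]
    n = len(labels)
    return (entropy(labels)
            - len(with_term) / n * entropy(with_term)
            - len(without_term) / n * entropy(without_term))


# Hypothetical segmented documents (token sets) and their classes.
documents = [{"the", "goal", "match"}, {"the", "goal", "team"},
             {"the", "stock", "bank"}, {"the", "bank", "rate"}]
labels = ["sport", "sport", "finance", "finance"]

ig_goal = information_gain(documents, labels, "goal")
ig_the = information_gain(documents, labels, "the")
```

    A term that appears in every document, like "the" here, carries no class information and scores zero, which is why such terms are dropped during feature selection.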

  18. Improving Accuracy of Intrusion Detection Model Using PCA and optimized SVM

    Directory of Open Access Journals (Sweden)

    Sumaiya Thaseen Ikram

    2016-06-01

    Intrusion detection is essential for providing security to different network domains and is mostly used for locating and tracing intruders. Traditional intrusion detection models (IDS) have many problems, such as low detection capability against unknown network attacks, a high false alarm rate and insufficient analysis capability. Hence the major scope of research in this domain is to develop an intrusion detection model with improved accuracy and reduced training time. This paper proposes a hybrid intrusion detection model that integrates principal component analysis (PCA) and support vector machine (SVM). The novelty of the paper is the optimization of the kernel parameters of the SVM classifier using an automatic parameter selection technique. This technique optimizes the punishment factor (C) and the kernel parameter gamma (γ), thereby improving the accuracy of the classifier and reducing the training and testing time. The experimental results obtained on the NSL KDD and gurekddcup datasets show that the proposed technique performs better, with higher accuracy, faster convergence speed and better generalization. Minimal resources are consumed because the classifier requires only a reduced feature set as input for optimum classification. A comparative analysis of hybrid models with the proposed model is also performed.

  19. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    Science.gov (United States)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    In order to improve the reliability of the spectrum features extracted by wavelet transform, a method combining wavelet transform (WT) with the bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. In addition, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel, rapid and non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples with five different kinds of pesticide residues on the leaf surface were obtained using a Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending) and Savitzky-Golay coupled with standard normalized variable detrending (SG-SNV detrending) were each used to preprocess the raw spectra. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. The WTC were selected by WT. The results showed that the accuracies on the training set, calibration set and prediction set of the best classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish a diagnostic model for different kinds of pesticide residues on lettuce leaves.
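
    The SNV part of the preprocessing can be illustrated briefly: each spectrum is centered by its own mean and scaled by its own standard deviation. The sketch below covers only that scaling step, under the assumption that the sample standard deviation (n - 1 denominator) is used; the detrending and Savitzky-Golay smoothing steps mentioned in the abstract are omitted, and the sample spectrum is invented.

```python
import statistics


def snv(spectrum):
    """Standard normal variate scaling of a single spectrum:
    subtract the spectrum's own mean and divide by its own standard deviation."""
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean) / std for value in spectrum]


# Hypothetical fluorescence intensities at consecutive wavelengths.
raw_spectrum = [12.0, 15.5, 21.0, 30.2, 27.8, 19.4, 14.1]
scaled_spectrum = snv(raw_spectrum)
```

    Because every spectrum is scaled independently, SNV removes multiplicative and additive intensity differences between samples without needing a reference spectrum.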

  20. Semi-supervised Learning for Classification of Polarimetric SAR Images Based on SVM-Wishart

    Directory of Open Access Journals (Sweden)

    Hua Wen-qiang

    2015-02-01

    In this study, we propose a new semi-supervised classification method for Polarimetric SAR (PolSAR) images, aimed at the case where the training set is small. First, considering the scattering characteristics of PolSAR data, the method extracts multiple scattering features using a target decomposition approach. Then, a semi-supervised learning model is established based on a co-training framework and Support Vector Machine (SVM). Both labeled and unlabeled data are utilized in this model to obtain high classification accuracy. Finally, a recovery scheme based on the Wishart classifier is proposed to improve the classification performance. The experiments conducted in this study show that the proposed method performs more effectively than other traditional methods when the training set is small.

  1. Multitask SVM learning for remote sensing data classification

    Science.gov (United States)

    Leiva-Murillo, Jose M.; Gómez-Chova, Luis; Camps-Valls, Gustavo

    2010-10-01

    Many remote sensing data processing problems are inherently constituted by several tasks that can be solved either individually or jointly. For instance, each image in a multitemporal classification setting could be taken as an individual task, but its relation to previous acquisitions should be properly considered. In such problems, different modalities of the data (temporal, spatial, angular) give rise to changes between the training and test distributions, which constitutes a difficult learning problem known as covariate shift. Multitask learning methods aim at jointly solving a set of prediction problems in an efficient way by sharing information across tasks. This paper presents a novel kernel method for multitask learning in remote sensing data classification. The proposed method alleviates the dataset shift problem by imposing cross-information in the classifiers through matrix regularization. We consider the support vector machine (SVM) as the core learner, and two regularization schemes are introduced: 1) the Euclidean distance of the predictors in the Hilbert space; and 2) the inclusion of relational operators between tasks. Experiments are conducted on the challenging remote sensing problems of cloud screening from multispectral MERIS images and landmine detection.

  2. A self-trained semisupervised SVM approach to the remote sensing land cover classification

    Science.gov (United States)

    Liu, Ying; Zhang, Bai; Wang, Li-min; Wang, Nan

    2013-09-01

    Support vector machines (SVM) are receiving increasing attention in remote sensing applications, although the technique is very sensitive to parameter settings and training set definition. Self-training is an effective semisupervised method that can reduce the effort needed to prepare the training set by training the model with a small number of labeled examples and an additional set of unlabeled examples. In this study, a novel semisupervised SVM model that uses a self-training approach is proposed to address the problem of remote sensing land cover classification. The key characteristics of this approach are that (1) the self-adaptive mutation particle swarm optimization algorithm is introduced to obtain the optimum parameters that improve the generalization performance of the SVM classifier, and (2) the Gustafson-Kessel fuzzy clustering algorithm is proposed for the selection of unlabeled points to reduce the impact of ineffective labels. The effectiveness of the proposed technique is evaluated first with samples from remote sensing data and then by identifying different land cover regions in remote sensing imagery. Experimental results show that accuracy is increased by applying this learning scheme, which results in the smallest generalization error compared with the other schemes.

  3. PolSAR Land Cover Classification Based on Hidden Polarimetric Features in Rotation Domain and SVM Classifier

    Science.gov (United States)

    Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.

    2017-09-01

    Land cover classification is an important application of polarimetric synthetic aperture radar (PolSAR) data. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work first focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed

  4. POLSAR LAND COVER CLASSIFICATION BASED ON HIDDEN POLARIMETRIC FEATURES IN ROTATION DOMAIN AND SVM CLASSIFIER

    Directory of Open Access Journals (Sweden)

    C.-S. Tao

    2017-09-01

    Land cover classification is an important application of polarimetric synthetic aperture radar (PolSAR) data. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets’ scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work first focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy

  5. Feature selection based on SVM significance maps for classification of dementia

    NARCIS (Netherlands)

    E.E. Bron (Esther); M. Smits (Marion); J.C. van Swieten (John); W.J. Niessen (Wiro); S. Klein (Stefan)

    2014-01-01

    Support vector machine significance maps (SVM p-maps) previously showed clusters of significantly different voxels in dementia-related brain regions. We propose a novel feature selection method for classification of dementia based on these p-maps. In our approach, the SVM p-maps are

  6. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    Science.gov (United States)

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  7. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier

    Directory of Open Access Journals (Sweden)

    Qiang Li

    2017-01-01

    Full Text Available Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  8. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: the Support Vector Machine (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  9. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Science.gov (United States)

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: the Support Vector Machine (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933

  10. Expected Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2005-08-01

    Full Text Available Every time we make a classification based on a test score, we should expect some number of misclassifications. Some examinees whose true ability is within a score range will have observed scores outside of that range. A procedure for providing a classification table of true and expected scores is developed for polytomously scored items under item response theory and applied to state assessment data. A simplified procedure for estimating the table entries is also presented.

  11. Arrhythmia classification using SVM with selected features | Kohli ...

    African Journals Online (AJOL)

    The various types of arrhythmias in the cardiac arrhythmias ECG database chosen from University of California at Irvine (UCI) to train SVM include ischemic changes (coronary artery disease), old inferior myocardial infarction, sinus bradycardia, right bundle branch block, and others. ECG arrhythmia datasets are of generally ...

  12. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, convergence speed to maximum accuracy, and maximum bitrates in a brain computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered in order to eliminate high frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients, and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the ν-support vector machine (ν-SVM), are applied for classification. The performance of the proposed algorithm is compared with Fisher linear discriminant analysis (FLDA). Results show that the proposed algorithm is much better at improving classification accuracy and convergence speed to maximum accuracy. For example, with the proposed algorithms, BLDA with the LPC model and ν-SVM with the LPC model, in an 8-electrode configuration for subject S1, the total classification accuracy is improved by 9.4% and 1.7%, respectively. Also, for subject 7, the BLDA and ν-SVM algorithms with the LPC model (LPC+BLDA and LPC+ν-SVM) converged to maximum accuracy after the 11th block, whereas the Fisher linear discriminant analysis (FLDA) algorithm did not converge to maximum accuracy (with the same configuration). So, it can be used as a promising tool in designing BCI systems.

  13. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method for screening polyps by highly trained physicians. Polyps missed in colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework based on three convolution and pooling layers is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods. It achieves over 90% classification accuracy, sensitivity, specificity and precision.

  14. Iterative Reweighted Noninteger Norm Regularizing SVM for Gene Expression Data Classification

    Directory of Open Access Journals (Sweden)

    Jianwei Liu

    2013-01-01

    Full Text Available Support vector machine is an effective classification and regression method that uses machine learning theory to maximize the predictive accuracy while avoiding overfitting of data. L2 regularization has been commonly used. If the training dataset contains many noise variables, L1 regularization SVM will provide a better performance. However, neither L1 nor L2 is the optimal regularization method when handling a large number of redundant values where only a small number of data points is useful for machine learning. We have therefore proposed an adaptive learning algorithm using the iterative reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 was able to produce a better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm was able to achieve the optimal prediction error using a p value less than 1 (the L1 norm). Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.

  15. Iterative reweighted noninteger norm regularizing SVM for gene expression data classification.

    Science.gov (United States)

    Liu, Jianwei; Li, Shuang Cheng; Luo, Xionglin

    2013-01-01

    Support vector machine is an effective classification and regression method that uses machine learning theory to maximize the predictive accuracy while avoiding overfitting of data. L2 regularization has been commonly used. If the training dataset contains many noise variables, L1 regularization SVM will provide a better performance. However, neither L1 nor L2 is the optimal regularization method when handling a large number of redundant values where only a small number of data points is useful for machine learning. We have therefore proposed an adaptive learning algorithm using the iterative reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 was able to produce a better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm was able to achieve the optimal prediction error using a p value less than 1 (the L1 norm). Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.
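
    To make the role of p concrete, the following is a minimal sketch of an Lp-type penalty, the sum of |w_j|^p over the weights. It is not the authors' iterative reweighting algorithm; the function name lp_penalty and the two example weight vectors are assumptions chosen only for illustration.

```python
import math


def lp_penalty(weights, p):
    """Return sum(|w_j| ** p), an Lp-type penalty for 0 < p <= 2.

    Illustrative helper only; the abstract does not give the exact
    objective or the reweighting scheme used by the authors.
    """
    if not 0 < p <= 2:
        raise ValueError("p must satisfy 0 < p <= 2")
    return sum(abs(w) ** p for w in weights)


# Two hypothetical weight vectors with the same Euclidean length:
# one sparse (a single active feature), one dense (two active features).
sparse = [1.0, 0.0]
dense = [math.sqrt(0.5), math.sqrt(0.5)]

for p in (2.0, 1.0, 0.8):
    print(p, lp_penalty(sparse, p), lp_penalty(dense, p))
```

    Under the p = 2 penalty the two vectors cost about the same, while for p = 1 and p = 0.8 the dense vector is penalized more, which is the usual intuition for why smaller p tends to favor sparser feature sets.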

  16. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao

    2016-04-01

    Full Text Available With the enrichment of perception methods, the modern transportation system has many physical objects whose states are influenced by many information factors, so it is a typical Cyber-Physical System (CPS). Thus, the traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research results show that multi-sourced traffic information, when accurately classified in the process of information fusion, can achieve better parameter forecasting performance. To solve the problem of accurate traffic information classification, a multi-classification method using an improved SVM in information fusion for traffic parameter forecasting is proposed; it analyses the characteristics of the multi-sourced traffic information and uses a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion. An experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method can produce more accurate and practical outcomes.

  17. Classification Accuracy Is Not Enough

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    2013-01-01

    A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically...

  18. SVM-Based Classification of Segmented Airborne LiDAR Point Clouds in Urban Areas

    OpenAIRE

    Xiaogang Ning; Xiangguo Lin; Jixian Zhang

    2013-01-01

    Object-based point cloud analysis (OBPA) is useful for information extraction from airborne LiDAR point clouds. An object-based classification method is proposed for classifying the airborne LiDAR point clouds in urban areas herein. In the process of classification, the surface growing algorithm is employed to make clustering of the point clouds without outliers, thirteen features of the geometry, radiometry, topology and echo characteristics are calculated, a support vector machine (SVM) is ...

  19. Epileptic seizure classifications of single-channel scalp EEG data using wavelet-based features and SVM.

    Science.gov (United States)

    Janjarasjitt, Suparerk

    2017-02-13

    In this study, wavelet-based features of single-channel scalp EEGs recorded from subjects with intractable seizure are examined for epileptic seizure classification. The wavelet-based features extracted from scalp EEGs are simply based on detail and approximation coefficients obtained from the discrete wavelet transform. Support vector machine (SVM), one of the most commonly used classifiers, is applied to classify vectors of wavelet-based features of scalp EEGs into either seizure or non-seizure class. In patient-based epileptic seizure classification, a training data set used to train SVM classifiers is composed of wavelet-based features of scalp EEGs corresponding to the first epileptic seizure event. Overall, the excellent performance on patient-dependent epileptic seizure classification is obtained with the average accuracy, sensitivity, and specificity of, respectively, 0.9687, 0.7299, and 0.9813. The vector composed of two wavelet-based features of scalp EEGs provide the best performance on patient-dependent epileptic seizure classification in most cases, i.e., 19 cases out of 24. The wavelet-based features corresponding to the 32-64, 8-16, and 4-8 Hz subbands of scalp EEGs are the mostly used features providing the best performance on patient-dependent classification. Furthermore, the performance on both patient-dependent and patient-independent epileptic seizure classifications are also validated using tenfold cross-validation. From the patient-independent epileptic seizure classification validated using tenfold cross-validation, it is shown that the best classification performance is achieved using the wavelet-based features corresponding to the 64-128 and 4-8 Hz subbands of scalp EEGs.
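
    The gap between the reported accuracy and sensitivity is typical of imbalanced data, where non-seizure segments dominate. The sketch below computes the three measures from confusion-matrix counts; the counts are hypothetical and are not taken from the study, and the function name binary_metrics is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn count the positive (e.g. seizure) class,
    tn/fp count the negative (e.g. non-seizure) class.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical, imbalanced counts: 100 positive and 1000 negative segments.
metrics = binary_metrics(tp=73, fn=27, tn=981, fp=19)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

    With these made-up counts the accuracy stays high even though more than a quarter of the positive segments are missed, because the negative class contributes most of the correct decisions.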

  20. Classification of surface defects on bridge cable based on PSO-SVM

    Science.gov (United States)

    Li, Xinke; Gao, Chao; Guo, Yongcai; Shao, Yanhua; He, Fuliang

    2014-07-01

    A distributed machine vision system was applied to detect surface defects on the cables of a cable-stayed bridge, capturing defects including longitudinal cracking, transverse cracking, surface erosion, scarring pit holes and other scars. In order to achieve the automatic classification of surface defects, firstly, some of the texture features, gray features and shape features of the defect image were selected as the classification features; then particle swarm optimization (PSO) was introduced to optimize the penalty coefficient and kernel function parameter of the support vector machine (SVM) model; and finally the defects were identified with the help of the PSO-SVM classifier. Recognition experiments were performed on cable surface defects, yielding a recognition rate of 96.25 percent. The results showed that PSO-SVM has a high recognition rate for classification of surface defects on bridge cable.

  1. SVM-Based Classification of Segmented Airborne LiDAR Point Clouds in Urban Areas

    Directory of Open Access Journals (Sweden)

    Xiaogang Ning

    2013-07-01

    Full Text Available Object-based point cloud analysis (OBPA) is useful for information extraction from airborne LiDAR point clouds. An object-based classification method is proposed for classifying the airborne LiDAR point clouds in urban areas herein. In the process of classification, the surface growing algorithm is employed to cluster the point clouds without outliers, thirteen features of the geometry, radiometry, topology and echo characteristics are calculated, a support vector machine (SVM) is utilized to classify the segments, and connected component analysis for 3D point clouds is proposed to optimize the original classification results. Three datasets with different point densities and complexities are employed to test our method. Experiments suggest that the proposed method is capable of classifying the urban point clouds with an overall classification accuracy larger than 92.34% and a Kappa coefficient larger than 0.8638, and that the classification accuracy improves as the point density increases, which is meaningful for various types of applications.
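
    Overall accuracy and the Kappa coefficient quoted above are both derived from a confusion matrix. The following sketch shows the standard Cohen's kappa computation; the 3-class matrix is hypothetical and not data from the paper, and the function name is an assumption.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i
    that were assigned to class j.
    """
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for three classes.
cm = [
    [50, 2, 3],
    [4, 40, 1],
    [1, 2, 47],
]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.4f}")
```

    Kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance given the class proportions.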

  2. A Comparison Study of Different Kernel Functions for SVM-Based Classification of Multi-Temporal Polarimetry SAR Data

    Science.gov (United States)

    Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.

    2014-10-01

    In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.
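
    The three kernel families compared in this record can be written down in a few lines. This is a minimal sketch using common textbook definitions; the parameter names gamma, coef0 and degree and their default values are assumptions, since the abstract does not state the settings used.

```python
import math


def linear_kernel(x, y):
    """Plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; degree=3 mirrors the
    3rd degree polynomial kernel mentioned in the abstract."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); equals 1 for identical inputs."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors (e.g. alpha features at two dates).
u = [1.0, 2.0]
v = [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v), rbf_kernel(u, v))
```

    Each function returns a similarity value for a pair of samples; an SVM only ever sees the data through these pairwise values, which is why swapping the kernel changes the shape of the decision boundary without changing the rest of the training procedure.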

  3. A COMPARISON STUDY OF DIFFERENT KERNEL FUNCTIONS FOR SVM-BASED CLASSIFICATION OF MULTI-TEMPORAL POLARIMETRY SAR DATA

    Directory of Open Access Journals (Sweden)

    B. Yekkehkhany

    2014-10-01

    Full Text Available In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.

  4. Using Generalized Entropies and OC-SVM with Mahalanobis Kernel for Detection and Classification of Anomalies in Network Traffic

    Directory of Open Access Journals (Sweden)

    Jayro Santiago-Paz

    2015-09-01

    Full Text Available Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed, among them, the Anomaly-Network Intrusion Detection System (A-NIDS), which monitors network traffic and compares it against an established baseline of a “normal” traffic profile. Then, it is necessary to characterize the “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on Shannon, Rényi and Tsallis entropies of selected features, and the construction of regions from entropy data employing the Mahalanobis distance (MD), and One Class Support Vector Machine (OC-SVM) with different kernels (Radial Basis Function (RBF) and Mahalanobis Kernel (MK)) for “normal” and abnormal traffic. Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while the classification is performed under the assumption that regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. In order to evaluate our approach, two different data sets were used: one set of real traffic obtained from an Academic Local Area Network (LAN), and the other a subset of the 1998 MIT-DARPA set. For these data sets, a true positive rate up to 99.35%, a true negative rate up to 99.83% and a false negative rate at about 0.16% were yielded. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with RBF kernel improve the detection rate in the detection stage, while the novel inclusion of MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that using the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with
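
    The three entropies named in this record have short closed forms for a discrete probability distribution. The sketch below uses the standard definitions with natural logarithms; the function names and the example distribution are assumptions, and the paper's feature extraction and q-values are not reproduced here.

```python
import math


def shannon_entropy(probs):
    """H = -sum(p * ln p), skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def renyi_entropy(probs, q):
    """H_q = ln(sum(p ** q)) / (1 - q), defined for q > 0, q != 1."""
    if q <= 0 or q == 1:
        raise ValueError("q must be positive and different from 1")
    return math.log(sum(p ** q for p in probs if p > 0)) / (1 - q)


def tsallis_entropy(probs, q):
    """S_q = (1 - sum(p ** q)) / (q - 1), defined for q != 1."""
    if q == 1:
        raise ValueError("q must be different from 1")
    return (1 - sum(p ** q for p in probs if p > 0)) / (q - 1)


# Hypothetical distribution of one traffic feature over four values.
uniform = [0.25, 0.25, 0.25, 0.25]
print(shannon_entropy(uniform), renyi_entropy(uniform, 2), tsallis_entropy(uniform, 2))
```

    For a uniform distribution the Shannon and Rényi entropies coincide at ln 4, while a skewed distribution (for example, traffic concentrated on one port) yields lower values; the parameter q controls how strongly the generalized entropies weight frequent versus rare values.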

  5. Effects of atmospheric correction and pansharpening on LULC classification accuracy using WorldView-2 imagery

    Directory of Open Access Journals (Sweden)

    Chinsu Lin

    2015-05-01

    Full Text Available Changes of Land Use and Land Cover (LULC) affect atmospheric, climatic, and biological spheres of the earth. An accurate LULC map offers detailed information for resources management and intergovernmental cooperation to debate global warming and biodiversity reduction. This paper examined effects of pansharpening and atmospheric correction on LULC classification. Object-Based Support Vector Machine (OB-SVM) and Pixel-Based Maximum Likelihood Classifier (PB-MLC) were applied for LULC classification. Results showed that atmospheric correction is not necessary for LULC classification if it is conducted in the original multispectral image. Nevertheless, pansharpening plays a much more important role in the classification accuracy than the atmospheric correction. It can help to increase classification accuracy by 12% on average compared to the ones without pansharpening. PB-MLC and OB-SVM achieved similar classification rate. This study indicated that the LULC classification accuracy using PB-MLC and OB-SVM is 82% and 89% respectively. A combination of atmospheric correction, pansharpening, and OB-SVM could offer promising LULC maps from WorldView-2 multispectral and panchromatic images.

  6. SVM-based classification of LV wall motion in cardiac MRI with the assessment of STE

    Science.gov (United States)

    Mantilla, Juan; Garreau, Mireille; Bellanger, Jean-Jacques; Paredes, José Luis

    2015-01-01

    In this paper, we propose an automated method to classify normal/abnormal wall motion in Left Ventricle (LV) function in cardiac cine-Magnetic Resonance Imaging (MRI), taking as reference strain information obtained from 2D Speckle Tracking Echocardiography (STE). Without the need for pre-processing and by exploiting all the images acquired during a cardiac cycle, spatio-temporal profiles are extracted from a subset of radial lines from the ventricle centroid to points outside the epicardial border. Classical Support Vector Machines (SVM) are used to classify features extracted from gray levels of the spatio-temporal profile as well as their representations in the Wavelet domain under the assumption that the data may be sparse in that domain. Based on information obtained from radial strain curves in 2D-STE studies, we label all the spatio-temporal profiles that belong to a particular segment as normal if the peak systolic radial strain curve of this segment presents normal kinesis, or abnormal if the peak systolic radial strain curve presents hypokinesis or akinesis. For this study, short-axis cine-MR images are collected from 9 patients with cardiac dyssynchrony for which we have the radial strain tracings at the mid-papillary muscle obtained by 2D STE, and from one control group formed by 9 healthy subjects. The best classification performance is obtained with the gray level information of the spatio-temporal profiles using an RBF kernel, with 91.88% accuracy, 92.75% sensitivity and 91.52% specificity.

  7. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. Text pre-processing plays a major role in a text classifier, and efficient pre-processing techniques increase its performance. In this paper, we implement the ECAS stemmer, Efficient Instance Selection and a Pre-computed Kernel Support Vector Machine for text classification using recent research articles. We use the ECAS stemmer to find root words, Efficient Instance Selection for dimensionality reduction of text data, and the Pre-computed Kernel Support Vector Machine for classification of the selected instances. Experiments were performed on 750 research articles in three classes: engineering articles, medical articles and educational articles. The EIS-SVM classifier provides better performance in real-time research article classification.

  8. Using self-organizing map (SOM) and support vector machine (SVM) for classification of selectivity of ACAT inhibitors.

    Science.gov (United States)

    Wang, Ling; Wang, Maolin; Yan, Aixia; Dai, Bin

    2013-02-01

    Using a self-organizing map (SOM) and support vector machine, two classification models were built to predict whether a compound is a selective inhibitor toward the two Acyl-coenzyme A: cholesterol acyltransferase (ACAT) isozymes, ACAT-1 and ACAT-2. A dataset of 97 ACAT inhibitors was collected. For each molecule, the global descriptors, 2D and 3D property autocorrelation descriptors and autocorrelation of surface properties were calculated with the program ADRIANA.Code. The prediction accuracies of the models on the test sets (based on the training/test set splitting by the SOM method) are 88.9% for the SOM1 model and 92.6% for the SVM1 model. In addition, the extended connectivity fingerprints (ECFP_4) for all the molecules were calculated and the structure-activity relationship of selective ACAT inhibitors was summarized, which may help find important structural features of inhibitors relating to the selectivity of ACAT isozymes.

  9. SVM and ANN Based Classification of Plant Diseases Using Feature Reduction Technique

    Directory of Open Access Journals (Sweden)

    Jagadeesh D.Pujari

    2016-06-01

    Full Text Available Computers have been used for mechanization and automation in different applications of agriculture/horticulture. The critical decision on the agricultural yield and plant protection is done with the development of expert system (decision support system) using computer vision techniques. One of the areas considered in the present work is the processing of images of plant diseases affecting agriculture/horticulture crops. The first symptoms of plant disease have to be correctly detected, identified, and quantified in the initial stages. The color and texture features have been used in order to work with the sample images of plant diseases. Algorithms for extraction of color and texture features have been developed, which are in turn used to train support vector machine (SVM) and artificial neural network (ANN) classifiers. The study has presented a reduced feature set based approach for recognition and classification of images of plant diseases. The results reveal that SVM classifier is more suitable for identification and classification of plant diseases affecting agriculture/horticulture crops.

  10. A structural SVM approach for reference parsing.

    Science.gov (United States)

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual reference to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure in references enables us to consider reference parsing a sequence learning problem and to study structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at token- and chunk-levels. When only basic observation features are used for each token, structural SVM achieves higher performance compared to SVM since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly, and is close to that of structural SVM after adding the second order contextual observation features. The comparison of these two methods with CRF using the same set of binary features show that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.

  11. A Fast SVM-Based Tongue’s Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Directory of Open Access Journals (Sweden)

    Nur Diyana Kamarudin

    2017-01-01

    In tongue diagnosis, colour information of the tongue body carries valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to unstable lighting conditions and the naked eye’s limited ability to capture the exact colour distribution on the tongue, especially a tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue’s multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  12. CLASSIFICATION ACCURACY INCREASE USING MULTISENSOR DATA FUSION

    Directory of Open Access Journals (Sweden)

    A. Makarau

    2012-09-01

    The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.) but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (comparing to multispectral data) restrict their usage for many applications. Another improvement can be achieved by fusion approaches of multisensory data since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allow for different types of urban objects to be classified into predefined classes of interest with increased accuracy. The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided and the numerical evaluation of the method in

  13. Classification Accuracy Increase Using Multisensor Data Fusion

    Science.gov (United States)

    Makarau, A.; Palubinskas, G.; Reinartz, P.

    2011-09-01

    The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.) but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (comparing to multispectral data) restrict their usage for many applications. Another improvement can be achieved by fusion approaches of multisensory data since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allow for different types of urban objects to be classified into predefined classes of interest with increased accuracy. The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided and the numerical evaluation of the method in comparison to

  14. Energy Management in Wireless Sensor Networks Based on Naive Bayes, MLP, and SVM Classifications: A Comparative Study

    Directory of Open Access Journals (Sweden)

    Abdulaziz Y. Barnawi

    2016-01-01

    Maximizing wireless sensor networks (WSNs) lifetime is a primary objective in the design of these networks. Intelligent energy management models can assist designers to achieve this objective. These models aim to reduce the number of selected sensors to report environmental measurements and, hence, achieve higher energy efficiency while maintaining the desired level of accuracy in the reported measurement. In this paper, we present a comparative study of three intelligent models based on Naive Bayes, Multilayer Perceptrons (MLP), and Support Vector Machine (SVM) classifiers. Simulation results show that Linear-SVM selects sensors that produce higher energy efficiency compared to those selected by MLP and Naive Bayes for the same WSNs Lifetime Extension Factor.

  15. Robust optimization of SVM hyperparameters in the classification of bioactive compounds.

    Science.gov (United States)

    Czarnecki, Wojciech M; Podlewska, Sabina; Bojarski, Andrzej J

    2015-01-01

    Support Vector Machine has become one of the most popular machine learning tools used in virtual screening campaigns aimed at finding new drug candidates. Although it can be extremely effective in finding new potentially active compounds, its application requires the optimization of the hyperparameters with which the assessment is being run, particularly the C and [Formula: see text] values. The optimization requirement, in turn, establishes the need to develop fast and effective approaches to the optimization procedure, providing the best predictive power of the constructed model. In this study, we investigated the Bayesian and random search optimization of Support Vector Machine hyperparameters for classifying bioactive compounds. The effectiveness of these strategies was compared with the most popular optimization procedures: grid search and heuristic choice. We demonstrated that Bayesian optimization not only provides better, more efficient classification but is also much faster: the number of iterations it required for reaching optimal predictive performance was the lowest of all the tested optimization methods. Moreover, for the Bayesian approach, the choice of parameters in subsequent iterations is directed and justified; therefore, the results obtained by using it are constantly improved and the range of hyperparameters tested provides the best overall performance of Support Vector Machine. Additionally, we showed that a random search optimization of hyperparameters leads to significantly better performance than grid search and heuristic-based approaches. The Bayesian approach to the optimization of Support Vector Machine parameters was demonstrated to outperform other optimization methods for tasks concerned with the bioactivity assessment of chemical compounds. This strategy not only provides a higher accuracy of classification, but is also much faster and more directed than other approaches for optimization. It appears that, despite its simplicity

  16. THE APPLICATION OF SUPPORT VECTOR MACHINE (SVM) USING CIELAB COLOR MODEL, COLOR INTENSITY AND COLOR CONSTANCY AS FEATURES FOR ORTHO IMAGE CLASSIFICATION OF BENTHIC HABITATS IN HINATUAN, SURIGAO DEL SUR, PHILIPPINES

    Directory of Open Access Journals (Sweden)

    J. E. Cubillas

    2016-06-01

    This study demonstrates the application of CIELAB, Color intensity, and One Dimensional Scalar Constancy as features for image recognition and classifying benthic habitats in an image with the coastal areas of Hinatuan, Surigao Del Sur, Philippines as the study area. The study area is composed of four datasets, namely: (a) Blk66L005, (b) Blk66L021, (c) Blk66L024, and (d) Blk66L0114. SVM optimization was performed in Matlab® software with the help of Parallel Computing Toolbox to hasten the SVM computing speed. The image used for collecting samples for SVM procedure was Blk66L0114 in which a total of 134,516 sample objects of mangrove, possible coral existence with rocks, sand, sea, fish pens and sea grasses were collected and processed. The collected samples were then used as training sets for the supervised learning algorithm and for the creation of class definitions. The learned hyper-planes separating one class from another in the multi-dimensional feature space can be thought of as a super feature which will then be used in developing the C (classifier) rule set in eCognition® software. The classification results of the sampling site yielded an accuracy of 98.85% which confirms the reliability of remote sensing techniques and analysis employed to orthophotos like the CIELAB, Color Intensity and One dimensional scalar constancy and the use of SVM classification algorithm in classifying benthic habitats.

  17. A two-dimensional matrix image based feature extraction method for classification of sEMG: A comparative analysis based on SVM, KNN and RBF-NN.

    Science.gov (United States)

    Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen

    2017-01-01

    The computer mouse is an important human-computer interaction device. But patients with physical finger disability are unable to operate this device. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and is a reflection of the neuromuscular activities. Therefore, we can control limbs auxiliary equipment by utilizing sEMG classification in order to help the physically disabled patients to operate the mouse. The aim is to develop a new method to extract sEMG generated by finger motion and to apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electrodes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on time domain or frequency domain, was employed to transform signal samples to feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, the machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is a convenient, efficient and quick method, which can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method extracts features by enlarging sample signals' energy appropriately. The classical machine learning classifiers all performed well by using these features.

  18. Approche de sélection d’attributs pour la classification basée sur l’algorithme RFE-SVM

    OpenAIRE

    Slimani, yahya; Essegir, Mohamed Amir; Samb, Mouhamadou Lamine; Camara, Fodé; Ndiaye, Samba

    2014-01-01

    The feature selection for classification is a very active research field in data mining and optimization. Its combinatorial nature requires the development of specific techniques (such as filters, wrappers, genetic algorithms, and so on) or hybrid approaches combining several optimization methods. In this context, the support vector machine recursive feature elimination (SVM-RFE), is distinguished as one of the most effective methods. However, the RFE-SVM algorithm is ...

  19. Automated Classification and Removal of EEG Artifacts with SVM and Wavelet-ICA.

    Science.gov (United States)

    Sai, Chong Yeh; Mokhtar, Norrima; Arof, Hamzah; Cumming, Paul; Iwahashi, Masahiro

    2017-07-04

    Brain electrical activity recordings by electroencephalography (EEG) are often contaminated with signal artifacts. Procedures for automated removal of EEG artifacts are frequently sought for clinical diagnostics and brain computer interface (BCI) applications. In recent years, a combination of independent component analysis (ICA) and discrete wavelet transform (DWT) has been introduced as standard technique for EEG artifact removal. However, in performing the wavelet-ICA procedure, visual inspection or arbitrary thresholding may be required for identifying artifactual components in the EEG signal. We now propose a novel approach for identifying artifactual components separated by wavelet-ICA using a pre-trained support vector machine (SVM). Our method presents a robust and extendable system that enables fully automated identification and removal of artifacts from EEG signals, without applying any arbitrary thresholding. Using test data contaminated by eye blink artifacts, we show that our method performed better in identifying artifactual components than did existing thresholding methods. Furthermore, wavelet-ICA in conjunction with SVM successfully removed target artifacts, while largely retaining the EEG source signals of interest. We propose a set of features including kurtosis, variance, Shannon's entropy and range of amplitude as training and test data of SVM to identify eye blink artifacts in EEG signals. This combinatorial method is also extendable to accommodate multiple types of artifacts present in multi-channel EEG. We envision future research to explore other descriptive features corresponding to other types of artifactual components.
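
    The four features named in the abstract above (kurtosis, variance, Shannon's entropy, and range of amplitude) can be computed per independent component along the following lines. This is a minimal sketch: the function names, the histogram-based entropy estimate, and the bin count are assumptions, since the source does not say how the authors estimated each quantity.

```python
import math
from collections import Counter


def variance(x):
    """Population variance of a signal."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    m = sum(x) / len(x)
    m4 = sum((v - m) ** 4 for v in x) / len(x)
    return m4 / variance(x) ** 2


def shannon_entropy(x, bins=8):
    """Entropy in bits of an equal-width histogram (bin count is an assumption)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid a zero bin width for flat signals
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in x)
    n = len(x)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def amplitude_range(x):
    """Peak-to-peak amplitude."""
    return max(x) - min(x)


def feature_vector(component):
    """Feature vector for one component, in the order listed in the abstract."""
    return [
        kurtosis(component),
        variance(component),
        shannon_entropy(component),
        amplitude_range(component),
    ]


# Hypothetical component: a flat trace with one blink-like spike.
component = [0.0, 0.1, -0.1, 0.0, 5.0, 0.1, -0.1, 0.0]
print(feature_vector(component))
```

    A spiky component such as an eye blink tends to show high kurtosis and a large amplitude range, which is what makes these features plausible inputs for a classifier.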

  20. Improving Accuracy of Image Classification Using GIS

    Science.gov (United States)

    Gupta, R. K.; Prasad, T. S.; Bala Manikavelu, P. M.; Vijayan, D.

    The Remote Sensing signal which reaches the sensor on-board the satellite is the complex aggregation of signals (in agriculture field for example) from soil (with all its variations such as colour, texture, particle size, clay content, organic and nutrition content, inorganic content, water content etc.), plant (height, architecture, leaf area index, mean canopy inclination etc.), canopy closure status and atmospheric effects, and from this we want to find, say, characteristics of vegetation. If the sensor on-board the satellite makes measurements in n bands (n of n*1 dimension) and the number of classes in an image is c (f of c*1 dimension), then considering linear mixture modeling the pixel classification problem could be written as n = m*f + e, where m is the transformation matrix of (n*c) dimension and e represents the error vector (noise). The problem is to estimate f by inverting the above equation and the possible solutions for such a problem are many. Thus, getting back individual classes from satellite data is an ill-posed inverse problem for which a unique solution is not feasible and this puts a limit on the obtainable classification accuracy. Maximum Likelihood (ML) is the constraint mostly practiced in solving such a situation, which suffers from the handicaps of assumed Gaussian distribution and random nature of pixels (in fact there is high auto-correlation among the pixels of a specific class and further high auto-correlation among the pixels in sub-classes where the homogeneity would be high among pixels). Due to this, achieving very high accuracy in the classification of remote sensing images is not a straightforward proposition. With the availability of the GIS for the area under study (i) a priori probability for different classes could be assigned to ML classifier in more realistic terms and (ii) the purity of training sets for different thematic classes could be better ascertained. To what extent this could improve the accuracy of classification in ML classifier

  1. Accuracy to detection timing for assisting repetitive facilitation exercise system using MRCP and SVM.

    Science.gov (United States)

    Miura, Satoshi; Takazawa, Junichi; Kobayashi, Yo; Fujie, Masakatsu G

    2017-01-01

    This paper presents a feasibility study of a brain-machine interface system to assist repetitive facilitation exercise. Repetitive facilitation exercise is an effective rehabilitation method for patients with hemiplegia. In repetitive facilitation exercise, a therapist stimulates the paralyzed part of the patient while motor commands run along the nerve pathway. However, successful repetitive facilitation exercise is difficult to achieve and even a skilled practitioner cannot detect when a motor command occurs in patient's brain. We proposed a brain-machine interface system for automatically detecting motor commands and stimulating the paralyzed part of a patient. To determine motor commands from patient electroencephalogram (EEG) data, we measured the movement-related cortical potential (MRCP) and constructed a support vector machine system. In this paper, we validated the prediction timing of the system at the highest accuracy by the system using EEG and MRCP. In the experiments, we measured the EEG when the participant bent their elbow when prompted to do so. We analyzed the EEG data using a cross-validation method. We found that the average accuracy was 72.9% and the highest at the prediction timing 280 ms. We conclude that 280 ms is the most suitable to predict the judgment that a patient intends to exercise or not.
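
    The cross-validated average accuracy reported above can be illustrated with a short sketch of k-fold evaluation. Everything here is an assumption made for illustration: the fold layout, the toy one-dimensional data, and in particular the stand-in threshold classifier, which is not an SVM and is used only to keep the example self-contained.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(X, y, k, fit, predict):
    """Average held-out accuracy over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


# Stand-in classifier (NOT an SVM): threshold halfway between the class means.
def fit_threshold(X, y):
    mean0 = sum(x for x, t in zip(X, y) if t == 0) / y.count(0)
    mean1 = sum(x for x, t in zip(X, y) if t == 1) / y.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# Hypothetical feature values: class 1 ("intends to move") has larger values.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85]
y = [0, 1, 0, 1, 0, 1, 0, 1]
print(cross_validate(X, y, 4, fit_threshold, predict_threshold))
```

    To compare prediction timings as the study does, one would repeat this evaluation once per candidate timing and keep the timing with the highest average accuracy.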

  2. A multitemporal probabilistic error correction approach to SVM classification of alpine glacier exploiting sentinel-1 images (Conference Presentation)

    Science.gov (United States)

    Callegari, Mattia; Marin, Carlo; Notarnicola, Claudia; Carturan, Luca; Covi, Federico; Galos, Stephan; Seppi, Roberto

    2016-10-01

    In mountain regions and their forelands, glaciers are key source of melt water during the middle and late ablation season, when most of the winter snow has already melted. Furthermore, alpine glaciers are recognized as sensitive indicators of climatic fluctuations. Monitoring glacier extent changes and glacier surface characteristics (i.e. snow, firn and bare ice coverage) is therefore important for both hydrological applications and climate change studies. Satellite remote sensing data have been widely employed for glacier surface classification. Many approaches exploit optical data, such as from Landsat. Despite the intuitive visual interpretation of optical images and the demonstrated capability to discriminate glacial surface thanks to the combination of different bands, one of the main disadvantages of available high-resolution optical sensors is their dependence on cloud conditions and low revisit time frequency. Therefore, operational monitoring strategies relying only on optical data have serious limitations. Since SAR data are insensitive to clouds, they are potentially a valid alternative to optical data for glacier monitoring. Compared to past SAR missions, the new Sentinel-1 mission provides much higher revisit time frequency (two acquisitions each 12 days) over the entire European Alps, and this number will be doubled once the Sentinel1-b will be in orbit (April 2016). In this work we present a method for glacier surface classification by exploiting dual polarimetric Sentinel-1 data. The method consists of a supervised approach based on Support Vector Machine (SVM). In addition to the VV and VH signals, we tested the contribution of local incidence angle, extracted from a digital elevation model and orbital information, as auxiliary input feature in order to account for the topographic effects. By exploiting impossible temporal transition between different classes (e.g. if at a given date one pixel is classified as rock it cannot be classified as

  3. A Linear-RBF Multikernel SVM to Classify Big Text Corpora

    Directory of Open Access Journals (Sweden)

    R. Romero

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper shows a multikernel SVM to manage highly dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists in spreading the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.
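
    One generic way to combine a linear and an RBF kernel is a convex combination, sketched below. This is not the term-slice construction described in the abstract, which the source does not detail; the `weight` and `gamma` parameters and all function names are assumptions for illustration.

```python
import math


def linear_kernel(a, b):
    """Dot product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is a hypothetical width parameter."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_kernel(a, b, weight=0.5, gamma=0.5):
    """Convex combination of the two kernels, with weight in [0, 1]."""
    return weight * linear_kernel(a, b) + (1 - weight) * rbf_kernel(a, b, gamma)


def gram_matrix(X, kernel):
    """Pairwise kernel values, the form a kernel SVM solver consumes."""
    return [[kernel(a, b) for b in X] for a in X]


# Hypothetical term-frequency vectors for three short documents.
docs = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
for row in gram_matrix(docs, combined_kernel):
    print(row)
```

    A non-negative weighted sum of valid kernels is itself a valid kernel, so the resulting Gram matrix can be passed to any SVM solver that accepts precomputed kernels.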

  4. Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment

    Science.gov (United States)

    Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua

    2012-01-01

    This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…

  5. Dimensionality of ICA in resting-state fMRI investigated by feature optimized classification of independent components with SVM

    Science.gov (United States)

    Wang, Yanlu; Li, Tie-Qiang

    2015-01-01

    Different machine learning algorithms have recently been used for assisting automated classification of independent component analysis (ICA) results from resting-state fMRI data. The success of this approach relies on identification of artifact components and meaningful functional networks. A limiting factor of ICA is the uncertainty of the number of independent components (NIC). We aim to develop a framework based on support vector machines (SVM) and optimized feature-selection for automated classification of independent components (ICs) and use the framework to investigate the effects of input NIC on the ICA results. Seven different resting-state fMRI datasets were studied. 18 features were devised by mimicking the empirical criteria for manual evaluation. The five most significant (p […] NIC. Through tracking, we demonstrate that incrementing NIC affects most ICs when NIC […] NIC is incremented beyond NIC > 40. For a given IC, its changes with increasing NIC are individually specific irrespective whether the component is a potential resting-state functional network or an artifact component. Using FOCIS, we investigated experimentally the ICA dimensionality of resting-state fMRI datasets and found that the input NIC can critically affect the ICA results of resting-state fMRI data. PMID:26005413

  6. An SVM-based distal lung image classification using texture descriptors.

    Science.gov (United States)

    Désir, Chesner; Petitjean, Caroline; Heutte, Laurent; Thiberville, Luc; Salaün, Mathieu

    2012-06-01

    A novel imaging technique can now provide microscopic images of the distal lung in vivo, for which quantitative analysis tools need to be developed. In this paper, we present an image classification system that is able to discriminate between normal and pathological images. Different feature spaces for discrimination are investigated and evaluated using a support vector machine. Best classification rates reach up to 90% and 95% on non-smoker and smoker groups, respectively. A feature selection process is also implemented, that allows us to gain some insight about these images. Whereas further tests on extended databases are needed, these first results indicate that efficient computer based automated classification of normal vs. pathological images of the distal lung is feasible.

  7. Accuracy assessment between different image classification ...

    African Journals Online (AJOL)

    What image classification does is to assign pixel to a particular land cover and land use type that has the most similar spectral signature. However, there are possibilities that different methods or algorithms of image classification of the same data set could produce appreciable variant results in the sizes, shapes and areas of ...

  8. Accuracy of automated classification of major depressive disorder as a function of symptom severity

    Directory of Open Access Journals (Sweden)

    Rajamannar Ramasubbu, MD, FRCPC, MSc

    2016-01-01

    Conclusions: Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.

  9. Dimensionality of ICA in resting-state fMRI investigated by feature optimized classification of independent components with SVM.

    Science.gov (United States)

    Wang, Yanlu; Li, Tie-Qiang

    2015-01-01

    Different machine learning algorithms have recently been used for assisting automated classification of independent component analysis (ICA) results from resting-state fMRI data. The success of this approach relies on identification of artifact components and meaningful functional networks. A limiting factor of ICA is the uncertainty of the number of independent components (NIC). We aim to develop a framework based on support vector machines (SVM) and optimized feature-selection for automated classification of independent components (ICs) and use the framework to investigate the effects of input NIC on the ICA results. Seven different resting-state fMRI datasets were studied. 18 features were devised by mimicking the empirical criteria for manual evaluation. The five most significant (p […] ICA results. The classification results obtained using FOCIS and previously published FSL-FIX were compared against manually evaluated results. On average the false negative rate in identifying artifact contaminated ICs for FOCIS and FSL-FIX were 98.27 and 92.34%, respectively. The number of artifact and functional network components increased almost linearly with the input NIC. Through tracking, we demonstrate that incrementing NIC affects most ICs when NIC […] 40. For a given IC, its changes with increasing NIC are individually specific irrespective whether the component is a potential resting-state functional network or an artifact component. Using FOCIS, we investigated experimentally the ICA dimensionality of resting-state fMRI datasets and found that the input NIC can critically affect the ICA results of resting-state fMRI data.

  10. Automatic SVM classification of sudden cardiac death and pump failure death from autonomic and repolarization ECG markers.

    Science.gov (United States)

    Ramírez, Julia; Monasterio, Violeta; Mincholé, Ana; Llamedo, Mariano; Lenis, Gustavo; Cygankiewicz, Iwona; Bayés de Luna, Antonio; Malik, Marek; Martínez, Juan Pablo; Laguna, Pablo; Pueyo, Esther

    2015-01-01

    Considering the rates of sudden cardiac death (SCD) and pump failure death (PFD) in chronic heart failure (CHF) patients and the cost-effectiveness of their preventing treatments, identification of CHF patients at risk is an important challenge. In this work, we studied the prognostic performance of the combination of an index potentially related to dispersion of repolarization restitution (Δα), an index quantifying T-wave alternans (IAA) and the slope of heart rate turbulence (TS) for classification of SCD and PFD. Holter ECG recordings of 597 CHF patients with sinus rhythm enrolled in the MUSIC study were analyzed and Δα, IAA and TS were obtained. A strategy was implemented using support vector machines (SVM) to classify patients in three groups: SCD victims, PFD victims and other patients (the latter including survivors and victims of non-cardiac causes). Cross-validation was used to evaluate the performance of the implemented classifier. Δα and IAA, dichotomized at 0.035 (dimensionless) and 3.73 μV, respectively, were the ECG markers most strongly associated with SCD, while TS, dichotomized at 2.5 ms/RR, was the index most strongly related to PFD. When separating SCD victims from the rest of patients, the individual marker with best performance was Δα≥0.035, which, for a fixed specificity (Sp) of 90%, showed a sensitivity (Se) value of 10%, while the combination of Δα and IAA increased Se to 18%. For separation of PFD victims from the rest of patients, the best individual marker was TS ≤ 2.5 ms/RR, which, for Sp=90%, showed a Se of 26%, this value being lower than Se=34%, produced by the combination of Δα and TS. Furthermore, when performing SVM classification into the three reported groups, the optimal combination of risk markers led to a maximum Sp of 79% (Se=18%) for SCD and Sp of 81% (Se=14%) for PFD. The results shown in this work suggest that it is possible to efficiently discriminate SCD and PFD in a population of CHF patients using ECG

  11. Classification of Auditory Evoked Potentials based on the wavelet decomposition and SVM network

    Directory of Open Access Journals (Sweden)

    Michał Suchocki

    2015-12-01

    For electrophysiological hearing assessment and diagnosis of brain stem lesions, auditory brainstem evoked potentials of short latency are most often used. They are characterized by successively arranged maxima as a function of time, called waves. The morphology of the course, in particular the timing and amplitude of each wave, allows a neurologist to make a diagnosis, which is not an easy task. A neurologist should be experienced, concentrated, and should have very good perception. In order to support this diagnostic process, the authors have developed an algorithm implementing the automated classification of auditory evoked potentials into groups of pathological and physiological cases, with sensitivity and specificity, determined for an independent test group of 50 cases, of 84% and 88%, respectively. Keywords: biomedical engineering, brainstem auditory evoked potentials, wavelet decomposition, support vector machine
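
    Sensitivity and specificity, as quoted in the abstract above, follow directly from confusion-matrix counts. The sketch below uses purely illustrative counts; the source does not report the confusion matrix or the class split of its test group, so these numbers are assumptions and not the study's data.

```python
def sensitivity(tp, fn):
    """True positive rate: detected pathological cases over all pathological cases."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: cleared physiological cases over all physiological cases."""
    return tn / (tn + fp)


# Illustrative counts only (hypothetical, not taken from the study).
tp, fn = 21, 4   # pathological cases: correctly flagged / missed
tn, fp = 22, 3   # physiological cases: correctly cleared / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

    Reporting both figures matters because a classifier can trade one against the other simply by shifting its decision threshold.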

  12. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    Science.gov (United States)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in blood transfusion organizations could be useful for improving the performance of blood donation services. The aim of this research is to predict the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, Decision Tree C4.5, the Naïve Bayesian classifier, and Support Vector Machine, were chosen and applied to a real database of 11006 donors. Seven fields (sex, age, job, education, marital status, type of donor, and results of blood tests, that is, doctors' comments and lab results about healthy or unhealthy blood donors) were selected as input to these algorithms. The results of the three algorithms were compared and an error cost analysis was performed. According to the obtained results, the best algorithm, with low error cost and high accuracy, is SVM. This research helps the BTO build a model of blood donors in each area in order to predict whether donors' blood is healthy or unhealthy. It could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  13. Prediction of hERG Liability - Using SVM Classification, Bootstrapping and Jackknifing.

    Science.gov (United States)

    Sun, Hongmao; Huang, Ruili; Xia, Menghang; Shahane, Sampada; Southall, Noel; Wang, Yuhong

    2017-04-01

    Drug-induced QT prolongation leads to life-threatening cardiotoxicity, mostly through blockage of the human ether-à-go-go-related gene (hERG) encoded potassium ion (K+) channels. The hERG channel is one of the most important antitargets to be addressed in the early stage of the drug discovery process, in order to avoid more costly failures in the development phase. Using a thallium flux assay, 4,323 molecules were screened for hERG channel inhibition in a quantitative high throughput screening (qHTS) format. Here, we present support vector classification (SVC) models of hERG channel inhibition with the averaged area under the receiver operator characteristics curve (AUC-ROC) of 0.93 for the tested compounds. Both jackknifing and bootstrapping have been employed to rebalance the heavily biased training datasets, and the impact of these two under-sampling rebalance methods on the performance of the predictive models is discussed. Our results indicated that the rebalancing techniques did not enhance the predictive power of the resulting models; instead, adoption of optimal cutoffs could restore the desirable balance of sensitivity and specificity of the binary classifiers. In an external validation set of 66 drug molecules, the SVC model exhibited an AUC-ROC of 0.86, further demonstrating the utility of this modeling approach to predict hERG liabilities. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
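
    The abstract's point that a well-chosen cutoff can rebalance sensitivity and specificity on a skewed dataset can be illustrated with a small sketch. The abstract does not say how the optimal cutoffs were chosen; maximizing Youden's J (sensitivity + specificity - 1) is an assumption made here for illustration, and the scores and labels are invented toy values, not data from the study.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A compound is predicted active (label 1) when its score >= cutoff.
    Youden's J is an illustrative criterion, not necessarily the one
    used in the paper.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for t in sorted(set(scores)):
        se = sum(s >= t for s in pos) / len(pos)  # true positive rate
        sp = sum(s < t for s in neg) / len(neg)   # true negative rate
        if best is None or se + sp > best[1] + best[2]:
            best = (t, se, sp)
    return best


# Hypothetical, heavily imbalanced toy data: 8 inactives, 2 actives.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.45, 0.40, 0.50]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

cutoff, se, sp = youden_cutoff(scores, labels)
print(cutoff, se, sp)
```

    With a default cutoff of 0.5 the toy classifier finds only one of the two actives; lowering the cutoff recovers both at the cost of one false positive, without resampling the training set.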

  14. Improving the classification accuracy for IR spectroscopic diagnosis of stomach and colon malignancy using non-linear spectral feature extraction methods.

    Science.gov (United States)

    Lee, Sanguk; Kim, Kyoungok; Lee, Hyeseon; Jun, Chi-Hyuck; Chung, Hoeil; Park, Jong-Jae

    2013-07-21

    Non-linear feature extraction methods, neighborhood preserving embedding (NPE) and supervised NPE (SNPE), were employed to effectively represent the IR spectral features of stomach and colon biopsy tissues for classification, and improve the classification accuracy for diagnosis of malignancy. The motivation was to utilize the NPE and SNPE's capability of capturing non-linear spectral behaviors by simultaneously preserving local relationships in order that minute spectral differences among classes would be effectively recognized. NPE and SNPE derive an optimal embedding feature such that the local neighborhood structure can be preserved in reduced spaces (variables). The IR spectra collected from stomach and colon tissues were represented by several new variables through NPE and SNPE, and also by using the principal component analysis (PCA). Then, the feature-extracted variables were subsequently classified into normal, adenoma and cancer tissues by using both k-nearest neighbor (k-NN) and support vector machine (SVM), and the resulting accuracies were compared with each other. In both cases, the combination of SNPE-SVM provided the best classification performance, and the accuracy was substantially improved compared to when PCA-SVM was used. Overall results demonstrate that NPE and SNPE could be potential feature-representation strategies useful in biomedical diagnosis based on vibrational spectroscopy where effective recognition of minute spectral differences is critical.

  15. An Improved Grey Wolf Optimization Strategy Enhanced SVM and Its Application in Predicting the Second Major

    Directory of Open Access Journals (Sweden)

    Yan Wei

    2017-01-01

    Full Text Available To develop a new and effective prediction system, this study explored the full potential of the support vector machine (SVM) by using an improved grey wolf optimization (GWO) strategy. An improved GWO, IGWO, was first proposed to identify the most discriminative features for major prediction. In the proposed approach, particle swarm optimization (PSO) was first adopted to generate diversified initial positions, and then GWO was used to update the current positions of the population in the discrete search space, thus obtaining the optimal feature subset for SVM-based classification. The resultant methodology, IGWO-SVM, was rigorously examined on real-life data comprising a series of factors that influence students' final decision to choose a specific major. To validate the proposed method, other metaheuristic-based SVM methods, including GWO-based SVM, genetic-algorithm-based SVM, and PSO-based SVM, were used for comparison in terms of classification accuracy, AUC (the area under the receiver operating characteristic (ROC) curve), sensitivity, and specificity. The experimental results demonstrate that the proposed approach is promising, with classification accuracy, AUC, sensitivity, and specificity of 87.36%, 0.8735, 85.37%, and 89.33%, respectively. The proposed methodology might serve as a new candidate among powerful tools for second major selection.

  16. Improving accuracy for cancer classification with a new algorithm for genes selection

    Directory of Open Access Journals (Sweden)

    Zhang Hongyan

    2012-11-01

    Full Text Available Abstract. Background: Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, it faces great challenges to improve accuracy. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in the literature focus on screening individual genes or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and the overfitting problem in large dimensional search spaces but also takes potential gene interactions into account during gene selection. This method, coupled with Support Vector Machine (SVM) for implementation, often selects a very small number of genes for easy model interpretability. Results: We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out cross-validation (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected, since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in the literature. Conclusions: Evaluation of a gene’s contribution to binary cancer classification is better considered after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme was provided to perform effective search in the extensive

  17. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Xiaohui Lin

    2017-12-01

    Full Text Available Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporarily screens out the samples lying in a heavily overlapping area in each iteration. The experiments on eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples than by using the classification accuracy rate alone, and that shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  18. A Visual mining based framework for classification accuracy estimation

    Science.gov (United States)

    Arun, Pattathal Vijayakumar

    2013-12-01

    Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. Used together, these tools can provide an efficient approach for obtaining information about improvements in classification accuracy and help in refining the training data set. We have illustrated the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) resampling is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images.

  19. Online Fault Diagnosis for Biochemical Process Based on FCM and SVM.

    Science.gov (United States)

    Wang, Xianfang; Du, Haoze; Tan, Jinglu

    2016-12-01

    Fault diagnosis is becoming an important issue in biochemical processes, and a novel online fault detection and diagnosis approach is designed by combining fuzzy c-means (FCM) and support vector machine (SVM). The samples are first preprocessed via the FCM algorithm to enhance the classification ability. Then, those samples are input to the SVM classifier to perform the biochemical process fault diagnosis. In this study, a glutamic acid fermentation process is chosen as an example for diagnosing faults with this method. The result shows that the diagnosis time is greatly shortened and the accuracy is considerably improved compared with a single SVM method.

  20. Predicting the Metabolic Sites by Flavin-Containing Monooxygenase on Drug Molecules Using SVM Classification on Computed Quantum Mechanics and Circular Fingerprints Molecular Descriptors.

    Directory of Open Access Journals (Sweden)

    Chien-Wei Fu

    Full Text Available As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO) also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM) on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D) are computed and classified using the support vector machine (SVM) for predicting some potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA- represents the nucleophilicity of central atom A, and the attributes from circular fingerprints account for the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85, and they are equally divided into the training and test sets, with each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction since the available C-oxidation data was scarce. In the training process, the LibSVM package in WEKA and the option of 10-fold cross validation are employed. The prediction performance on the test set, evaluated by accuracy, Matthews correlation coefficient, and area under the ROC curve, is 0.829, 0.659, and 0.877, respectively. This work reveals that the SVM model built can accurately predict the potential SOMs for drug molecules that are metabolizable by the FMO enzymes.
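
    Two of the metrics reported above, accuracy and the Matthews correlation coefficient (MCC), follow directly from a binary confusion matrix. The sketch below shows the standard formulas; the counts passed in are invented for illustration and are not the study's confusion matrix, which the abstract does not give.

```python
import math


def accuracy_and_mcc(tp, fp, tn, fn):
    """Compute accuracy and the Matthews correlation coefficient
    from the four cells of a binary confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for a test set of 100 candidate sites.
acc, mcc = accuracy_and_mcc(tp=40, fp=10, tn=45, fn=5)
print(f"accuracy={acc:.3f}  MCC={mcc:.3f}")
```

    Unlike accuracy, MCC uses all four cells, so it stays informative when the two classes have different sizes.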

  1. Accuracy Analysis Comparison of Supervised Classification Methods for Anomaly Detection on Levees Using SAR Imagery

    Directory of Open Access Journals (Sweden)

    Ramakalavathi Marapareddy

    2017-10-01

    Full Text Available This paper analyzes the use of synthetic aperture radar (SAR) imagery to support levee condition assessment by detecting potential slide areas in an efficient and cost-effective manner. Levees are prone to failure in the form of internal erosion within the earthen structure and landslides (also called slough or slump slides). If not repaired, slough slides may lead to levee failures. In this paper, we compare the accuracy of the supervised classification methods minimum distance (MD) using Euclidean and Mahalanobis distance, support vector machine (SVM), and maximum likelihood (ML), using SAR technology to detect slough slides on earthen levees. In this work, the effectiveness of the algorithms was demonstrated using quad-polarimetric L-band SAR imagery from the NASA Jet Propulsion Laboratory’s (JPL’s) uninhabited aerial vehicle synthetic aperture radar (UAVSAR). The study area is a section of the lower Mississippi River valley in the Southern USA, where earthen flood control levees are maintained by the US Army Corps of Engineers.

  2. svmPRAT: SVM-based Protein Residue Annotation Toolkit

    Directory of Open Access Journals (Sweden)

    Kauffman Christopher

    2009-12-01

    Full Text Available Abstract. Background: Over the last decade several prediction methods have been developed for determining the structural and functional properties of individual protein residues using sequence and sequence-derived information. Most of these methods are based on support vector machines as they provide accurate and generalizable prediction models. Results: We present a general purpose protein residue annotation toolkit (svmPRAT) to allow biologists to formulate residue-wise prediction problems. svmPRAT formulates the annotation problem as a classification or regression problem using support vector machines. One of the key features of svmPRAT is its ease of use in incorporating any user-provided information in the form of feature matrices. For every residue svmPRAT captures local information around the residue to create fixed length feature vectors. svmPRAT implements accurate and fast kernel functions, and also introduces a flexible window-based encoding scheme that accurately captures signals and patterns for training effective predictive models. Conclusions: In this work we evaluate svmPRAT on several classification and regression problems including disorder prediction, residue-wise contact order estimation, DNA-binding site prediction, and local structure alphabet prediction. svmPRAT has also been used for the development of a state-of-the-art transmembrane helix prediction method called TOPTMH, and a secondary structure prediction method called YASSPP. This toolkit provides practitioners with an efficient and easy-to-use tool for a wide variety of annotation problems. Availability: http://www.cs.gmu.edu/~mlbio/svmprat

  3. Pediatric surgeon-directed wound classification improves accuracy.

    Science.gov (United States)

    Zens, Tiffany J; Rusy, Deborah A; Gosain, Ankush

    2016-04-01

    Surgical wound classification (SWC) communicates the degree of contamination in the surgical field and is used to stratify risk of surgical site infection and compare outcomes among centers. We hypothesized that changing from nurse-directed to surgeon-directed SWC during a structured operative debrief would improve accuracy of documentation. An institutional review board-approved retrospective chart review was performed. Two time periods were defined: initially, SWC was determined and recorded by the circulating nurse (before debrief, June 2012-May 2013); then, after allowing 6 mo for adoption and education, we implemented a structured operative debriefing including surgeon-directed SWC (after debrief, January 2014-August 2014). Accuracy of SWC was determined for four commonly performed pediatric general surgery operations: inguinal hernia repair (clean), gastrostomy ± Nissen fundoplication (clean contaminated), appendectomy without perforation (contaminated), and appendectomy with perforation (dirty). One hundred eighty-three cases before debrief and 142 cases after debrief met inclusion criteria. No differences between time periods were noted in regard to patient demographics, ASA class, or case mix. Accuracy of wound classification improved after debrief (42% versus 58.5%, P = 0.003). Before debrief, 26.8% of cases were overestimated or underestimated by more than one wound class, versus 3.5% of cases after debrief (P […] wounds and decreases the degree of inaccuracy in incorrectly classified cases. However, after implementation of the debriefing, we still observed a 41.5% rate of incorrect documentation, most notably in contaminated cases, indicating further education and process improvement is needed. Copyright © 2016 Elsevier Inc. All rights reserved.

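  4. OPTIMALISASI SUPPORT VEKTOR MACHINE (SVM) UNTUK KLASIFIKASI TEMA TUGAS AKHIR BERBASIS K-MEANS [Optimization of Support Vector Machine (SVM) for K-Means-Based Classification of Final Project Themes]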

    Directory of Open Access Journals (Sweden)

    Oman Somantri

    2017-01-01

    Full Text Available Colleges often have difficulty determining the classification of students' final project themes. The purpose of this study is to provide decision support for policy makers in the study program so that each student's theme matches their own competence. In this research, text mining with Support Vector Machine (SVM) and K-Means produced a better accuracy rate, 86.21%, compared with 85.38% for SVM without K-Means.

  5. Fault diagnosis of monoblock centrifugal pump using SVM

    Directory of Open Access Journals (Sweden)

    V. Muralidharan

    2014-09-01

    Full Text Available Monoblock centrifugal pumps are employed in a variety of critical engineering applications. Continuous monitoring of such machine components is essential in order to reduce unnecessary breakdowns. Vibration based approaches are widely used to carry out condition monitoring tasks. In particular, fuzzy logic, support vector machine (SVM), and artificial neural networks have been employed for continuous monitoring and fault diagnosis. In the present study, the application of the SVM algorithm in the field of fault diagnosis and condition monitoring is discussed. The continuous wavelet transforms were calculated for different families and at different levels. The computed transformation coefficients form the feature set for the classification of good and faulty conditions of the components of the centrifugal pump. The classification accuracies of different continuous wavelet families at different levels were calculated and compared to find the best wavelet for the fault diagnosis of the monoblock centrifugal pump.

  6. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensor. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN-SVM significantly outperforms the other algorithms in terms of classification accuracy. We also show that some of these algorithms could run in real time on the prototype system. Copyright © 2013 ACM.

  7. Sentiment analysis of feature ranking methods for classification accuracy

    Science.gov (United States)

    Joseph, Shashank; Mugauri, Calvin; Sumathy, S.

    2017-11-01

    Text pre-processing and feature selection are important and critical steps in text mining. Text pre-processing of large volumes of data is a difficult task because unstructured raw data must be converted into a structured format. Traditional methods of processing and weighting took much time and were less accurate. To overcome this challenge, feature ranking techniques have been devised. A feature set from text pre-processing is fed as input for feature selection. Feature selection helps improve text classification accuracy. Of the three feature selection categories available, the filter category is the focus. Five feature ranking methods are analyzed: document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio.
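
    As an illustration of a filter-style feature ranking, the sketch below scores each term with the chi-square statistic of its 2x2 term/class contingency table and sorts the vocabulary by that score. It is a minimal example of one of the five methods named above, not the authors' implementation; the tiny corpus and the function names are made up for the example.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    return numerator / denominator if denominator else 0.0


def rank_terms(docs, target):
    """Rank every term by chi-square with respect to the target class."""
    vocab = sorted(set().union(*(tokens for tokens, _ in docs)))
    scored = []
    for term in vocab:
        n11 = sum(term in t and c == target for t, c in docs)
        n10 = sum(term in t and c != target for t, c in docs)
        n01 = sum(term not in t and c == target for t, c in docs)
        n00 = sum(term not in t and c != target for t, c in docs)
        scored.append((term, chi_square(n11, n10, n01, n00)))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical four-document sentiment corpus: (token set, class label).
docs = [
    ({"great", "movie"}, "pos"),
    ({"great", "plot"}, "pos"),
    ({"boring", "movie"}, "neg"),
    ({"boring", "plot"}, "neg"),
]
ranking = rank_terms(docs, "pos")
print(ranking)
```

    Terms that occur equally often in both classes ("movie", "plot") score zero and fall to the bottom of the ranking, which is the behaviour a filter method relies on when it keeps only the top-ranked features.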

  8. Attribute-Level and Pattern-Level Classification Consistency and Accuracy Indices for Cognitive Diagnostic Assessment

    Science.gov (United States)

    Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang

    2015-01-01

    Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…

  9. Parallelization of multicategory support vector machines (PMC-SVM) for classifying microarray data

    Directory of Open Access Journals (Sweden)

    Deng Youping

    2006-12-01

    Full Text Available Abstract. Background: Multicategory Support Vector Machines (MC-SVM) are powerful classification systems with excellent performance in a variety of data classification problems. Since the process of generating models in traditional multicategory support vector machines for large datasets is very computationally intensive, there is a need to improve the performance using high performance computing techniques. Results: In this paper, Parallel Multicategory Support Vector Machines (PMC-SVM) have been developed based on the sequential minimum optimization-type decomposition method for support vector machines (SMO-SVM). It was implemented in parallel using MPI and C++ libraries and executed on both a shared memory supercomputer and a Linux cluster for multicategory classification of microarray data. PMC-SVM has been analyzed and evaluated using four microarray datasets with multiple diagnostic categories, such as different cancer types and normal tissue types. Conclusion: The experiments show that PMC-SVM can significantly improve the performance of classification of microarray data without loss of accuracy, compared with previous work.

  10. Research on feature extraction and classification of AE signals of fibers' tensile failure based on HHT and SVM

    Directory of Open Access Journals (Sweden)

    Yanding SHEN

    2016-10-01

    Full Text Available To study feature extraction and recognition methods for fibers' tensile failure, AE technology is used to collect AE signals from the tensile fracture of fiber bundles of two kinds of fibers, Aramid 1313 and viscose. A wavelet transform is applied to the signals to reduce noise. The Hilbert-Huang transform (HHT) is then used to extract characteristic frequencies from the de-noised signals, and least squares support vector machines (LSSVM) are used for the classification and recognition of the characteristic frequencies of the two kinds of fibers. The results show that the wavelet de-noising method can reduce some of the noise in the signals. The Hilbert spectrum can reflect, to some extent, the fracture circumstances of the two kinds of fibers in the time dimension. Characteristic frequencies can be extracted from the marginal spectrum. The LSSVM can be used for the classification and recognition of characteristic frequencies. The recognition rates of Aramid 1313 and viscose reach 40% and 80%, respectively, and the total recognition rate reaches 60%.

  11. Accurate Fluid Level Measurement in Dynamic Environment Using Ultrasonic Sensor and ν-SVM

    Directory of Open Access Journals (Sweden)

    Jenny TERZIC

    2009-10-01

    Full Text Available A fluid level measurement system based on a single ultrasonic sensor and a Support Vector Machine (SVM)-based signal processing and classification system has been developed to determine the fluid level in automotive fuel tanks. The novel approach, based on the ν-SVM classification method, uses the Radial Basis Function (RBF) kernel to compensate for the measurement error induced by sloshing in the tank caused by vehicle motion. A broad investigation of selected pre-processing filters, namely the Moving Mean, Moving Median, and Wavelet filters, is also presented. Field drive trials were performed under normal driving conditions at various fuel volumes ranging from 5 L to 50 L to acquire sample data from the ultrasonic sensor for training the SVM model. Further drive trials were conducted to obtain data to verify the SVM results. A comparison of the accuracy of the predicted fluid level obtained using SVM and the pre-processing filters is provided. It is demonstrated that the ν-SVM model using the RBF kernel function and the Moving Median filter produced the most accurate fluid level measurements compared with the other signal filtration methods.
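
    The Moving Median pre-processing filter named above can be sketched in a few lines. This is only the filtering stage, not the ν-SVM model; the trailing window, its length of 3, and the readings are assumptions chosen for illustration, since the abstract does not give the filter parameters.

```python
from statistics import median


def moving_median(samples, window=5):
    """Trailing moving-median filter.

    Each output value is the median of the current sample and up to
    window - 1 preceding samples, so isolated slosh spikes are rejected
    while the underlying level is preserved.
    """
    filtered = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        filtered.append(median(samples[start:i + 1]))
    return filtered


# Hypothetical level readings in litres with two slosh-induced outliers.
raw = [20.0, 20.1, 35.0, 19.9, 20.0, 5.0, 20.2, 20.1]
smooth = moving_median(raw, window=3)
print(smooth)
```

    A moving mean over the same window would let each spike pull several neighbouring outputs away from the true level, whereas a median of three samples discards a single outlier entirely. That robustness is one plausible reason a median filter suits sloshing data.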

  12. COMPARATIVE STUDY OF CLASSIFICATION ALGORITHMS: HOLDOUTS AS ACCURACY ESTIMATION

    Directory of Open Access Journals (Sweden)

    Debby Erce Sondakh

    2016-09-01

    Full Text Available This study aims to measure and compare the performance of five machine-learning-based text classification algorithms, namely decision rules, decision tree, k-nearest neighbor (k-NN), naïve Bayes, and Support Vector Machine (SVM), using multi-class text documents. The comparison addresses the effectiveness of the algorithms, that is, their ability to classify documents into the correct category, using the holdout (percentage split) method. The effectiveness measures used are precision, recall, F-measure, and accuracy. The experimental results show that, for the naïve Bayes algorithm, the larger the percentage of training documents, the higher the accuracy of the resulting model. The highest accuracy was obtained by naïve Bayes at a 90/10 split, by SVM at 80/20, and by decision tree at 70/30. The experimental results also show that naïve Bayes has the highest effectiveness among the five algorithms tested and the fastest time to build the classification model, 0.02 seconds. The decision tree algorithm can classify text documents with higher accuracy than SVM, but its model-building time is slower. In terms of model-building time, k-NN is the fastest, but its accuracy is lower.
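
    The holdout (percentage split) estimate used in this comparison partitions the documents once into a training part and a test part. The sketch below shows such a split for the 90/10, 80/20, and 70/30 ratios mentioned in the abstract; the shuffling seed, the function name, and the placeholder document list are assumptions for illustration only.

```python
import random


def holdout_split(items, train_fraction, seed=0):
    """Shuffle once, then cut into training and test portions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical corpus of 100 document identifiers.
documents = [f"doc{i}" for i in range(100)]

for fraction in (0.9, 0.8, 0.7):
    train, test = holdout_split(documents, fraction)
    print(f"train={len(train)} test={len(test)}")
```

    Because a holdout estimate depends on a single random partition, accuracies measured at different split ratios are not directly comparable unless the split is repeated or the seed is fixed, as it is here.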

  13. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (~87%) while reducing the number of features by up to 98%.
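
    The "up to 98%" figure can be checked against the dimensions quoted in the abstract. As a worked calculation, taking the original feature vector as 101 features and the reduced sizes as 2 (PCC), 3 (PCA), and 9 (ICA):

```latex
\[
\text{reduction}(k) = 1 - \frac{k}{101}, \qquad
\begin{aligned}
\text{PCC } (k = 2)&:\; 1 - \tfrac{2}{101} \approx 98.0\% \\
\text{PCA } (k = 3)&:\; 1 - \tfrac{3}{101} \approx 97.0\% \\
\text{ICA } (k = 9)&:\; 1 - \tfrac{9}{101} \approx 91.1\%
\end{aligned}
\]
```

    The largest reduction, about 98%, therefore corresponds to the two-dimensional PCC representation, while the best-performing PCA representation removes about 97% of the features.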

  14. Classification of Stellar Spectra with Fuzzy Minimum Within-Class ...

    Indian Academy of Sciences (India)

    Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations.

  15. PAIR Comparison between Two Within-Group Conditions of Resting-State fMRI Improves Classification Accuracy

    Science.gov (United States)

    Zhou, Zhen; Wang, Jian-Bao; Zang, Yu-Feng; Pan, Gang

    2018-01-01

    Classification approaches have been increasingly applied to differentiate patients and normal controls using resting-state functional magnetic resonance imaging data (RS-fMRI). Although most previous classification studies have reported promising accuracy within individual datasets, achieving high levels of accuracy with multiple datasets remains challenging for two main reasons: high dimensionality, and high variability across subjects. We used two independent RS-fMRI datasets (n = 31, 46, respectively) both with eyes closed (EC) and eyes open (EO) conditions. For each dataset, we first reduced the number of features to a small number of brain regions with paired t-tests, using the amplitude of low frequency fluctuation (ALFF) as a metric. Second, we employed a new method for feature extraction, named the PAIR method, examining EC and EO as paired conditions rather than independent conditions. Specifically, for each dataset, we obtained EC minus EO (EC—EO) maps of ALFF from half of subjects (n = 15 for dataset-1, n = 23 for dataset-2) and obtained EO—EC maps from the other half (n = 16 for dataset-1, n = 23 for dataset-2). A support vector machine (SVM) method was used for classification of EC RS-fMRI mapping and EO mapping. The mean classification accuracy of the PAIR method was 91.40% for dataset-1, and 92.75% for dataset-2 in the conventional frequency band of 0.01–0.08 Hz. For cross-dataset validation, we applied the classifier from dataset-1 directly to dataset-2, and vice versa. The mean accuracy of cross-dataset validation was 94.93% for dataset-1 to dataset-2 and 90.32% for dataset-2 to dataset-1 in the 0.01–0.08 Hz range. For the UNPAIR method, classification accuracy was substantially lower (mean 69.89% for dataset-1 and 82.97% for dataset-2), and was much lower for cross-dataset validation (64.69% for dataset-1 to dataset-2 and 64.98% for dataset-2 to dataset-1) in the 0.01–0.08 Hz range. In conclusion, for within-group design studies (e

  16. Land cover classification accuracy from electro-optical, X, C, and L-band Synthetic Aperture Radar data fusion

    Science.gov (United States)

    Hammann, Mark Gregory

    The fusion of electro-optical (EO) multi-spectral satellite imagery with Synthetic Aperture Radar (SAR) data was explored with the working hypothesis that the addition of multi-band SAR will increase the land-cover (LC) classification accuracy compared to EO alone. Three satellite sources for SAR imagery were used: X-band from TerraSAR-X, C-band from RADARSAT-2, and L-band from PALSAR. Images from the RapidEye satellites were the source of the EO imagery. Imagery from the GeoEye-1 and WorldView-2 satellites aided the selection of ground truth. Three study areas were chosen: Wad Medani, Sudan; Campinas, Brazil; and Fresno-Kings Counties, USA. EO imagery were radiometrically calibrated, atmospherically compensated, orthorectified, co-registered, and clipped to a common area of interest (AOI). SAR imagery were radiometrically calibrated, and geometrically corrected for terrain and incidence angle by converting to ground range and Sigma Naught (σ0). The original SAR HH data were included in the fused image stack after despeckling with a 3x3 Enhanced Lee filter. The variance and Gray-Level Co-occurrence Matrix (GLCM) texture measures of contrast, entropy, and correlation were derived from the non-despeckled SAR HH bands. Data fusion was done with layer stacking and all data were resampled to a common spatial resolution. The Support Vector Machine (SVM) decision rule was used for the supervised classifications. Similar LC classes were identified and tested for each study area. For Wad Medani, nine classes were tested: low and medium intensity urban, sparse forest, water, barren ground, and four agriculture classes (fallow, bare agricultural ground, green crops, and orchards). For Campinas, Brazil, five generic classes were tested: urban, agriculture, forest, water, and barren ground. For the Fresno-Kings Counties location 11 classes were studied: three generic classes (urban, water, barren land), and eight specific crops. In all cases the addition of SAR to EO resulted

  17. Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands

    Science.gov (United States)

    Raymond L. Czaplewski; Paul L. Patterson

    2001-01-01

    We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...

  18. Influences on classification accuracy of exam sets: an example from vocational education and training

    NARCIS (Netherlands)

    Hubregtse, M.; Eggen, Theodorus Johannes Hendrikus Maria; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Classification accuracy of single exams is well studied in the educational measurement literature. However, when making important decisions, such as certification decisions, one usually uses several exams: an exam set. This chapter elaborates on classification accuracy of exam sets. This is

  19. Classification Accuracies of Physical Activities Using Smartphone Motion Sensors

    Science.gov (United States)

    Wu, Wanmin; Dasgupta, Sanjoy; Ramirez, Ernesto E; Peterson, Carlyn

    2012-01-01

    activity and sedentary behavior (walking, jogging, and sitting) can be recognized with high accuracies using both the accelerometer and gyroscope onboard the iPod touch or iPhone. This suggests the potential of developing just-in-time classification and feedback tools on smartphones. PMID:23041431

  20. Classification accuracies of physical activities using smartphone motion sensors.

    Science.gov (United States)

    Wu, Wanmin; Dasgupta, Sanjoy; Ramirez, Ernesto E; Peterson, Carlyn; Norman, Gregory J

    2012-10-05

    sitting) can be recognized with high accuracies using both the accelerometer and gyroscope onboard the iPod touch or iPhone. This suggests the potential of developing just-in-time classification and feedback tools on smartphones.

  1. SVM Intrusion Detection Model Based on Compressed Sampling

    Directory of Open Access Journals (Sweden)

    Shanxiong Chen

    2016-01-01

    Intrusion detection needs to deal with a large amount of data; in particular, network intrusion detection has to inspect all network data. Massive data processing is the bottleneck of network software and hardware equipment in intrusion detection. If we can reduce the data dimension at the data sampling stage and directly obtain the feature information of network data, the efficiency of detection can be improved greatly. In this paper, we present an SVM intrusion detection model based on compressive sampling. We use the compressed sampling method from compressed sensing theory to implement feature compression for network data flow, so that we obtain a refined sparse representation. An SVM is then used to classify the compression results. This method can detect anomalous network behavior quickly without reducing the classification accuracy.

  2. Hyperspectral recognition of processing tomato early blight based on GA and SVM

    Science.gov (United States)

    Yin, Xiaojun; Zhao, SiFeng

    2013-03-01

    Early blight of processing tomato seriously affects yield and quality. We measured leaf spectra of processing tomato at different early blight severity levels and used the sensitive bands as the input vector of a support vector machine (SVM). By optimizing the SVM parameters with a genetic algorithm (GA), we could recognize the different disease severity levels. The results show that the sensitive bands for the different severity levels are 628-643 nm and 689-692 nm. These sensitive bands serve as the input vector for the GA-optimized SVM. The best penalty parameter is 0.129 and the best kernel function parameter is 3.479. We trained and tested classifiers with polynomial, radial basis function and sigmoid kernels. The best classification model is the SVM with the radial basis function kernel, with a training accuracy of 84.615% and a testing accuracy of 80.681%. Combining GA and SVM achieves multi-class classification of processing tomato early blight and provides technical support for predicting the occurrence, development and spread of the disease over large areas.

  3. Improving accuracy in astrocytomas grading by integrating a robust least squares mapping driven support vector machine classifier into a two level grade classification scheme.

    Science.gov (United States)

    Glotsos, Dimitris; Kalatzis, Ioannis; Spyridonos, Panagiota; Kostopoulos, Spiros; Daskalakis, Antonis; Athanasiadis, Emmanouil; Ravazoula, Panagiota; Nikiforidis, George; Cavouras, Dionisis

    2008-06-01

    Grading of astrocytomas is an important task for treatment planning; however, it suffers from considerable inter-observer variability. Computer-assisted diagnosis systems have been proposed to help minimize subjectivity; however, these systems either achieve only moderate accuracy or rely on specialized staining protocols and grading systems that are difficult to apply in daily clinical practice. The present study proposes a robust mathematical formulation that integrates state-of-the-art technologies (support vector machines and least squares mapping) in a cascade classification scheme for separating low-grade from high-grade and grade III from grade IV astrocytic tumours. Results indicate that low-grade and high-grade tumours can be correctly separated with a certainty as high as 97.3%, and grade III and grade IV tumours with 97.8%. The overall performance was 95.2%. These high rates result from applying the least squares mapping technique to the features prior to classification. A significant byproduct of least squares mapping is that the number of support vectors of the SVM classifiers dropped dramatically, from about 80% when no mapping was used to less than 5% when mapping was used. The latter is a clear indication that the SVM classifier has a greater potential to generalize well to new data. In this way, digital image analysis systems for automated grading of astrocytomas are brought closer to clinical practice.

  4. SVM-based spectrum mobility prediction scheme in mobile cognitive radio networks.

    Science.gov (United States)

    Wang, Yao; Zhang, Zhongzhao; Ma, Lin; Chen, Jiamei

    2014-01-01

    Spectrum mobility as an essential issue has not been fully investigated in mobile cognitive radio networks (CRNs). In this paper, a novel support vector machine based spectrum mobility prediction (SVM-SMP) scheme is presented considering time-varying and space-varying characteristics simultaneously in mobile CRNs. The mobility of cognitive users (CUs) and the working activities of primary users (PUs) are analyzed in theory. And a joint feature vector extraction (JFVE) method is proposed based on the theoretical analysis. Then spectrum mobility prediction is executed through the classification of SVM with a fast convergence speed. Numerical results validate that SVM-SMP gains better short-time prediction accuracy rate and miss prediction rate performance than the two algorithms just depending on the location and speed information. Additionally, a rational parameter design can remedy the prediction performance degradation caused by high speed SUs with strong randomness movements.

  5. A SVM-based method for sentiment analysis in Persian language

    Science.gov (United States)

    Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana

    2013-03-01

    Persian is the official language of Iran, Tajikistan and Afghanistan. Local online users often express their opinions and experiences on the web in written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge number of web reviews makes it difficult to give an unbiased evaluation of a product. In this paper, the standard machine learning techniques SVM and naive Bayes are applied to online Persian movie reviews to automatically classify user reviews as positive or negative, and the performance of the two classifiers in this language is compared. The effects of feature representations on classification performance are discussed. We find that accuracy is influenced by the interaction between the classification models and the feature options. The SVM classifier achieves accuracy as good as or better than naive Bayes on Persian movie reviews. Unigrams prove to be better features than bigrams and trigrams for capturing Persian sentiment orientation.

  6. Associations between psychologists' thinking styles and accuracy on a diagnostic classification task

    NARCIS (Netherlands)

    Aarts, A.A.; Witteman, C.L.M.; Souren, P.M.; Egger, J.I.M.

    2012-01-01

    The present study investigated whether individual differences between psychologists in thinking styles are associated with accuracy in diagnostic classification. We asked novice and experienced clinicians to classify two clinical cases of clients with two co-occurring psychological disorders. No

  7. The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.

    Science.gov (United States)

    2014-12-01

    The accuracy of vehicle counting and classification data is vital for appropriate future highway and road design, including determining pavement characteristics, eliminating traffic jams, and improving safety. Organizations relying on vehicle cla...

  8. Convolutional neural network for high-accuracy functional near-infrared spectroscopy in a brain-computer interface: three-class classification of rest, right-, and left-hand motor execution.

    Science.gov (United States)

    Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong

    2018-01-01

    The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.

  9. Variance estimates and confidence intervals for the Kappa measure of classification accuracy

    Science.gov (United States)

    M. A. Kalkhan; R. M. Reich; R. L. Czaplewski

    1997-01-01

    The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
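
    As an illustration of the statistic described above, the following minimal sketch computes Cohen's Kappa from an error (confusion) matrix. It is not taken from the cited paper and does not cover the variance estimates or confidence intervals the paper addresses; the function name `cohen_kappa` and the two-class matrix are assumptions made for this example.

```python
from typing import Sequence


def cohen_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Cohen's Kappa for a square error matrix (rows: reference, columns: map)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("error matrix is empty")
    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class error matrix (values invented for illustration).
example_matrix = [[20, 5], [10, 15]]
print(round(cohen_kappa(example_matrix), 3))
```

    Kappa rescales the observed agreement by the agreement expected by chance, so a value of 0 means no better than chance and 1 means perfect agreement.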

  10. Toward accountable land use mapping: Using geocomputation to improve classification accuracy and reveal uncertainty

    NARCIS (Netherlands)

    Beekhuizen, J.; Clarke, K.C.

    2010-01-01

    The classification of satellite imagery into land use/cover maps is a major challenge in the field of remote sensing. This research aimed at improving the classification accuracy while also revealing uncertain areas by employing a geocomputational approach. We computed numerous land use maps by

  11. Assessing the Accuracy of Prediction Algorithms for Classification

    DEFF Research Database (Denmark)

    Baldi, P.; Brunak, Søren; Chauvin, Y.

    2000-01-01

    We provide a unified overview of methods that currently are widely used to assess the accuracy of prediction algorithms, from raw percentages, quadratic error measures and other distances, and correlation coefficients, and to information theoretic measures such as relative entropy and mutual info...
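
    One correlation-based measure for binary predictions is the Matthews correlation coefficient. The sketch below is an illustration added here, not code from the cited paper; the function name `mcc`, the argument names, and the convention of returning 0.0 when a marginal total is zero are assumptions.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0.0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, invented for illustration.
print(mcc(tp=45, fp=10, tn=40, fn=5))
```

    Unlike a raw percentage of correct predictions, this coefficient uses all four cells of the confusion matrix and ranges from -1 (total disagreement) to +1 (perfect prediction).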

  12. Analytic radar micro-Doppler signatures classification

    Science.gov (United States)

    Oh, Beom-Seok; Gu, Zhaoning; Wang, Guan; Toh, Kar-Ann; Lin, Zhiping

    2017-06-01

    Due to its capability of capturing the kinematic properties of a target object, radar micro-Doppler signatures (m-DS) play an important role in radar target classification. This is particularly evident from the remarkable number of research papers published every year on m-DS for various applications. However, most of these works rely on the support vector machine (SVM) for target classification. It is well known that training an SVM is computationally expensive due to its nature of search to locate the supporting vectors. In this paper, the classifier learning problem is addressed by a total error rate (TER) minimization where an analytic solution is available. This largely reduces the search time in the learning phase. The analytically obtained TER solution is globally optimal with respect to the classification total error count rate. Moreover, our empirical results show that TER outperforms SVM in terms of classification accuracy and computational efficiency on a five-category radar classification problem.

  13. Classifying smoke in laparoscopic videos using SVM

    Directory of Open Access Journals (Sweden)

    Alshirbaji Tamer Abdulbaki

    2017-09-01

    Smoke in laparoscopic videos usually appears due to the use of electrocautery when cutting or coagulating tissues. Therefore, detecting smoke can be used for event-based annotation in laparoscopic surgeries by retrieving the events associated with the electrocauterization. Furthermore, smoke detection can also be used for automatic smoke removal. However, detecting smoke in laparoscopic video is a challenge because of the changeability of smoke patterns, the moving camera and the different lighting conditions. In this paper, we present a video-based smoke detection algorithm to detect smoke of different densities, such as fog, low and high density, in laparoscopic videos. The proposed method extracts various visual features from the laparoscopic images and provides them to a support vector machine (SVM) classifier. Features are based on motion, colour and texture patterns of the smoke. We validated our algorithm experimentally on four laparoscopic cholecystectomy videos. These four videos were manually annotated by labelling every frame as a smoke or non-smoke frame. The algorithm was applied to the videos using different feature combinations for classification. Experimental results show that the combination of all proposed features gives the best classification performance. The overall accuracy (i.e. correctly classified frames) is around 84%, with a sensitivity (i.e. correctly detected smoke frames) of 89% and a specificity (i.e. correctly detected non-smoke frames) of 80%.
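
    The three measures defined in parentheses above can be written down directly from frame counts. This is a minimal sketch, not the authors' evaluation code; the function name `binary_metrics` and the counts are hypothetical.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from binary confusion counts.

    tp/fn: positive (e.g. smoke) frames detected / missed.
    tn/fp: negative (e.g. non-smoke) frames detected / wrongly flagged.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
    }


# Hypothetical frame counts, invented for illustration only.
print(binary_metrics(tp=45, fp=10, tn=40, fn=5))
```

    Overall accuracy is a weighted mix of sensitivity and specificity, with weights given by the share of positive and negative frames, so it depends on the class balance of the annotated videos.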

  14. Computer-Aided Lung Nodule Recognition by SVM Classifier Based on Combination of Random Undersampling and SMOTE

    Directory of Open Access Journals (Sweden)

    Yuan Sui

    2015-01-01

    In lung cancer computer-aided detection/diagnosis (CAD) systems, classification of regions of interest (ROI) is often used to detect/diagnose lung nodules accurately. However, unbalanced datasets often have detrimental effects on classification performance. In this paper, both minority and majority classes are resampled to increase the generalization ability. We propose a novel SVM classifier combined with random undersampling (RU) and SMOTE for lung nodule recognition. The combination of the two resampling methods not only achieves balanced training samples but also removes noise and duplicate information from the training sample and retains useful information to improve effective data utilization, hence improving the performance of the SVM algorithm for pulmonary nodule classification on unbalanced data. Eight features, including 2D and 3D features, are extracted for training and classification. Experimental results show that, for different sizes of training datasets, our RU-SMOTE-SVM classifier achieves the highest classification accuracy among the four kinds of classifiers, and the average classification accuracy is more than 92.94%.
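
    To make the resampling idea concrete, here is a minimal sketch of random undersampling of the majority class followed by SMOTE-style interpolation in the minority class. It is a simplified illustration under stated assumptions, not the authors' RU-SMOTE-SVM pipeline: the function names, the neighbour count `k`, the toy 2D data and the target class sizes are all hypothetical, and no SVM is trained here.

```python
import random
from typing import List

Point = List[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_undersample(majority: List[Point], n_keep: int,
                       rng: random.Random) -> List[Point]:
    """Keep a random subset of the majority class (without replacement)."""
    return rng.sample(majority, n_keep)


def smote(minority: List[Point], n_new: int, k: int,
          rng: random.Random) -> List[Point]:
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (invented): 20 majority and 5 minority samples with 2 features.
rng = random.Random(0)
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[100.0, 100.0], [101.0, 100.0], [100.0, 101.0],
            [101.0, 101.0], [100.5, 100.5]]
balanced_majority = random_undersample(majority, 10, rng)
balanced_minority = minority + smote(minority, 5, 3, rng)
print(len(balanced_majority), len(balanced_minority))
```

    Undersampling shrinks the majority class while the interpolation step grows the minority class, so the two meet at a balanced size without discarding all of the majority data or merely duplicating minority samples.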

  15. A novel approach to the detection of acromegaly: accuracy of diagnosis by automatic face classification.

    Science.gov (United States)

    Schneider, Harald J; Kosilek, Robert P; Günther, Manuel; Roemmler, Josefine; Stalla, Günter K; Sievers, Caroline; Reincke, Martin; Schopohl, Jochen; Würtz, Rolf P

    2011-07-01

    The delay between onset of first symptoms and diagnosis of acromegaly is 6-10 yr. Acromegaly causes typical changes of the face that might be recognized by face classification software. The objective of the study was to assess classification accuracy of acromegaly by face-classification software. This was a diagnostic study. The study was conducted in specialized care. Participants in the study included 57 patients with acromegaly (29 women, 28 men) and 60 sex- and age-matched controls. We took frontal and side photographs of the faces and grouped patients into subjects with mild, moderate, and severe facial features of acromegaly by overall impression. We then analyzed all pictures using computerized similarity analysis based on Gabor jets and geometry functions. We used the leave-one-out cross-validation method to classify subjects by the software. Additionally, all subjects were classified by visual impression by three acromegaly experts and three general internists. Classification accuracy by software, experts, and internists was measured. The software correctly classified 71.9% of patients and 91.5% of controls. Classification accuracy for patients by visual analysis was 63.2 and 42.1% by experts and general internists, respectively. Classification accuracy for controls was 80.8 and 87.0% by experts and internists, respectively. The highest differences in accuracy between software and experts and internists were present for patients with mild acromegaly. Acromegaly can be detected by computer software using photographs of the face. Classification accuracy by software is higher than by medical experts or general internists, particularly in patients with mild features of acromegaly. This is a promising tool to help detect acromegaly.

  16. Diesel Engine Valve Clearance Fault Diagnosis Based on Features Extraction Techniques and FastICA-SVM

    Science.gov (United States)

    Jing, Ya-Bing; Liu, Chang-Wen; Bi, Feng-Rong; Bi, Xiao-Yang; Wang, Xia; Shao, Kang

    2017-07-01

    Vibration-based techniques are rarely applied directly to diesel engine fault diagnosis, because the surface vibration signals of diesel engines have complex non-stationary and nonlinear time-varying features. To investigate the fault diagnosis of diesel engines, fractal correlation dimension, wavelet energy and entropy, which reflect the fractal and energy characteristics of diesel engine faults, are extracted from the decomposed signals by analyzing vibration acceleration signals measured at the cylinder head in seven different valve train states. An intelligent fault detector, FastICA-SVM, is applied for diesel engine fault diagnosis and classification. The results demonstrate that FastICA-SVM achieves higher classification accuracy and better generalization performance in small-sample recognition. In addition, the fractal correlation dimension, wavelet energy and entropy, used as input vectors of the FastICA-SVM classifier, produce excellent classification results. The proposed methodology improves the accuracy of feature extraction and of fault diagnosis of diesel engines.

  17. The Sample Size Influence in the Accuracy of the Image Classification of the Remote Sensing

    Directory of Open Access Journals (Sweden)

    Thomaz C. e C. da Costa

    2004-12-01

    Land use/land cover maps produced by classification of remote sensing images incorporate uncertainty. This uncertainty is measured by accuracy indices using reference samples. The size of the reference sample is defined by approximation with a binomial function, without the use of a pilot sample. In this way the accuracy is not estimated but fixed a priori. If the estimated accuracy diverges from the a priori accuracy, the sampling error will deviate from the expected error. Determining the sample size from a pilot sample (the theoretically correct procedure) is justified when no accuracy estimate is available for the work area, depending on the intended use of the remote sensing product.
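
    The abstract does not give the exact expression used, but a common form of the binomial (normal-approximation) sample size calculation is n = z^2 p (1 - p) / e^2, where p is the accuracy assumed a priori, e is the tolerated half-width of the confidence interval and z is the normal quantile. The sketch below implements that standard formula as an assumption; the function and parameter names are hypothetical.

```python
import math


def reference_sample_size(expected_accuracy: float, half_width: float,
                          z: float = 1.96) -> int:
    """Number of reference samples from the normal approximation to the
    binomial: n = z^2 * p * (1 - p) / e^2, rounded up."""
    if not 0.0 < expected_accuracy < 1.0:
        raise ValueError("expected_accuracy must lie strictly between 0 and 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    p = expected_accuracy
    return math.ceil(z * z * p * (1.0 - p) / (half_width ** 2))


# A priori accuracy of 85% with a +/-5% interval at roughly 95% confidence.
print(reference_sample_size(0.85, 0.05))
```

    Because p is fixed in advance, the resulting n is only appropriate if the map accuracy later estimated from the sample is close to p; a lower estimated accuracy implies a wider interval than planned, which is the deviation the abstract warns about.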

  18. Detection of Alzheimer's disease using group lasso SVM-based region selection

    Science.gov (United States)

    Sun, Zhuo; Fan, Yong; Lelieveldt, Boudewijn P. F.; van de Giessen, Martijn

    2015-03-01

    Alzheimer's disease (AD) is one of the most frequent forms of dementia and an increasingly challenging public health problem. In the last two decades, structural magnetic resonance imaging (MRI) has shown potential in distinguishing patients with Alzheimer's disease from elderly controls (CN). To obtain AD-specific biomarkers, previous research used either statistical testing to find statistically significant different regions between the two clinical groups, or l1 sparse learning to select isolated features in the image domain. In this paper, we propose a new framework that uses structural MRI to simultaneously distinguish the two clinical groups and find the biomarkers of AD, using a group lasso support vector machine (SVM). The group lasso term (mixed l1-l2 norm) introduces anatomical information from the image domain into the feature domain, such that the resulting set of selected voxels is more meaningful than that of the l1 sparse SVM. Because of large inter-structure size variation, we introduce a group-specific normalization factor to deal with the structure size bias. Experiments have been performed on a well-designed AD vs. CN dataset to validate our method. Compared with the l1 sparse SVM approach, our method achieved better classification performance and a more meaningful biomarker selection. When we varied the training set, the regions selected by our method were more stable than those of the l1 sparse SVM. Classification experiments showed that our group normalization led to higher classification accuracy with fewer selected regions than the non-normalized method. Compared with state-of-the-art AD vs. CN classification methods, our approach not only obtains a high accuracy on the same dataset but, more importantly, simultaneously finds the brain anatomies that are closely related to the disease.

  19. Automatic epileptic seizure detection in EEGs using MF-DFA, SVM based on cloud computing.

    Science.gov (United States)

    Zhang, Zhongnan; Wen, Tingxi; Huang, Wei; Wang, Meihong; Li, Chunfeng

    2017-01-01

    Epilepsy is a chronic disease with transient brain dysfunction that results from the sudden abnormal discharge of neurons in the brain. Since electroencephalogram (EEG) is a harmless and noninvasive detection method, it plays an important role in the detection of neurological diseases. However, the process of analyzing EEG to detect neurological diseases is often difficult because the brain electrical signals are random, non-stationary and nonlinear. In order to overcome this difficulty, this study aims to develop a new computer-aided scheme for automatic epileptic seizure detection in EEGs based on multi-fractal detrended fluctuation analysis (MF-DFA) and support vector machine (SVM). The new scheme first extracts features from EEG by MF-DFA. Then, the scheme applies a genetic algorithm (GA) to calculate the parameters used in SVM and classifies the training data according to the selected features using SVM. Finally, the trained SVM classifier is used to detect neurological diseases. The algorithm uses the MLlib library of Spark and runs on a cloud platform. Applied to a public dataset, the new feature extraction method and scheme can detect signals with fewer features, and the classification accuracy reached up to 99%. MF-DFA is a promising approach to extract features for analyzing EEG, because of its simple algorithm procedure and fewer parameters. The features obtained by MF-DFA can represent samples as well as the traditional wavelet transform and Lyapunov exponents. GA can always find useful parameters for SVM given enough execution time. The results illustrate that the classification model can achieve comparable accuracy, which means that it is effective in epileptic seizure detection.

  20. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    This paper focuses on feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupled with a support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of the original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, BQPSO coupled with SVM (BQPSO/SVM), binary PSO coupled with SVM (BPSO/SVM), and a genetic algorithm coupled with SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
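
    The LOOCV procedure mentioned above can be sketched as follows. This is an illustration only: a 1-nearest-neighbour rule stands in for the SVM so that the example needs no external library, no gene selection is performed, and the function names and toy data are hypothetical.

```python
from typing import Callable, List, Sequence

Point = List[float]
Predictor = Callable[[List[Point], List[int], Point], int]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbour_predict(train_x: List[Point], train_y: List[int],
                              x: Point) -> int:
    """Stand-in classifier (not an SVM): label of the closest training sample."""
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    return train_y[best]


def loocv_accuracy(xs: List[Point], ys: List[int],
                   predict: Predictor) -> float:
    """Hold out each sample once, train on the rest, and report the
    fraction of held-out samples that are predicted correctly."""
    if len(xs) < 2 or len(xs) != len(ys):
        raise ValueError("need at least two samples with matching labels")
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy data (invented): two well-separated clusters with 2 features each.
toy_x = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
toy_y = [0, 0, 0, 1, 1, 1]
print(loocv_accuracy(toy_x, toy_y, nearest_neighbour_predict))
```

    LOOCV trains one model per sample, which is affordable for microarray data sets with few samples and makes use of almost all data for training in every fold.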

  1. Classification Accuracy and Acceptability of the Integrated Screening and Intervention System Teacher Rating Form

    Science.gov (United States)

    Daniels, Brian; Volpe, Robert J.; Fabiano, Gregory A.; Briesch, Amy M.

    2017-01-01

    This study examines the classification accuracy and teacher acceptability of a problem-focused screener for academic and disruptive behavior problems, which is directly linked to evidence-based intervention. Participants included 39 classroom teachers from 2 public school districts in the Northeastern United States. Teacher ratings were obtained…

  2. Concurrent Validity and Classification Accuracy of Curriculum-Based Measurement for Written Expression

    Science.gov (United States)

    Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.

    2016-01-01

    The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…

  3. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    Science.gov (United States)

    Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...

  4. Using Multidimensional ADTPE and SVM for Optical Modulation Real-Time Recognition

    Directory of Open Access Journals (Sweden)

    Junyu Wei

    2016-01-01

    Full Text Available Based on the feature extraction of multidimensional asynchronous delay-tap plot entropy (ADTPE) and multiclass classification of support vector machine (SVM), we propose a method for recognition of multiple optical modulation formats and various data rates. We first present the algorithm of multidimensional ADTPE, which is extracted from asynchronous delay sampling pairs of the modulated optical signal. Then, a multiclass SVM is utilized for fast and accurate classification of several widely-used optical modulation formats. In addition, a simple real-time recognition scheme is designed to reduce the computation time. Compared to the existing method based on asynchronous delay-tap plot (ADTP), the theoretical analysis and simulation results show that our recognition method can effectively enhance the tolerance of transmission impairments, obtaining relatively high accuracy. Finally, it is further demonstrated that the proposed method can be integrated in an optical transport network (OTN) with flexible expansion. By simply adding the corresponding sub-SVM module in the digital signal processor (DSP), arbitrary new modulation formats can be recognized with high recognition accuracy in a short response time.

  5. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n^3) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  6. A Study on SVM Based on the Weighted Elitist Teaching-Learning-Based Optimization and Application in the Fault Diagnosis of Chemical Process

    Directory of Open Access Journals (Sweden)

    Cao Junxiang

    2015-01-01

    Full Text Available Teaching-Learning-Based Optimization (TLBO) is a new swarm intelligence optimization algorithm that simulates the class learning process. To address problems of the traditional TLBO such as low optimization efficiency and poor stability, this paper proposes an improved TLBO algorithm, mainly by introducing the elite concept into TLBO and adopting different inertia weight decreasing strategies for elite and ordinary individuals in the teacher stage and the student stage. In this paper, the validity of the improved TLBO is verified by the optimization of several typical test functions, and the SVM optimized by the weighted elitist TLBO is used in the diagnosis and classification of common failure data of the TE chemical process. Compared with SVMs combined with other traditional optimization methods, the SVM optimized by the weighted elitist TLBO shows a certain improvement in the accuracy of fault diagnosis and classification.

  7. Land cover classification accuracy as a function of sensor spatial resolution

    Science.gov (United States)

    Markham, B. L.; Townshend, J. R. G.

    1981-01-01

    The benefits obtained from sensor systems for monitoring earth resources will depend on the application and interpretation methods used. A frequently used analysis method is supervised per-pixel multispectral classification with a typical application being land cover classification. An investigation is conducted to evaluate the effect of spatial resolution on the ability to classify land cover types with per-pixel digital image classification techniques. Attention is also given to the documentation of changes in scene noise and the percentage of boundary pixels as a function of spatial resolution, in order to improve the understanding of the interrelationship between classification accuracy and spatial resolution. It is found that scene noise varies considerably between land cover categories. Changes in scene noise with coarsening resolution occur at different rates for different categories.

  8. A Support Vector Machine Hydrometeor Classification Algorithm for Dual-Polarization Radar

    Directory of Open Access Journals (Sweden)

    Nicoletta Roberto

    2017-07-01

    Full Text Available An algorithm based on a support vector machine (SVM) is proposed for hydrometeor classification. The training phase is driven by the output of a fuzzy logic hydrometeor classification algorithm, i.e., the most popular approach for hydrometeor classification algorithms used for ground-based weather radar. The performance of SVM is evaluated by resorting to a weather scenario, generated by a weather model; the corresponding radar measurements are obtained by simulation and by comparing results of SVM classification with those obtained by a fuzzy logic classifier. Results based on the weather model and simulations show a higher accuracy of the SVM classification. Objective comparison of the two classifiers applied to real radar data shows that SVM classification maps are spatially more homogeneous (the textural indices energy and homogeneity increase by 21% and 12%, respectively) and do not present non-classified data. The improvements found by the SVM classifier, even though it is applied pixel-by-pixel, can be attributed to its ability to learn from the entire hyperspace of radar measurements and to the accurate training. The reliability of results and higher computing performance make SVM attractive for some challenging tasks such as its implementation in Decision Support Systems for helping pilots to make optimal decisions about changes in the flight route caused by unexpected adverse weather.

  9. Grouped fuzzy SVM with EM-based partition of sample space for clustered microcalcification detection.

    Science.gov (United States)

    Wang, Huiya; Feng, Jun; Wang, Hongyu

    2017-07-20

    Detection of clustered microcalcification (MC) from mammograms plays an essential role in computer-aided diagnosis for early stage breast cancer. To tackle problems associated with the diversity of data structures of MC lesions and the variability of normal breast tissues, multi-pattern sample space learning is required. In this paper, a novel grouped fuzzy Support Vector Machine (SVM) algorithm with sample space partition based on Expectation-Maximization (EM) (called G-FSVM) is proposed for clustered MC detection. The diversified pattern of training data is partitioned into several groups based on the EM algorithm. Then a series of fuzzy SVMs is integrated for classification with each group of samples from the MC lesions and normal breast tissues. From the DDSM database, a total of 1,064 suspicious regions are selected from 239 mammograms, and the measurements of Accuracy, True Positive Rate (TPR), False Positive Rate (FPR) and EVL = TPR* 1-FPR are 0.82, 0.78, 0.14 and 0.72, respectively. The proposed method incorporates the merits of fuzzy SVM and multi-pattern sample space learning, decomposing the MC detection problem into a series of simple two-class classifications. Experimental results from synthetic data and the DDSM database demonstrate that our integrated classification framework reduces the false positive rate significantly while maintaining the true positive rate.
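
    The first three measures quoted in this abstract follow from the confusion-matrix counts of a two-class detector. The sketch below shows the standard definitions; the counts are hypothetical, chosen only so that the ratios equal the reported 0.82, 0.78 and 0.14, and are not the paper's actual counts. The EVL measure is left out because its formula is ambiguous as printed.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (accuracy, true positive rate, false positive rate)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    tpr = tp / (tp + fn)  # fraction of true lesions that were detected
    fpr = fp / (fp + tn)  # fraction of normal regions flagged as lesions
    return accuracy, tpr, fpr


# Hypothetical counts for illustration only (100 positives, 100 negatives).
acc, tpr, fpr = detection_metrics(tp=78, fp=14, tn=86, fn=22)
print(round(acc, 2), round(tpr, 2), round(fpr, 2))
```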

  10. Impacts of Sample Design for Validation Data on the Accuracy of Feedforward Neural Network Classification

    Directory of Open Access Journals (Sweden)

    Giles M. Foody

    2017-08-01

    Full Text Available Validation data are often used to evaluate the performance of a trained neural network and used in the selection of a network deemed optimal for the task at hand. Optimality is commonly assessed with a measure, such as overall classification accuracy. The latter is often calculated directly from a confusion matrix showing the counts of cases in the validation set with particular labelling properties. The sample design used to form the validation set can, however, influence the estimated magnitude of the accuracy. Commonly, the validation set is formed with a stratified sample to give balanced classes, but also via random sampling, which reflects class abundance. It is suggested that if the ultimate aim is to accurately classify a dataset in which the classes do vary in abundance, a validation set formed via random, rather than stratified, sampling is preferred. This is illustrated with the classification of simulated and remotely-sensed datasets. With both datasets, statistically significant differences in the accuracy with which the data could be classified arose from the use of validation sets formed via random and stratified sampling (z = 2.7 and 1.9 for the simulated and real datasets respectively, for both p < 0.05). The accuracy of the classifications that used a stratified sample in validation was smaller, a result of cases of an abundant class being commissioned into a rarer class. Simple means to address the issue are suggested.
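
    Why the sample design changes the estimated overall accuracy can be seen from a short calculation: overall accuracy is the per-class accuracy weighted by each class's share of the validation set. The sketch below uses hypothetical class shares and per-class accuracies, not values from the paper.

```python
def expected_overall_accuracy(class_share, per_class_accuracy):
    """Overall accuracy as the share-weighted mean of per-class accuracies."""
    return sum(share * acc for share, acc in zip(class_share, per_class_accuracy))


# Hypothetical classifier: 95% correct on an abundant class, 70% on a rare one.
per_class_accuracy = [0.95, 0.70]

# Random sampling reflects class abundance (here 90% / 10%).
random_design = expected_overall_accuracy([0.9, 0.1], per_class_accuracy)

# Stratified sampling with balanced classes weights both classes equally.
balanced_design = expected_overall_accuracy([0.5, 0.5], per_class_accuracy)

print(round(random_design, 3), round(balanced_design, 3))
```

    With these assumed numbers the balanced design reports a lower overall accuracy than the random design, although the classifier is identical in both cases.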

  11. Comparison of SVM, RF and ELM on an Electronic Nose for the Intelligent Evaluation of Paraffin Samples

    Directory of Open Access Journals (Sweden)

    Hong Men

    2018-01-01

    Full Text Available Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.

  12. Comparison of SVM, RF and ELM on an Electronic Nose for the Intelligent Evaluation of Paraffin Samples

    Science.gov (United States)

    Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan

    2018-01-01

    Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328

  13. Discrimination between Alzheimer's Disease and Mild Cognitive Impairment Using SOM and PSO-SVM

    Directory of Open Access Journals (Sweden)

    Shih-Ting Yang

    2013-01-01

    Full Text Available In this study, an MRI-based classification framework was proposed to distinguish the patients with AD and MCI from normal participants by using multiple features and different classifiers. First, we extracted features (volume and shape) from MRI data by using a series of image processing steps. Subsequently, we applied principal component analysis (PCA) to convert a set of features of possibly correlated variables into a smaller set of values of linearly uncorrelated variables, decreasing the dimensions of feature space. Finally, we developed a novel data mining framework in combination with support vector machine (SVM) and particle swarm optimization (PSO) for the AD/MCI classification. In order to compare the hybrid method with traditional classifier, two kinds of classifiers, that is, SVM and a self-organizing map (SOM), were trained for patient classification. With the proposed framework, the classification accuracy is improved up to 82.35% and 77.78% in patients with AD and MCI. The result achieved up to 94.12% and 88.89% in AD and MCI by combining the volumetric features and shape features and using PCA. The present results suggest that novel multivariate methods of pattern matching reach a clinically relevant accuracy for the a priori prediction of the progression from MCI to AD.

  14. Bearing Fault Diagnosis Based on Improved Locality-Constrained Linear Coding and Adaptive PSO-Optimized SVM

    Directory of Open Access Journals (Sweden)

    Haodong Yuan

    2017-01-01

    Full Text Available A novel bearing fault diagnosis method based on improved locality-constrained linear coding (LLC) and adaptive PSO-optimized support vector machine (SVM) is proposed. In traditional LLC, each feature is encoded by using a fixed number of bases without considering the distribution of the features and the weight of the bases. To address these problems, an improved LLC algorithm based on adaptive and weighted bases is proposed. Firstly, preliminary features are obtained by wavelet packet node energy. Then, dictionary learning with the class-wise K-SVD algorithm is implemented. Subsequently, based on the learned dictionary, the LLC codes can be solved using the improved LLC algorithm. Finally, SVM optimized by adaptive particle swarm optimization (PSO) is utilized to classify the discriminative LLC codes and thus bearing fault diagnosis is realized. In the dictionary learning stage, other methods such as selecting the samples themselves as the dictionary and K-means are also conducted for comparison. The experiment results show that the LLC codes can effectively extract the bearing fault characteristics and the improved LLC outperforms traditional LLC. The dictionary learned by class-wise K-SVD achieves the best performance. Additionally, adaptive PSO-optimized SVM can greatly enhance the classification accuracy compared with SVM using default parameters and linear SVM.

  15. An IPSO-SVM algorithm for security state prediction of mine production logistics system

    Science.gov (United States)

    Zhang, Yanliang; Lei, Junhui; Ma, Qiuli; Chen, Xin; Bi, Runfang

    2017-06-01

    A theoretical basis for the regulation of corporate security warning and resources is provided in order to reveal the laws behind the security state in mine production logistics. Because the mine production logistics system is complex and its variables are difficult to acquire, a security status prediction model for the system based on improved particle swarm optimization and support vector machine (IPSO-SVM) is proposed in this paper. Firstly, through linear adjustment of the inertia weight and learning weights, the convergence speed and search accuracy are enhanced to deal with the changeable complexity and the difficulty of data acquisition. The improved particle swarm optimization (IPSO) is then introduced to resolve the problem of parameter setting in traditional support vector machines (SVM). At the same time, a security status index system is built to determine the classification standards of safety status. The feasibility and effectiveness of this method are finally verified using the experimental results.
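
    A linearly adjusted inertia weight, as mentioned in this abstract, can be sketched in a plain particle swarm optimizer. The code below is a generic illustration under stated assumptions, not the paper's IPSO: the inertia weight falls linearly from w_start to w_end, the learning factors c1 and c2 are kept fixed for brevity, and a simple sphere function stands in for the SVM parameter-tuning objective. All names and default values are hypothetical.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=50,
                 w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, seed=0):
    """Minimize objective with PSO using a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]  # best value found so far, recorded per iteration
    for t in range(n_iter):
        # Inertia weight decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Stand-in objective; a real use would return a validation error."""
    return sum(v * v for v in x)


best, best_val, history = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_val)
```

    In an SVM setting the particle position would typically encode parameters such as the penalty C and a kernel width, and the objective would return a cross-validated error.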

  16. Biased binomial assessment of cross-validated estimation of classification accuracies illustrated in diagnosis predictions

    Directory of Open Access Journals (Sweden)

    Quentin Noirhomme

    2014-01-01

    Full Text Available Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain–computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.
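
    The permutation test recommended in this abstract re-runs the whole cross-validation on label-shuffled data, so the null distribution inherits the dependence that cross-validation introduces. The sketch below is a minimal illustration with assumptions stated plainly: a 1-D nearest-class-mean classifier and leave-one-out cross-validation stand in for the study's classifiers and schemes, and the data are hypothetical.

```python
import random


def loo_accuracy(x, y):
    """Leave-one-out accuracy of a 1-D nearest-class-mean classifier."""
    correct = 0
    for i in range(len(x)):
        train = [(xj, yj) for j, (xj, yj) in enumerate(zip(x, y)) if j != i]
        means = {}
        for label in (0, 1):
            vals = [xj for xj, yj in train if yj == label]
            means[label] = sum(vals) / len(vals)
        pred = min(means, key=lambda label: abs(x[i] - means[label]))
        correct += pred == y[i]
    return correct / len(x)


def permutation_p_value(x, y, n_permutations=200, seed=0):
    """Compare the observed CV accuracy with accuracies under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(x, y)
    shuffled = list(y)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between features and labels
        if loo_accuracy(x, shuffled) >= observed:
            count += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (count + 1) / (n_permutations + 1)


# Hypothetical, well-separated two-class data (five samples per class).
x = [0.0, 0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 1.3, 1.4]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
observed, p_value = permutation_p_value(x, y)
print(observed, p_value)
```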

  17. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM)

    Science.gov (United States)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-01

    In this work, the data fusion strategy of Fourier transform mid infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machine (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used for selecting an optimal combination of key wavenumbers of second derivative FT-MIR spectra, and thirteen elements were sorted with variable importance in projection (VIP) scores. Secondly, thirteen subsets of multi-elements with the best VIP score were generated and each subset was used to fuse with FT-MIR. Finally, the classification models were established by SVM, and the combination of parameter C and γ (gamma) of SVM models was calculated by the approaches of grid search (GS) and genetic algorithm (GA). The results showed that both GS-SVM and GA-SVM models achieved good performances based on the #9 subset and the prediction accuracy in calibration and validation sets of the two models were 81.40% and 90.91%, correspondingly. In conclusion, it indicated that the data fusion strategy of FT-MIR and ICP-AES coupled with the algorithm of SVM can be used as a reliable tool for accurate identification of B. edulis, and it can provide a useful way of thinking for the quality control of edible mushrooms.

  18. Boosting accuracy of automated classification of fluorescence microscope images for location proteomics

    Directory of Open Access Journals (Sweden)

    Huang Kai

    2004-06-01

    Full Text Available Background: Detailed knowledge of the subcellular location of each expressed protein is critical to a full understanding of its function. Fluorescence microscopy, in combination with methods for fluorescent tagging, is the most suitable current method for proteome-wide determination of subcellular location. Previous work has shown that neural network classifiers can distinguish all major protein subcellular location patterns in both 2D and 3D fluorescence microscope images. Building on these results, we evaluate here new classifiers and features to improve the recognition of protein subcellular location patterns in both 2D and 3D fluorescence microscope images. Results: We report here a thorough comparison of the performance on this problem of eight different state-of-the-art classification methods, including neural networks, support vector machines with linear, polynomial, radial basis, and exponential radial basis kernel functions, and ensemble methods such as AdaBoost, Bagging, and Mixtures-of-Experts. Ten-fold cross validation was used to evaluate each classifier with various parameters on different Subcellular Location Feature sets representing both 2D and 3D fluorescence microscope images, including new feature sets incorporating features derived from Gabor and Daubechies wavelet transforms. After optimal parameters were chosen for each of the eight classifiers, optimal majority-voting ensemble classifiers were formed for each feature set. Comparison of results for each image for all eight classifiers permits estimation of the lower bound classification error rate for each subcellular pattern, which we interpret to reflect the fraction of cells whose patterns are distorted by mitosis, cell death or acquisition errors. 
Overall, we obtained statistically significant improvements in classification accuracy over the best previously published results, with the overall error rate being reduced by one-third to one-half and with the average

  19. Boosting accuracy of automated classification of fluorescence microscope images for location proteomics.

    Science.gov (United States)

    Huang, Kai; Murphy, Robert F

    2004-06-18

    Detailed knowledge of the subcellular location of each expressed protein is critical to a full understanding of its function. Fluorescence microscopy, in combination with methods for fluorescent tagging, is the most suitable current method for proteome-wide determination of subcellular location. Previous work has shown that neural network classifiers can distinguish all major protein subcellular location patterns in both 2D and 3D fluorescence microscope images. Building on these results, we evaluate here new classifiers and features to improve the recognition of protein subcellular location patterns in both 2D and 3D fluorescence microscope images. We report here a thorough comparison of the performance on this problem of eight different state-of-the-art classification methods, including neural networks, support vector machines with linear, polynomial, radial basis, and exponential radial basis kernel functions, and ensemble methods such as AdaBoost, Bagging, and Mixtures-of-Experts. Ten-fold cross validation was used to evaluate each classifier with various parameters on different Subcellular Location Feature sets representing both 2D and 3D fluorescence microscope images, including new feature sets incorporating features derived from Gabor and Daubechies wavelet transforms. After optimal parameters were chosen for each of the eight classifiers, optimal majority-voting ensemble classifiers were formed for each feature set. Comparison of results for each image for all eight classifiers permits estimation of the lower bound classification error rate for each subcellular pattern, which we interpret to reflect the fraction of cells whose patterns are distorted by mitosis, cell death or acquisition errors. 
Overall, we obtained statistically significant improvements in classification accuracy over the best previously published results, with the overall error rate being reduced by one-third to one-half and with the average accuracy for single 2D images being higher than

  20. A Novel Feature Extraction Approach Using Window Function Capturing and QPSO-SVM for Enhancing Electronic Nose Performance

    Directory of Open Access Journals (Sweden)

    Xiuzhen Guo

    2015-06-01

    Full Text Available In this paper, a novel feature extraction approach which can be referred to as moving window function capturing (MWFC) has been proposed to analyze signals of an electronic nose (E-nose) used for detecting types of infectious pathogens in rat wounds. Meanwhile, a quantum-behaved particle swarm optimization (QPSO) algorithm is implemented in conjunction with support vector machine (SVM) for realizing a synchronization optimization of the sensor array and SVM model parameters. The results prove the efficacy of the proposed method for E-nose feature extraction, which can lead to a higher classification accuracy rate compared to other established techniques. Meanwhile it is interesting to note that different classification results can be obtained by changing the types, widths or positions of windows. By selecting the optimum window function for the sensor response, the performance of an E-nose can be enhanced.

  1. Applications of PCA and SVM-PSO Based Real-Time Face Recognition System

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Shieh

    2014-01-01

    Full Text Available This paper combines principal component analysis (PCA) with support vector machine-particle swarm optimization (SVM-PSO) for developing real-time face recognition systems. The integrated scheme adopts the SVM-PSO method to improve the validity of PCA-based image recognition systems for dynamic visual perception. Face recognition in most human-robot interaction applications is accomplished by PCA-based methods because of their dimensionality reduction. However, PCA-based systems are only suitable for processing faces with the same facial expressions and/or under the same view directions. Since the facial feature selection process can be considered as a problem of global combinatorial optimization in machine learning, the SVM-PSO is usually used as an optimal classifier of the system. In this paper, the PSO is used to implement feature selection, and the SVMs serve as fitness functions of the PSO for classification problems. Experimental results demonstrate that the proposed method simplifies features effectively and obtains higher classification accuracy.

  2. Data Driven Constraints for the SVM

    DEFF Research Database (Denmark)

    Darkner, Sune; Clemmensen, Line Katrine Harder

    2012-01-01

    . Assuming that two observations of the same subject in different states span a vector, we hypothesise that such structure of the data contains implicit information which can aid the classification, thus the name data driven constraints. We derive a constraint based on the data which allow for the use...... classifier solution, compared to the SVM i.e. reduces variance and improves classification rates. We present a quantitative measure of the information level contained in the pairing and test the method on simulated as well as a high-dimensional paired data set of ear-canal surfaces....

  3. A robust data scaling algorithm to improve classification accuracies in biomedical data.

    Science.gov (United States)

    Cao, Xi Hang; Stojkovic, Ivan; Obradovic, Zoran

    2016-09-09

    Machine learning models have been adapted in biomedical research and practice for knowledge discovery and decision support. While mainstream biomedical informatics research focuses on developing more accurate models, the importance of data preprocessing draws less attention. We propose the Generalized Logistic (GL) algorithm that scales data uniformly to an appropriate interval by learning a generalized logistic function to fit the empirical cumulative distribution function of the data. The GL algorithm is simple yet effective; it is intrinsically robust to outliers, so it is particularly suitable for diagnostic/classification models in clinical/medical applications where the number of samples is usually small; it scales the data in a nonlinear fashion, which leads to potential improvement in accuracy. To evaluate the effectiveness of the proposed algorithm, we conducted experiments on 16 binary classification tasks with different variable types and cover a wide range of applications. The resultant performance in terms of area under the receiver operation characteristic curve (AUROC) and percentage of correct classification showed that models learned using data scaled by the GL algorithm outperform the ones using data scaled by the Min-max and the Z-score algorithm, which are the most commonly used data scaling algorithms. The proposed GL algorithm is simple and effective. It is robust to outliers, so no additional denoising or outlier detection step is needed in data preprocessing. Empirical results also show models learned from data scaled by the GL algorithm have higher accuracy compared to the commonly used data scaling algorithms.
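
    The two baseline scalers named in this abstract, Min-max and Z-score, are easy to state, and a small example shows why outliers are a concern for them. The sketch below implements only these baselines on hypothetical data; the Generalized Logistic (GL) algorithm itself is not reproduced here.

```python
import statistics


def min_max_scale(values):
    """Linearly map values onto [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center values on the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical feature with one outlier (the last value).
data = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = min_max_scale(data)
standardized = z_score_scale(data)
print([round(v, 3) for v in scaled])
```

    With min-max scaling the single outlier takes the value 1 and squeezes the other four points into a narrow band near 0, which is the kind of distortion a scaler that is robust to outliers aims to avoid.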

  4. SVM and ANFIS Models for Precipitation Modeling (Case Study: GonbadKavouse)

    Directory of Open Access Journals (Sweden)

    N. Zabet Pishkhani

    2016-10-01

    Full Text Available Introduction: In recent years, intelligent models have been increasingly used as new techniques and tools for hydrological processes such as precipitation forecasting. The ANFIS model has good ability in training, construction and classification, and also has the advantage that it allows the extraction of fuzzy rules from numerical information or knowledge. Another intelligent technique that has been used in various areas in recent years is the support vector machine (SVM). In this paper, the ability of artificial intelligence methods including the support vector machine (SVM) and the adaptive neuro-fuzzy inference system (ANFIS) was analyzed in monthly precipitation prediction. Materials and Methods: The study area was the city of Gonbad in Golestan Province. The city has a temperate climate in the southern highlands and southern plains, mountains and temperate humid, semi-arid and semi-arid in the north of Gorganroud river. In total, the city's climate is temperate and humid. In the present study, monthly precipitation in Gonbad was modeled using ANFIS and SVM, and two different database structures were designed. In the first structure, the input layer consisted of mean temperature, relative humidity, pressure and wind speed at Gonbad station. In the second structure, based on the Pearson correlation coefficient, monthly precipitation data were used from four stations (Arazkoose, Bahalke, Tamar and Aqqala) that had a higher correlation with precipitation at Gonbad station. In this study, precipitation data from 1995 to 2012 were used; 80% of the data were used for model training and the remaining 20% for validation. SVM was developed by Vapnik in the 1990s. SVM has been widely recognized as a powerful tool to deal with function fitting problems. An Adaptive Neuro-Fuzzy Inference System (ANFIS) refers, in general, to an adaptive network which performs the function of a fuzzy inference system. 
The most commonly used fuzzy system in ANFIS architectures is the Sugeno model

  5. [Urban land use change detection based on high accuracy classification of multispectral remote sensing imagery].

    Science.gov (United States)

    Tong, Xiao-Hua; Zhang, Xue; Liu, Miao-Long

    2009-08-01

    In the present paper, urban land change in the Jiading district of Shanghai was studied on the basis of high accuracy classification of four epochs of multispectral remotely sensed imagery. An improved genetic-algorithm-optimized back propagation neural network approach was first employed to obtain the land cover types from the remotely sensed imagery. The urban and non-urban land types were then extracted from the classification result. According to the 16 corresponding relationships between the pixel values in the four urban land images and those in the generated urban land change image, the number of pixels of each type in the generated image was calculated according to the four plates, and urban land change in the study area was analyzed at three-year intervals. The urban development in the study area was also preliminarily revealed.

  6. Accuracy Improvement for Diabetes Disease Classification: A Case on a Public Medical Dataset

    Directory of Open Access Journals (Sweden)

    Mehrbakhsh Nilashi

    2017-09-01

    Full Text Available As a chronic disease, diabetes mellitus has emerged as a worldwide epidemic. Providing diagnostic aid for diabetes using a set of data that contains only medical information obtained without advanced medical equipment can help many people who want to discover the disease, or the risk of the disease, at an early stage. This could have a large positive impact on many people's lives. The aim of this study is to classify diabetes by developing an intelligent system using machine learning techniques. Our method is developed through clustering, noise removal and classification approaches. Accordingly, we use SOM, PCA and NN for the clustering, noise removal and classification tasks, respectively. Experimental results on the Pima Indian Diabetes dataset show that the proposed method remarkably improves the accuracy of prediction in relation to methods developed in previous studies. The hybrid intelligent system can assist medical practitioners in healthcare practice as a decision support system.

  7. A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification.

    Science.gov (United States)

    Singh, Anima; Guttag, John V

    2011-01-01

    Classification tree-based risk stratification models generate easily interpretable classification rules. This feature makes classification tree-based models appealing for use in a clinical setting, provided that they have comparable accuracy to other methods. In this paper, we present and evaluate the performance of a non-symmetric entropy-based classification tree algorithm. The algorithm is designed to accommodate class imbalance found in many medical datasets. We evaluate the performance of this algorithm, and compare it to that of SVM-based classifiers, when applied to 4219 non-ST elevation acute coronary syndrome patients. We generated SVM-based classifiers using three different strategies for handling class imbalance: cost-sensitive SVM learning, synthetic minority oversampling (SMOTE), and random majority undersampling. We used both linear and radial basis kernel-based SVMs. Our classification tree models outperformed SVM-based classifiers generated using each of the three techniques. On average, the classification tree models yielded a 14% improvement in G-score and a 21% improvement in F-score relative to the linear SVM classifiers with the best performance. Similarly, our classification tree models yielded a 12% improvement in G-score and a 21% improvement in the F-score over the best RBF kernel-based SVM classifiers.
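
    Of the three imbalance-handling strategies named in the abstract above, random majority undersampling is the simplest to illustrate. The following is a minimal sketch, not the authors' implementation: the function name `undersample_majority`, the fixed seed, and the toy data are assumptions made here for illustration only.

```python
import random
from collections import Counter


def undersample_majority(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    # Size of the smallest class decides how many samples each class keeps.
    target = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        out_x.extend(rng.sample(xs, target))
        out_y.extend([y] * target)
    return out_x, out_y


# Toy data (invented): 8 majority-class (0) and 2 minority-class (1) samples.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = undersample_majority(X, y)
print(Counter(y_bal))
```

    The balanced set would then be passed to whichever classifier is being trained; the trade-off is that majority-class information is discarded.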

  8. SA-SVM based automated diagnostic system for skin cancer

    Science.gov (United States)

    Masood, Ammara; Al-Jumaily, Adel

    2015-03-01

    Early diagnosis of skin cancer is one of the greatest challenges due to the lack of experience of general practitioners (GPs). This paper presents a clinical decision support system aimed at saving time and resources in the diagnostic process. Segmentation, feature extraction, pattern recognition, and lesion classification are the important steps in the proposed decision support system. The system analyses the images to extract the affected area using a novel segmentation method, H-FCM-LS. The underlying features that indicate the difference between melanoma and benign lesions are obtained through intensity, spatial/frequency and texture based methods. For classification, a self-advising SVM is adapted, which showed an improved classification rate compared to the standard SVM. The presented work also analyzes the performance of linear and kernel based SVMs on the specific skin lesion diagnostic problem and discusses the corresponding findings. The best diagnostic rates obtained through the proposed method are around 90.5%.

  9. Classification of Herbaceous Vegetation Using Airborne Hyperspectral Imagery

    Directory of Open Access Journals (Sweden)

    Péter Burai

    2015-02-01

    Full Text Available Alkali landscapes hold an extremely fine-scale mosaic of several vegetation types, thus it seems challenging to separate these classes by remote sensing. Our aim was to test the applicability of different image classification methods of hyperspectral data in this complex situation. To reach the highest classification accuracy, we tested traditional image classifiers (maximum likelihood classifier, MLC), machine learning algorithms (support vector machine, SVM; random forest, RF) and feature extraction (minimum noise fraction (MNF) transformation) on training datasets of different sizes. Digital images were acquired from an AISA EAGLE II hyperspectral sensor of 128 contiguous bands (400–1000 nm), a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. For the classification, we established twenty vegetation classes based on the dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF (minimum noise fraction) transformed dataset with various training sample sizes between 10 and 30 pixels. In order to select the optimal number of the transformed features, we applied SVM, RF and MLC classification to 2–15 MNF transformed bands. In the case of the original bands, SVM and RF classifiers provided high accuracy irrespective of the number of the training pixels. We found that SVM and RF produced the best accuracy when using the first nine MNF transformed bands; involving further features did not increase classification accuracy. SVM and RF provided high accuracies with the transformed bands, especially in the case of the aggregated groups. Even MLC provided high accuracy with 30 training pixels (80.78%), but the use of a smaller training dataset (10 training pixels) significantly reduced the accuracy of classification (52.56%). Our results suggest that in alkali landscapes, the application of SVM is a feasible solution, as it provided the highest accuracies compared to RF and MLC

  10. A Unified Framework for Dimensionality Reduction and Classification of Hyperspectral Data

    Science.gov (United States)

    Kolluru, P.; Pandey, K.; Padalia, H.

    2014-11-01

    The processing of hyperspectral remote sensing data for information retrieval is challenging due to its high dimensionality. Machine learning based algorithms such as the Support Vector Machine (SVM) are preferably applied to classify high dimensionality data. A single-step unified framework is required which could decide the intrinsic dimensionality of the data and achieve higher classification accuracy using SVM. This work presents the development of an SVM-based dimensionality reduction and classification (SVMDRC) framework for hyperspectral data. The proposed unified framework was tested at Los Tollos in the Rodalquilar district of Spain, which has a predominance of alunite, kaolinite, and illite minerals with sparse vegetation cover. A summer season image was utilized for implementing the proposed method. The modified broken stick rule (MBSR) was used to calculate the intrinsic dimensionality of HyMap data, which automatically reduces the number of bands. Comparison of SVMDRC with SVM clearly suggests that SVM alone is inadequate in yielding better classification accuracies for minerals from hyperspectral data and instead requires dimensionality reduction. Incorporation of the modified broken stick method in the SVMDRC framework positively influenced the feature separability and provided better classification accuracy. The mineral distribution map produced for the study area would be useful for refining the areas for mineral exploration.

  11. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Full Text Available Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it has been found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total, four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples, which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  12. Comparison of the Data Classification Approaches to Diagnose Spinal Cord Injury

    Directory of Open Access Journals (Sweden)

    Yunus Ziya Arslan

    2012-01-01

    Full Text Available In our previous study, we demonstrated that analyzing the skin impedances measured along the key points of the dermatomes might be a useful supplementary technique to enhance the diagnosis of spinal cord injury (SCI), especially for unconscious and noncooperative patients. Initially, in order to distinguish between the skin impedances of the control group and patients, artificial neural networks (ANNs) were used as the main data classification approach. However, in the present study, we have proposed two more data classification approaches, that is, support vector machine (SVM) and hierarchical cluster tree analysis (HCTA), which improved the classification rate and also the overall performance. A comparison of the performance of these three methods in classifying traumatic SCI patients and controls was presented. The classification results indicated that dendrogram analysis based on the HCTA algorithm and SVM achieved higher recognition accuracies compared to ANN. HCTA and SVM algorithms improved the classification rate and also the overall performance of SCI diagnosis.

  13. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Full Text Available Abstract Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  14. Recursive cluster elimination (RCE) for classification and feature selection from gene expression data.

    Science.gov (United States)

    Yousef, Malik; Jung, Segun; Showe, Louise C; Showe, Michael K

    2007-05-02

    Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared, poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together provide greater insight into the structure of the
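
    The elimination loop described in the abstract above (score each gene cluster, then iteratively remove the clusters that contribute least) can be sketched as follows. This is an illustrative skeleton only: the scoring callable, the `keep` and `drop_fraction` parameters, and the toy weights are assumptions made here, and the K-means clustering and SVM-based scoring used by SVM-RCE are deliberately left out and replaced by a stand-in score.

```python
def recursive_cluster_elimination(clusters, score_cluster, keep=2, drop_fraction=0.5):
    """Iteratively remove the lowest-scoring clusters.

    clusters: dict mapping a cluster id to a list of gene names.
    score_cluster: callable returning a higher-is-better score for one
        cluster (in SVM-RCE this role is played by SVM performance;
        here it is a hypothetical stand-in).
    keep: number of clusters at which elimination stops (assumed parameter).
    drop_fraction: share of remaining clusters removed per round (assumed).
    """
    active = dict(clusters)
    while len(active) > keep:
        # Rank clusters from worst to best according to the score.
        ranked = sorted(active, key=lambda cid: score_cluster(active[cid]))
        n_drop = max(1, int(len(active) * drop_fraction))
        n_drop = min(n_drop, len(active) - keep)
        for cid in ranked[:n_drop]:
            del active[cid]
    return active


# Toy example: invented per-gene weights stand in for classifier-based scores.
weights = {"g1": 0.9, "g2": 0.8, "g3": 0.1, "g4": 0.5, "g5": 0.6, "g6": 0.2}
clusters = {"c1": ["g1", "g2"], "c2": ["g3"], "c3": ["g4", "g5"], "c4": ["g6"]}


def mean_weight(genes):
    return sum(weights[g] for g in genes) / len(genes)


surviving = recursive_cluster_elimination(clusters, mean_weight, keep=2)
print(sorted(surviving))
```

    Scoring whole clusters rather than single genes is the point of contrast with RFE that the abstract draws; everything else in the sketch is a placeholder.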

  15. Optimal Features Subset Selection and Classification for Iris Recognition

    Directory of Open Access Journals (Sweden)

    Prabir Bhattacharya

    2008-06-01

    Full Text Available The selection of the optimal features subset and the classification have become an important issue in the field of iris recognition. We propose a feature selection scheme based on the multiobjective genetic algorithm (MOGA) to improve the recognition accuracy and an asymmetrical support vector machine for the classification of iris patterns. We also suggest a segmentation scheme based on the collarette area localization. The deterministic feature sequence is extracted from the iris images using the 1D log-Gabor wavelet technique, and the extracted feature sequence is used to train the support vector machine (SVM). The MOGA is applied to optimize the features sequence and to increase the overall performance based on the matching accuracy of the SVM. The parameters of SVM are optimized to improve the overall generalization performance, and the traditional SVM is modified to an asymmetrical SVM to treat the false accept and false reject cases differently and to handle the unbalanced data of a specific class with respect to the other classes. Our experimental results indicate that the performance of SVM as a classifier is better than the performance of the classifiers based on the feedforward neural network, the k-nearest neighbor, and the Hamming and the Mahalanobis distances. The proposed technique is computationally effective with recognition rates of 99.81% and 96.43% on CASIA and ICE datasets, respectively.

  16. Classification of visualization exudates fundus images results using ...

    African Journals Online (AJOL)

    The kernel function settings; linear, polynomial, quadratic and RBF have an effect on the classification results. For SVM1, the best parameter in classifying pixels is linear kernel function. The visualization results using CAC and radar chart are classified using ts accuracy. It has proven to discriminated exudates and non ...

  17. Comparison research on iot oriented image classification algorithms

    Directory of Open Access Journals (Sweden)

    Du Ke

    2016-01-01

    Full Text Available Image classification belongs to the fields of machine learning and computer vision; it aims to recognize and classify objects in image contents. How to apply image classification algorithms to large-scale data in the IoT framework is the focus of current research. Based on Anaconda, this article implements the k-NN, SVM, Softmax and Neural Network algorithms in Python, applies data normalization, random search, and HOG and colour histogram feature extraction to enhance the algorithms, runs experiments on the CIFAR-10 dataset, and then compares them in three respects: training time, test time and classification accuracy. The experimental results show that the vectorized implementation of the algorithms is more efficient than the loop implementation; the training time of k-NN is the shortest, SVM and Softmax take more time, and the training time of Neural Network is the longest; the test times of SVM, Softmax and Neural Network are much shorter than that of k-NN; Neural Network achieves the highest classification accuracy, SVM and Softmax achieve lower and similar accuracies, and k-NN achieves the lowest accuracy. The effects of the three algorithm improvement methods are obvious.

  18. Combined algorithm for improvement of fused radar and optical data classification accuracy

    Science.gov (United States)

    Karimi, Danya; Rangzan, Kazem; Akbarizadeh, Gholamreza; Kabolizadeh, Mostafa

    2017-01-01

    A new method, MICO-LDASR, is proposed to improve the classification accuracy of fused radar and optical data. The proposed algorithm combines three algorithms: multiplicative intrinsic component optimization (MICO), linear discriminant analysis (LDA), and sparse regularization (SR). MICO-LDASR first corrects the bias fields of the input images by an energy minimization process and then selects the most discriminative image features using a combination of LDA and SR (LDASR) based on a supervised feature selection and learning. Two pairs of fused radar and optical data were used in this study. Features, such as non-negative matrix factorization and textural features, were extracted from the original and bias corrected images, and, following the formation of two different types of feature matrices, the matrices were optimized based on LDASR and utilized in the two learned and unlearned forms as the inputs to rotation forest and support vector machine classifiers. The results showed that classification accuracy is greatly improved when implementing MICO-LDASR on feature matrices of Sentinel and ALOS-fused data.

  19. Optimal two-phase sampling design for comparing accuracies of two binary classification rules.

    Science.gov (United States)

    Xu, Huiping; Hui, Siu L; Grannis, Shaun

    2014-02-10

    In this paper, we consider the design for comparing the performance of two binary classification rules, for example, two record linkage algorithms or two screening tests. Statistical methods are well developed for comparing these accuracy measures when the gold standard is available for every unit in the sample, or in a two-phase study when the gold standard is ascertained only in the second phase in a subsample using a fixed sampling scheme. However, these methods do not attempt to optimize the sampling scheme to minimize the variance of the estimators of interest. In comparing the performance of two classification rules, the parameters of primary interest are the difference in sensitivities, specificities, and positive predictive values. We derived the analytic variance formulas for these parameter estimates and used them to obtain the optimal sampling design. The efficiency of the optimal sampling design is evaluated through an empirical investigation that compares the optimal sampling with simple random sampling and with proportional allocation. Results of the empirical study show that the optimal sampling design is similar for estimating the difference in sensitivities and in specificities, and both achieve a substantial amount of variance reduction with an over-sample of subjects with discordant results and under-sample of subjects with concordant results. A heuristic rule is recommended when there is no prior knowledge of individual sensitivities and specificities, or the prevalence of the true positive findings in the study population. The optimal sampling is applied to a real-world example in record linkage to evaluate the difference in classification accuracy of two matching algorithms. Copyright © 2013 John Wiley & Sons, Ltd.
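
    The parameters of primary interest named in the abstract above (differences in sensitivity, specificity, and positive predictive value between two classification rules) follow directly from each rule's confusion counts. The sketch below assumes the gold standard is known for every unit, which is the simple case rather than the two-phase design the paper studies; the function names and the counts are invented for illustration.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and positive predictive value from
    the four confusion-matrix counts of one binary classification rule."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


def difference_in_measures(counts_a, counts_b):
    """Rule A minus rule B for each accuracy measure."""
    a = accuracy_measures(*counts_a)
    b = accuracy_measures(*counts_b)
    return {name: a[name] - b[name] for name in a}


# Invented counts in the order (tp, fp, fn, tn) for two rules on the same sample.
rule_a = (80, 10, 20, 90)
rule_b = (70, 5, 30, 95)
diff = difference_in_measures(rule_a, rule_b)
for name, value in diff.items():
    print(f"{name}: {value:+.3f}")
```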

  20. Using spectrotemporal indices to improve the fruit-tree crop classification accuracy

    Science.gov (United States)

    Peña, M. A.; Liao, R.; Brenning, A.

    2017-06-01

    This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
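
    Constructing a normalized difference index from every pair of bands in a time series, as the abstract above describes, amounts to computing (b_i - b_j) / (b_i + b_j) over all band pairs, including pairs from different image dates. A minimal sketch for a single pixel follows; the band names, reflectance values, and the choice to return 0.0 when the denominator is zero are assumptions made here.

```python
from itertools import combinations


def normalized_difference_indices(bands):
    """Return the NDI for every unordered pair of bands.

    bands: dict mapping a band label (e.g. band name plus image date)
        to its reflectance value for one pixel.
    """
    ndis = {}
    for (name_i, b_i), (name_j, b_j) in combinations(bands.items(), 2):
        denom = b_i + b_j
        # Guard against division by zero (assumed convention: 0.0).
        ndis[(name_i, name_j)] = (b_i - b_j) / denom if denom != 0 else 0.0
    return ndis


# Invented reflectances for one pixel: two bands on date 1, one band on date 2.
pixel = {"nir_date1": 0.45, "red_date1": 0.05, "swir2_date2": 0.20}
ndis = normalized_difference_indices(pixel)
for pair, value in ndis.items():
    print(pair, round(value, 3))
```

    With n band-date combinations this yields n(n-1)/2 features, which is why the abstract turns to penalized LDA variants for the enhanced feature set.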

  1. Di-codon Usage for Gene Classification

    Science.gov (United States)

    Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.

    Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
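
    The codon and di-codon usage features mentioned in the abstract above can be illustrated as relative frequencies over an in-frame coding sequence. The sketch below assumes a di-codon is a pair of adjacent in-frame codons (six nucleotides) counted with a one-codon step; the abstract does not specify this, and the function names and example sequence are invented.

```python
from collections import Counter


def split_codons(seq):
    """Split a coding sequence into in-frame codons, dropping any trailing
    partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def usage_frequencies(tokens):
    """Relative frequency of each token; empty dict for empty input."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def codon_and_dicodon_usage(seq):
    codons = split_codons(seq)
    # Assumed definition: each pair of adjacent codons forms one di-codon.
    dicodons = [a + b for a, b in zip(codons, codons[1:])]
    return usage_frequencies(codons), usage_frequencies(dicodons)


codon_freq, dicodon_freq = codon_and_dicodon_usage("ATGGCCATGGCCTAA")
print(codon_freq)
print(dicodon_freq)
```

    The two frequency dictionaries would be turned into fixed-length vectors (64 codons, 4096 di-codons) before being passed to a classifier.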

  2. Comparison of accuracy of fibrosis degree classifications by liver biopsy and non-invasive tests in chronic hepatitis C

    Science.gov (United States)

    2011-01-01

    Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438

  3. Support vector machine as an alternative method for lithology classification of crystalline rocks

    Science.gov (United States)

    Deng, Chengxiang; Pan, Heping; Fang, Sinan; Amara Konaté, Ahmed; Qin, Ruidong

    2017-03-01

    With the expansion of machine learning algorithms, automatic lithology classification using well logging data is becoming significant in formation evaluation and reservoir characterization. The complicated composition and structural variations of metamorphic rocks result in more nonlinear features in well logging data and place higher demands on algorithms. Herein, the application of the support vector machine (SVM) to classifying crystalline rocks from Chinese Continental Scientific Drilling Main Hole (CCSD-MH) data is reported. We found that the SVM performs poorly on the lithology classification of crystalline rocks when training samples are imbalanced. In practice, training samples are generally limited and imbalanced because cores cannot be recovered in balanced proportions or at 100 percent. In this paper, we introduced the synthetic minority over-sampling technique (SMOTE) and Borderline-SMOTE to deal with imbalanced data. After experiments generating different quantities of training samples by SMOTE and Borderline-SMOTE, the most suitable classifier was selected to overcome the disadvantage of the SVM. Then, the popular supervised classifier back-propagation neural network (BPNN), which has been proved competent for lithology classification of crystalline rocks in previous studies, was used as a comparison to evaluate the performance of the SVM. Results show that Borderline-SMOTE can improve the SVM with substantially increased accuracy, even for minority classes, in a reasonable manner, while the SVM outperforms BPNN in lithology prediction and CCSD-MH data generalization. We demonstrate the potential of the SVM as an alternative to current methods for lithology identification of crystalline rocks.
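
    The core step of SMOTE, referred to in the abstract above, is to create a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows plain SMOTE only, not Borderline-SMOTE, and is not the authors' code; the function name, `k`, the seed, and the toy data are assumptions.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class
    feature vectors (needs at least two minority samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random point on the segment between base and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Invented two-feature minority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
new_samples = smote(minority, n_new=5)
for s in new_samples:
    print([round(v, 3) for v in s])
```

    Borderline-SMOTE differs in restricting the base samples to minority points that lie near the class boundary; that selection step is omitted here.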

  4. High-Reproducibility and High-Accuracy Method for Automated Topic Classification

    Directory of Open Access Journals (Sweden)

    Andrea Lancichinetti

    2015-01-01

    Full Text Available Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent searching, statistical characterization, and meaningful classification. Latent Dirichlet allocation (LDA) is the state of the art in topic modeling. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results that are not accurate in inferring the most suitable model parameters. Adapting approaches from community detection in networks, we propose a new algorithm that displays high reproducibility and high accuracy and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure.

  5. [SVM-based qualitative analysis of Muscat Hamburg wine produced in Tianjin region].

    Science.gov (United States)

    Zhang, Jun; Wang, Fang; Wei, Ji-Ping; Li, Chang-Wen; Yang, Hua; Shao, Chun-Fu; Zhang, Fu-Qing; Yin, Ji-Tai; Xiao, Dong-Guang

    2011-01-01

    The purpose was to identify Muscat Hamburg wines produced in the Tianjin region by scanning and analyzing dry white wine samples of different grape varieties and regions with infrared spectroscopy. A support vector machine (SVM) based method was introduced to analyze the infrared spectra of dry white wines. The pretreatment of the IR spectra was also elaborated, including baseline adjustment, noise elimination, standard normalization and elimination of the main component of abnormal sample points. The authors selected a large number of dry white wine samples of different grape regions, including 511 Muscat Hamburg wine samples, 438 Italian Riesling wine samples, 307 Chardonnay wine samples, 29 Ugni Blanc wine samples, 44 Rkatsiteli wine samples, 31 longan wine samples and 79 ZeHong wine samples. For each classification problem, 80% of the IR spectra of the wine samples were used to establish discrimination models with the SVM-based method, and the remaining 20% were used to validate the models. Experimental results showed that the proposed method is effective, since high classification accuracy, identification rate and rejection rate were achieved: over 97% for white wine samples of different grape varieties, and over 98% for Muscat Hamburg wine samples produced in different regions. The method developed in this paper therefore performed well in the qualitative classification and discrimination of Muscat Hamburg wines produced in the Tianjin region. The method has considerable potential for application owing to the speed, stability and ease of operation of the FTIR method, as well as the accuracy and reliability of the SVM method.

  6. SVM Method used to Study Gender Differences Based on Microelement

    Science.gov (United States)

    Chun, Yang; Yuan, Liu; Jun, Du; Bin, Tang

    [Objective] An SVM-based intelligent algorithm is used to study gender differences in microelement data, providing a reference for the application of microelements in healthy people, such as technical support for the investigation of cases. [Method] Long-term test results on hair microelements of healthy people were consolidated. A support vector machine (SVM) is used to build a classification model of males and females based on microelement data. The radial basis function (RBF) is adopted as the kernel function of the SVM, and the parameters C and σ are adjusted to optimize the classifier. [Result] Manganese, cadmium and nickel levels differ considerably between healthy men and women. The SVM-based classification model of microelements can distinguish males from females; the correct classification ratios were 81.71% and 66.47% based on 7 test data and 3 test data selection, respectively. [Conclusion] The SVM-based classification model of microelement data can classify males and females.
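
    For readers unfamiliar with the σ parameter mentioned above, a minimal sketch of a Gaussian RBF kernel is given here. The parameterisation exp(-||x - y||² / (2σ²)) is one common convention and is an assumption; the abstract does not state which form was used, and some libraries expose gamma = 1/(2σ²) instead. The sample vectors are invented.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

# Hypothetical microelement vectors (values are illustrative only).
sample_a = [1.0, 0.2, 0.5]
sample_b = [0.8, 0.1, 0.9]
for sigma in (0.1, 1.0, 10.0):
    print(sigma, rbf_kernel(sample_a, sample_b, sigma))
```

    A small σ makes the kernel value fall off quickly with distance, so the classifier reacts to very local structure; a large σ makes distant samples look similar. That is why σ is tuned together with the penalty parameter C.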

  7. Classification of electrocardiogram signals with support vector machines and particle swarm optimization.

    Science.gov (United States)

    Melgani, Farid; Bazi, Yakoub

    2008-09-01

    The aim of this paper is twofold. First, we present a thorough experimental study to show the superiority of the generalization capability of the support vector machine (SVM) approach in the automatic classification of electrocardiogram (ECG) beats. Second, we propose a novel classification system based on particle swarm optimization (PSO) to improve the generalization performance of the SVM classifier. For this purpose, we have optimized the SVM classifier design by searching for the best value of the parameters that tune its discriminant function, and upstream by looking for the best subset of features that feed the classifier. The experiments were conducted on the basis of ECG data from the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database to classify five kinds of abnormal waveforms and normal beats. In particular, they were organized so as to test the sensitivity of the SVM classifier and that of two reference classifiers used for comparison, i.e., the k-nearest neighbor (kNN) classifier and the radial basis function (RBF) neural network classifier, with respect to the curse of dimensionality and the number of available training beats. The obtained results clearly confirm the superiority of the SVM approach as compared to traditional classifiers, and suggest that further substantial improvements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. On an average, over three experiments making use of a different total number of training beats (250, 500, and 750, respectively), the PSO-SVM yielded an overall accuracy of 89.72% on 40438 test beats selected from 20 patient records against 85.98%, 83.70%, and 82.34% for the SVM, the kNN, and the RBF classifiers, respectively.
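
    The idea of searching classifier parameters with particle swarm optimization can be sketched as below. This is a basic global-best PSO under stated assumptions, not the authors' PSO-SVM system: the inertia and acceleration constants, the search box, and the `toy_error` function (a stand-in for a cross-validated error surface) are all hypothetical, and the feature-subset search described in the abstract is not covered.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clip it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            value = objective(pos[i])
            if value < best_val[i]:
                best_val[i], best_pos[i] = value, pos[i][:]
                if value < g_val:
                    g_val, g_pos = value, pos[i][:]
    return g_pos, g_val

# Toy stand-in for a validation error over (log2 C, log2 gamma); hypothetical.
def toy_error(params):
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 5.0) ** 2

BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]
best_params, best_error = pso_minimize(toy_error, BOUNDS)
print(best_params, best_error)
```

    In a real setting `toy_error` would be replaced by a routine that trains the classifier with the candidate parameters and returns its validation error, which makes each objective evaluation far more expensive than in this sketch.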

  8. Reverse Classification Accuracy: Predicting Segmentation Performance in the Absence of Ground Truth.

    Science.gov (United States)

    Valindria, Vanya V; Lavdas, Ioannis; Bai, Wenjia; Kamnitsas, Konstantinos; Aboagye, Eric O; Rockall, Andrea G; Rueckert, Daniel; Glocker, Ben

    2017-08-01

    When integrating computational tools, such as automatic segmentation, into clinical practice, it is of utmost importance to be able to assess the level of accuracy on new data and, in particular, to detect when an automatic method fails. However, this is difficult to achieve due to the absence of ground truth. Segmentation accuracy on clinical data might be different from what is found through cross validation, because validation data are often used during incremental method development, which can lead to overfitting and unrealistic performance expectations. Before deployment, performance is quantified using different metrics, for which the predicted segmentation is compared with a reference segmentation, often obtained manually by an expert. But little is known about the real performance after deployment when a reference is unavailable. In this paper, we introduce the concept of reverse classification accuracy (RCA) as a framework for predicting the performance of a segmentation method on new data. In RCA, we take the predicted segmentation from a new image to train a reverse classifier, which is evaluated on a set of reference images with available ground truth. The hypothesis is that if the predicted segmentation is of good quality, then the reverse classifier will perform well on at least some of the reference images. We validate our approach on multi-organ segmentation with different classifiers and segmentation methods. Our results indicate that it is indeed possible to predict the quality of individual segmentations, in the absence of ground truth. Thus, RCA is ideal for integration into automatic processing pipelines in clinical routine and as a part of large-scale image analysis studies.

  9. An improved multivariate analytical method to assess the accuracy of acoustic sediment classification maps.

    Science.gov (United States)

    Biondo, M.; Bartholomä, A.

    2014-12-01

    High-resolution hydroacoustic methods have been successfully employed for the detailed classification of sedimentary habitats. The fine-scale mapping of very heterogeneous, patchy sedimentary facies, and the compound effect of multiple non-linear physical processes on the acoustic signal, cause the classification of backscatter images to be subject to a great level of uncertainty. Standard procedures for assessing the accuracy of acoustic classification maps are not yet established. This study applies different statistical techniques to automatically classified acoustic images with the aim of i) quantifying the ability of backscatter to resolve grain size distributions, ii) understanding complex patterns influenced by factors other than grain size variations, and iii) designing innovative repeatable statistical procedures to spatially assess classification uncertainties. A high-frequency (450 kHz) sidescan sonar survey, carried out in 2012 in the shallow upper-mesotidal inlet of the Jade Bay (German North Sea), made it possible to map 100 km² of surficial sediment with a resolution and coverage never acquired before in the area. The backscatter mosaic was ground-truthed using a large dataset of sediment grab sample information (2009-2011). Multivariate procedures were employed for modelling the relationship between acoustic descriptors and granulometric variables in order to evaluate the correctness of acoustic class allocation and sediment group separation. Complex patterns in the acoustic signal appeared to be controlled by the combined effect of surface roughness, sorting and mean grain size variations. The area is dominated by silt and fine sand in very mixed compositions; in this fine-grained matrix, the percentage of gravel proved to be the prevailing factor affecting backscatter variability. In the absence of coarse material, sorting mostly affected the ability to detect gradual but significant changes in seabed types. Misclassification due to temporal discrepancies

  10. Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

    Science.gov (United States)

    Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

    2016-09-01

    The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5 to the DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four, and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not have a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
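
    The effect of lowering a cut score on false negative and false positive errors can be illustrated with a small sketch. The respondents below are invented for illustration and are not data from the study; the sketch also ignores the removal of the illegal acts criterion and only varies the threshold.

```python
def classify(criteria_met, threshold):
    """Return True (disorder present) when enough criteria are endorsed."""
    return criteria_met >= threshold

def error_counts(cases, threshold):
    """Count (false negatives, false positives) for a given cut score.

    `cases` is a list of (criteria_met, has_disorder) pairs.
    """
    false_neg = sum(1 for n, truth in cases
                    if truth and not classify(n, threshold))
    false_pos = sum(1 for n, truth in cases
                    if not truth and classify(n, threshold))
    return false_neg, false_pos

# Hypothetical respondents, NOT data from the study.
cases = [(8, True), (5, True), (4, True), (4, True), (3, True),
         (4, False), (2, False), (1, False), (0, False), (0, False)]

for threshold in (5, 4):
    print(threshold, error_counts(cases, threshold))
```

    With these invented cases, moving the cut score from five to four removes two false negatives at the price of one additional false positive, which is the kind of trade-off a classification accuracy comparison has to weigh.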

  11. Power quality events recognition using a SVM-based method

    Energy Technology Data Exchange (ETDEWEB)

    Cerqueira, Augusto Santiago; Ferreira, Danton Diego; Ribeiro, Moises Vidal; Duque, Carlos Augusto [Department of Electrical Circuits, Federal University of Juiz de Fora, Campus Universitario, 36036 900, Juiz de Fora MG (Brazil)]

    2008-09-15

    In this paper, a novel SVM-based method for power quality event classification is proposed. A simple approach for feature extraction is introduced, based on the subtraction of the fundamental component from the acquired voltage signal. The resulting signal is presented to a support vector machine for event classification. Results from simulation are presented and compared with two other methods, the OTFR and the LCEC. The proposed method showed improved performance at a reasonable computational cost. (author)
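
    One way to subtract the fundamental component from a sampled voltage signal is sketched below. The abstract does not say how the fundamental is estimated, so the least-squares sinusoid fit used here is an assumption, as are the function name, the 60 Hz frequency, the sampling rate and the synthetic test signal.

```python
import math

def remove_fundamental(signal, sample_rate, fundamental_hz):
    """Subtract the least-squares fit of a sinusoid at `fundamental_hz`.

    Assumes the window spans a whole number of fundamental cycles, so the
    sine and cosine projections are orthogonal.
    """
    n = len(signal)
    w = 2.0 * math.pi * fundamental_hz / sample_rate
    cos_coeff = 2.0 / n * sum(s * math.cos(w * k) for k, s in enumerate(signal))
    sin_coeff = 2.0 / n * sum(s * math.sin(w * k) for k, s in enumerate(signal))
    return [s - cos_coeff * math.cos(w * k) - sin_coeff * math.sin(w * k)
            for k, s in enumerate(signal)]

# Hypothetical test signal: 60 Hz fundamental plus a weaker 5th harmonic.
sample_rate, f0 = 7680, 60.0          # 128 samples per cycle
n_samples = 128 * 4                   # four full cycles
voltage = [math.sin(2 * math.pi * f0 * k / sample_rate)
           + 0.1 * math.sin(2 * math.pi * 5 * f0 * k / sample_rate)
           for k in range(n_samples)]
residual = remove_fundamental(voltage, sample_rate, f0)
print(max(abs(r) for r in residual))
```

    After the subtraction only the disturbance (here the fifth harmonic) remains, which is the signal that would then be passed to the classifier.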

  12. STUDY COMPARISON OF SVM-, K-NN- AND BACKPROPAGATION-BASED CLASSIFIER FOR IMAGE RETRIEVAL

    Directory of Open Access Journals (Sweden)

    Muhammad Athoillah

    2015-03-01

    Classification is a method for organizing data systematically according to previously defined rules. In recent years, classification methods have proven helpful in many areas, such as image classification, medical biology, traffic lights, and text classification. There are many methods for solving classification problems, and this variety makes it difficult for researchers to determine which method is best for a given problem. This work compares the ability of classification methods, namely Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), and Backpropagation, in the case of image retrieval with a five-category image dataset. The results show that K-NN has the best average accuracy, at 82%. It is also the fastest in average computation time, with 17.99 seconds during the retrieval session for all categories. Backpropagation, however, is the slowest of the three: on average it needed 883 seconds for the training session and 41.7 seconds for the retrieval session.

  13. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a multiclass classification problem. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: a very large number of features (genes) and only a small number of samples. The application of machine learning to microarray gene expression datasets mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, improved to handle multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are passed to separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
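
    The two structural steps described above, dividing the features into subsets and combining the outputs of the per-subset classifiers, can be sketched as follows. The abstract does not state how the features are divided or how the results are combined; the round-robin split and the majority vote used here are assumptions, and the gene identifiers are invented.

```python
from collections import Counter

def split_features(feature_names, n_subsets):
    """Deal features into `n_subsets` roughly equal, disjoint subsets."""
    return [feature_names[i::n_subsets] for i in range(n_subsets)]

def majority_vote(predictions):
    """Combine class labels predicted by several classifiers for one sample.

    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical gene identifiers and per-subset classifier outputs.
genes = [f"gene_{i}" for i in range(10)]
print(split_features(genes, 3))
print(majority_vote(["MD", "normal", "MD"]))
```

    In the setting of the abstract, each subset would go through its own feature elimination and classifier before the per-sample predictions are merged.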

  14. Land Cover Classification Accuracy Assessment Using Full-Waveform LiDAR Data

    Directory of Open Access Journals (Sweden)

    Kuan-Tsung Chang

    2015-01-01

    The geomorphology of Taiwan is characterized by marked changes in terrain, geological fractures, and frequent natural disasters. Because of sustained economic growth, urbanization and land development, the land cover in Taiwan has undergone frequent use changes. Among the various technologies for monitoring changes in land cover, remote sensing technologies, such as LiDAR, are efficient tools for collecting a broad range of spectral and spatial data. Two types of airborne LiDAR systems exist: full-waveform (FW) LiDAR and traditional discrete-echo LiDAR. Because reflected waveforms are affected by the land object material type and properties, the waveform features can be applied to analyze the characteristics specifically associated with land-cover classification (LCC). Five types of land cover that characterize the volcanic Guishan Island were investigated. The automatic LCC method was used to elucidate the spectral, geomorphometric and textural characteristics. Interpretation keys accompanied by additional information were extracted from the FW LiDAR data for subsequent statistical and separation analyses. The results show that the Gabor texture and geomorphometric features, such as the normalized digital surface model (nDSM) and slopes, can enhance the overall LCC accuracy to higher than 90%. Moreover, both the producer and user accuracy can be higher than 92% for forest and built-up types using amplitude and pulse width. Although the waveform characteristics did not perform as well as anticipated due to the waveform data sampling rate, the data provide suitable training samples for testing the waveform feature effects.
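
    Overall, producer's and user's accuracy are all read from a confusion matrix; a minimal sketch is given below. The counts and the third class name are invented for illustration and are not results from the paper. Rows are taken to be reference classes and columns mapped classes, which is a convention chosen here.

```python
def accuracy_report(confusion, class_names):
    """Overall, producer's and user's accuracy from a confusion matrix.

    confusion[i][j] counts reference class i samples mapped to class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    report = {"overall": correct / total, "producer": {}, "user": {}}
    for i, name in enumerate(class_names):
        reference_total = sum(confusion[i])              # row sum
        mapped_total = sum(row[i] for row in confusion)  # column sum
        report["producer"][name] = confusion[i][i] / reference_total
        report["user"][name] = confusion[i][i] / mapped_total
    return report

# Hypothetical counts; "grass" is an invented third class.
names = ["forest", "built-up", "grass"]
matrix = [[48, 1, 1],
          [2, 45, 3],
          [4, 2, 44]]
print(accuracy_report(matrix, names))
```

    Producer's accuracy answers how much of a reference class was mapped correctly, while user's accuracy answers how much of a mapped class is actually that class on the ground, so the two can differ noticeably for the same class.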

  15. Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants

    Directory of Open Access Journals (Sweden)

    Malik Yousef

    2016-01-01

    MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.

  16. Measurement Properties and Classification Accuracy of Two Spanish Parent Surveys of Language Development for Preschool-Age Children

    Science.gov (United States)

    Guiberson, Mark; Rodriguez, Barbara L.

    2010-01-01

    Purpose: To describe the concurrent validity and classification accuracy of 2 Spanish parent surveys of language development, the Spanish Ages and Stages Questionnaire (ASQ; Squires, Potter, & Bricker, 1999) and the Pilot Inventario-III (Pilot INV-III; Guiberson, 2008a). Method: Forty-eight Spanish-speaking parents of preschool-age children…

  17. Pre-Processing Effect on the Accuracy of Event-Based Activity Segmentation and Classification through Inertial Sensors

    Directory of Open Access Journals (Sweden)

    Benish Fida

    2015-09-01

    Inertial sensors are increasingly being used to recognize and classify physical activities in a variety of applications. For monitoring and fitness applications, it is crucial to develop methods able to segment each activity cycle, e.g., a gait cycle, so that the successive classification step may be more accurate. To increase detection accuracy, pre-processing is often used, with a concurrent increase in computational cost. In this paper, the effect of pre-processing operations on the detection and classification of locomotion activities was investigated, to check whether the presence of pre-processing significantly contributes to an increase in accuracy. The pre-processing stages evaluated in this study were inclination correction and de-noising. Level walking, step ascending, descending and running were monitored by using a shank-mounted inertial sensor. Raw and filtered segments, obtained from a modified version of a rule-based gait detection algorithm optimized for sequential processing, were processed to extract time and frequency-based features for physical activity classification through a support vector machine classifier. The proposed method accurately detected >99% gait cycles from raw data and produced >98% accuracy on these segmented gait cycles. Pre-processing did not substantially increase classification accuracy, thus highlighting the possibility of reducing the amount of pre-processing for real-time applications.

  18. Parallel SVM for the analysis of hyperspectral data

    Science.gov (United States)

    Cavallaro, Gabriele; Atli Benediktsson, Jón; Riedel, Morris

    2014-05-01

    .e., borders, edges, discontinuities, surfaces, shapes) by performing a detailed physical analysis of the structures. Mathematical morphology provides very useful tools which allow enriching the image analysis when dealing with very high resolution (VHR) images. One of the most promising of the recent developments in the field of pattern recognition are Support Vector Machines (SVMs). These are supervised learning methods which are widely used for classification and regression. In such a context, our work aims to explore some issues regarding the SVMs. In particular, SVMs require a significant computational and storage capacity due to the large number of training vectors used for the analysis of very high spatial and spectral resolution remote sensing data. Specifically, we will adopt a parallel SVM based on the iterative MapReduce in order to analyze large scale classification problems by improving the computation speed and preserving the classification accuracies.

  19. Pseudo-inverse linear discriminants for the improvement of overall classification accuracies.

    Science.gov (United States)

    Daqi, Gao; Ahmed, Dastagir; Lili, Guo; Zejian, Wang; Zhe, Wang

    2016-09-01

    This paper studies the learning and generalization performances of pseudo-inverse linear discriminant (PILDs) based on the processing minimum sum-of-squared error (MS²E) and the targeting overall classification accuracy (OCA) criterion functions. There is little practicable significance to prove the equivalency between a PILD with the desired outputs in reverse proportion to the number of class samples and an FLD with the totally projected mean thresholds. When the desired outputs of each class are assigned a fixed value, a PILD is partly equal to an FLD. With the customarily desired outputs {1, -1}, a practicable threshold is acquired, which is only related to sample sizes. If the desired outputs of each sample are changeable, a PILD has nothing in common with an FLD. The optimal threshold may thus be singled out from multiple empirical ones related to sizes and distributed regions. Depending upon the processing MS²E criteria and the actually algebraic distances, an iterative learning strategy of PILD is proposed, the outstanding advantages of which are with limited epoch, without learning rate and divergent risk. Enormous experimental results for the benchmark datasets have verified that the iterative PILDs with optimal thresholds have good learning and generalization performances, and even reach the top OCAs for some datasets among the existing classifiers.

  20. Classification and biomarker identification using gene network modules and support vector machines

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2009-10-01

    Background: Classification using microarray datasets is usually based on a small number of samples for which tens of thousands of gene expression measurements have been obtained. The selection of the genes most significant to the classification problem is a challenging issue in high dimension data analysis and interpretation. A previous study with SVM-RCE (Recursive Cluster Elimination) suggested that classification based on groups of correlated genes sometimes exhibits better performance than classification using single genes. Large databases of gene interaction networks provide an important resource for the analysis of genetic phenomena and for classification studies using interacting genes. We now demonstrate that an algorithm which integrates network information with recursive feature elimination based on SVM exhibits good performance and improves the biological interpretability of the results. We refer to the method as SVM with Recursive Network Elimination (SVM-RNE). Results: Initially, one thousand genes selected by t-test from a training set are filtered so that only genes that map to a gene network database remain. The Gene Expression Network Analysis Tool (GXNA) is applied to the remaining genes to form n clusters of genes that are highly connected in the network. Linear SVM is used to classify the samples using these clusters, and a weight is assigned to each cluster based on its importance to the classification. The least informative clusters are removed while retaining the remainder for the next classification step. This process is repeated until an optimal classification is obtained. Conclusion: More than 90% accuracy can be obtained in classification of selected microarray datasets by integrating the interaction network information with the gene expression information from the microarrays. The Matlab version of SVM-RNE can be downloaded from http://web.macam.ac.il/~myousef
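
    The elimination loop described in the Results (score each cluster, remove the least informative ones, reclassify, repeat) can be sketched as below. This is not the SVM-RNE implementation: the scoring and evaluation callbacks stand in for the linear SVM steps, and the drop fraction, the stopping rule, the cluster contents and all function names are hypothetical.

```python
def recursive_elimination(clusters, score_cluster, evaluate, drop_fraction=0.1):
    """Repeatedly drop the lowest-scoring clusters, keeping the best subset.

    clusters      : dict mapping cluster id -> list of gene names
    score_cluster : callable(cluster_genes) -> importance (higher is better)
    evaluate      : callable(list of remaining clusters) -> accuracy estimate
    """
    remaining = dict(clusters)
    best_subset = dict(remaining)
    best_accuracy = evaluate(list(remaining.values()))
    while len(remaining) > 1:
        ranked = sorted(remaining, key=lambda cid: score_cluster(remaining[cid]))
        n_drop = max(1, int(drop_fraction * len(ranked)))
        for cid in ranked[:n_drop]:          # remove least informative clusters
            del remaining[cid]
        accuracy = evaluate(list(remaining.values()))
        if accuracy >= best_accuracy:        # prefer smaller sets on ties
            best_subset, best_accuracy = dict(remaining), accuracy
    return best_subset, best_accuracy

# Hypothetical stand-ins for the SVM-based scoring and evaluation steps.
INFORMATIVE = {"g4", "g5"}

def toy_score(genes):
    return sum(g in INFORMATIVE for g in genes) / len(genes)

def toy_evaluate(cluster_list):
    genes = [g for cluster in cluster_list for g in cluster]
    return sum(g in INFORMATIVE for g in genes) / len(genes)

toy_clusters = {"c1": ["g1", "g2"], "c2": ["g3"], "c3": ["g4", "g5", "g6"]}
selected, accuracy = recursive_elimination(toy_clusters, toy_score, toy_evaluate)
print(selected, accuracy)
```

    Working on clusters rather than single genes means a whole network module is kept or discarded together, which is what gives the surviving feature set its biological interpretability.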

  1. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    With the advent of the technological era, conversion of scanned documents (handwritten or printed) into machine-editable format has attracted many researchers. This paper deals with the problem of recognition of Gujarati handwritten numerals. Gujarati numeral recognition requires performing some specific steps as part of preprocessing. For preprocessing, digitization, segmentation, normalization and thinning are performed, assuming that the image has almost no noise. Further, an affine invariant moments based model is used for feature extraction, and finally Support Vector Machine (SVM) and Fuzzy classifiers are used for numeral classification. A comparison of the SVM and Fuzzy classifiers shows that SVM achieved better results than the Fuzzy classifier.

  2. A novel transmission line protection using DOST and SVM

    Directory of Open Access Journals (Sweden)

    M. Jaya Bharata Reddy

    2016-06-01

    This paper proposes a smart fault detection, classification and location (SFDCL) methodology for transmission systems with multi-generators using discrete orthogonal Stockwell transform (DOST). The methodology is based on synchronized current measurements from remote telemetry units (RTUs) installed at both ends of the transmission line. The energy coefficients extracted with DOST from the transient current signals caused by different types of faults are utilized for real-time fault detection and classification. A support vector machine (SVM) has been deployed for locating the fault distance using the extracted coefficients. A comparative study is performed to establish the superiority of SVM over other popular computational intelligence methods, such as the adaptive neuro-fuzzy inference system (ANFIS) and the artificial neural network (ANN), for more precise and reliable estimation of fault distance. The results corroborate the effectiveness of the suggested SFDCL algorithm for real-time transmission line fault detection, classification and localization.

  3. Medical imbalanced data classification

    Directory of Open Access Journals (Sweden)

    Sara Belarouci

    2017-04-01

    In general, the imbalanced dataset is a problem often found in health applications. In medical data classification, we often face an imbalanced number of data samples, where at least one of the classes constitutes only a very small minority of the data. At the same time, it represents a difficult problem for most machine learning algorithms. There have been many works dealing with the classification of imbalanced datasets. In this paper, we propose a learning method based on a cost-sensitive extension of the Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weights, together with some rules of thumb to determine those weights. After the balancing phase, we apply different techniques (Support Vector Machine [SVM], K-Nearest Neighbor [K-NN] and Multilayer Perceptron [MLP]) to the balanced datasets. We have also compared the results obtained before and after the balancing method. We obtained the best results compared to the literature, with a classification accuracy of 100%.

  4. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameter optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, the ROC curve, and other measures are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.
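
    A coarse-to-fine grid search of the kind suggested by the term "multilevel grid search" can be sketched as follows. The abstract does not describe how the levels are built, so the scheme here (re-centre on the best point and halve the grid width at each level) is an assumption, as are the function names and the toy score that stands in for validation accuracy over (log C, log gamma).

```python
def grid(center, half_width, steps):
    """Evenly spaced values covering [center - half_width, center + half_width]."""
    return [center - half_width + 2.0 * half_width * i / (steps - 1)
            for i in range(steps)]

def multilevel_grid_search(score, center, half_width, levels=3, steps=5):
    """Coarse-to-fine search maximising `score(log_c, log_gamma)`.

    After each level the grid is re-centred on the best point found so far
    and its width is halved.
    """
    best_point, best_score = tuple(center), score(*center)
    for _ in range(levels):
        c0, g0 = best_point                  # fix this level's grid centre
        for log_c in grid(c0, half_width, steps):
            for log_gamma in grid(g0, half_width, steps):
                value = score(log_c, log_gamma)
                if value > best_score:
                    best_point, best_score = (log_c, log_gamma), value
        half_width /= 2.0
    return best_point, best_score

# Toy stand-in for validation accuracy; hypothetical, peaks at (2, -3).
def toy_score(log_c, log_gamma):
    return -((log_c - 2.0) ** 2 + (log_gamma + 3.0) ** 2)

best_point, best_score = multilevel_grid_search(toy_score, (0.0, 0.0), 8.0)
print(best_point, best_score)
```

    Three levels of a 5 x 5 grid cost 75 evaluations here, whereas a single grid with the same final spacing over the same range would need 17 x 17 = 289, which is where the efficiency gain of a multilevel scheme comes from.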

  5. Automated, high accuracy classification of Parkinsonian disorders: a pattern recognition approach.

    Directory of Open Access Journals (Sweden)

    Andre F Marquand

    Progressive supranuclear palsy (PSP), multiple system atrophy (MSA) and idiopathic Parkinson's disease (IPD) can be clinically indistinguishable, especially in the early stages, despite distinct patterns of molecular pathology. Structural neuroimaging holds promise for providing objective biomarkers for discriminating these diseases at the single subject level but all studies to date have reported incomplete separation of disease groups. In this study, we employed multi-class pattern recognition to assess the value of anatomical patterns derived from a widely available structural neuroimaging sequence for automated classification of these disorders. To achieve this, 17 patients with PSP, 14 with IPD and 19 with MSA were scanned using structural MRI along with 19 healthy controls (HCs). An advanced probabilistic pattern recognition approach was employed to evaluate the diagnostic value of several pre-defined anatomical patterns for discriminating the disorders, including: (i) a subcortical motor network; (ii) each of its component regions and (iii) the whole brain. All disease groups could be discriminated simultaneously with high accuracy using the subcortical motor network. The region providing the most accurate predictions overall was the midbrain/brainstem, which discriminated all disease groups from one another and from HCs. The subcortical network also produced more accurate predictions than the whole brain and all of its constituent regions. PSP was accurately predicted from the midbrain/brainstem, cerebellum and all basal ganglia compartments; MSA from the midbrain/brainstem and cerebellum and IPD from the midbrain/brainstem only. This study demonstrates that automated analysis of structural MRI can accurately predict diagnosis in individual patients with Parkinsonian disorders, and identifies distinct patterns of regional atrophy particularly useful for this process.

  6. Prediction of carcinogenicity for diverse chemicals based on substructure grouping and SVM modeling.

    Science.gov (United States)

    Tanabe, Kazutoshi; Lučić, Bono; Amić, Dragan; Kurita, Takio; Kaihara, Mikio; Onodera, Natsuo; Suzuki, Takahiro

    2010-11-01

    The Carcinogenicity Reliability Database (CRDB) was constructed by collecting experimental carcinogenicity data on about 1,500 chemicals from six sources, including the IARC and NTP databases, and then by ranking their reliabilities into six unified categories. A wide variety of 911 organic chemicals were selected from the database for QSAR modeling, and 1,504 kinds of different molecular descriptors were calculated, based on their 3D molecular structures as modeled by the Dragon software. Positive (carcinogenic) and negative (non-carcinogenic) chemicals containing various substructures were counted using atom and functional group count descriptors, and the statistical significance of ratios of positives to negatives was tested for those substructures. Very few were judged to be strongly related to carcinogenicity, among substructures known to be responsible for carcinogens as revealed from biomedical studies. In order to develop QSAR models for the prediction of the carcinogenicities of a wide variety of chemicals with a satisfactory performance level, the relationship between the carcinogenicity data with improved reliability and a subset of significant descriptors selected from 1,504 Dragon descriptors was analyzed with a support vector machine (SVM) method: the classification function (SVC) for weighted data in LIBSVM program was used to classify chemicals into two carcinogenic categories (positive or negative), where weights were set depending on the reliabilities of the carcinogenicity data. The quality and stability of the models presented were tested by performing a dual cross-validation procedure. A single SVM model as the first step was developed for all the 911 chemicals using 250 selected descriptors, achieving an overall accuracy level, i.e., positive and negative correct estimate, of about 70%. In order to improve the accuracy of the final model, the 911 chemicals were classified into 20 mutually overlapping subgroups according to contained substructures

  7. sw-SVM: sensor weighting support vector machines for EEG-based brain-computer interfaces

    Science.gov (United States)

    Jrad, N.; Congedo, M.; Phlypo, R.; Rousseau, S.; Flamary, R.; Yger, F.; Rakotomamonjy, A.

    2011-10-01

    In many machine learning applications, like brain-computer interfaces (BCI), high-dimensional sensor array data are available. Sensor measurements are often highly correlated and signal-to-noise ratio is not homogeneously spread across sensors. Thus, collected data are highly variable and discrimination tasks are challenging. In this work, we focus on sensor weighting as an efficient tool to improve the classification procedure. We present an approach integrating sensor weighting in the classification framework. Sensor weights are considered as hyper-parameters to be learned by a support vector machine (SVM). The resulting sensor weighting SVM (sw-SVM) is designed to satisfy a margin criterion, that is, the generalization error. Experimental studies on two data sets are presented, a P300 data set and an error-related potential (ErrP) data set. For the P300 data set (BCI competition III), for which a large number of trials is available, the sw-SVM proves to perform equivalently with respect to the ensemble SVM strategy that won the competition. For the ErrP data set, for which a small number of trials are available, the sw-SVM shows superior performance as compared to three state-of-the-art approaches. Results suggest that the sw-SVM promises to be useful in event-related potentials classification, even with a small number of training trials.

  8. A simulated Linear Mixture Model to Improve Classification Accuracy of Satellite Data Utilizing Degradation of Atmospheric Effect

    Directory of Open Access Journals (Sweden)

    WIDAD Elmahboub

    2005-02-01

    Full Text Available Researchers in remote sensing have attempted to increase the accuracy of land cover information extracted from remotely sensed imagery. Factors that influence the supervised and unsupervised classification accuracy are the presence of atmospheric effect and mixed pixel information. A linear mixture simulated model experiment is generated to simulate real world data with known end member spectral sets and class cover proportions (CCP). The CCP were initially generated by a random number generator and normalized to make the sum of the class proportions equal to 1.0 using a MATLAB program. Random noise was intentionally added to pixel values using different combinations of noise levels to simulate a real world data set. The atmospheric scattering error is computed for each pixel value for three generated images with SPOT data. Each pixel is either classified correctly or misclassified. Results showed great improvement in classification accuracy: for example, in image 1, 41% of pixels were misclassified due to atmospheric noise. After degradation of the atmospheric effect, the misclassified pixels were reduced to 4%. We can conclude that classification accuracy can be improved by degradation of atmospheric noise.
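The simulation steps in this abstract (random class cover proportions normalized to sum to 1.0, linear mixing of end-member spectra, added random noise) can be illustrated with a minimal sketch. The paper used MATLAB; the Python below is only an illustration, and the end-member values, the Gaussian noise model, and the names `simulate_mixed_pixel` and `ENDMEMBERS` are assumptions, not taken from the paper.

```python
import random

def simulate_mixed_pixel(endmembers, noise_sd, rng):
    """Return (proportions, pixel) for one simulated mixed pixel."""
    # Draw raw random weights, then normalize so the class cover
    # proportions (CCP) sum to 1.0.
    raw = [rng.random() for _ in endmembers]
    total = sum(raw)
    proportions = [r / total for r in raw]

    # Linear mixture: each band is the proportion-weighted sum of the
    # end-member values, plus zero-mean Gaussian noise (an assumption).
    n_bands = len(endmembers[0])
    pixel = []
    for b in range(n_bands):
        mixed = sum(p * em[b] for p, em in zip(proportions, endmembers))
        pixel.append(mixed + rng.gauss(0.0, noise_sd))
    return proportions, pixel

# Hypothetical end-member spectra (3 classes x 4 bands), illustrative only.
ENDMEMBERS = [
    [0.10, 0.20, 0.30, 0.40],
    [0.50, 0.40, 0.30, 0.20],
    [0.05, 0.60, 0.10, 0.70],
]

rng = random.Random(0)
proportions, pixel = simulate_mixed_pixel(ENDMEMBERS, noise_sd=0.01, rng=rng)
print(proportions, pixel)
```

Because the proportions are known by construction, a classifier's output on such pixels can be compared against ground truth at each noise level.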

  9. Identification and classification of similar looking food grains

    Science.gov (United States)

    Anami, B. S.; Biradar, Sunanda D.; Savakar, D. G.; Kulkarni, P. V.

    2013-01-01

    This paper describes a comparative study of Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers, taking as a case study the identification and classification of four pairs of similar looking food grains, namely Finger Millet, Mustard, Soyabean, Pigeon Pea, Aniseed, Cumin-seeds, Split Greengram and Split Blackgram. Algorithms are developed to acquire and process color images of these grain samples. The developed algorithms are used to extract 18 color (Hue Saturation Value, HSV) features and 42 wavelet-based texture features. A Back Propagation Neural Network (BPNN)-based classifier is designed using three feature sets, namely color-HSV, wavelet-texture and their combination. An SVM model using the color-HSV features is designed for the same set of samples. For the ANN-based models, classification accuracies range from 93% to 96% for color-HSV, from 78% to 94% for the wavelet-texture model and from 92% to 97% for the combined model. For the color-HSV based SVM model, classification accuracy ranges from 80% to 90%. The training time required for the SVM-based model is substantially less than that of the ANN for the same set of images.

  10. Improved classification accuracy of powdery mildew infection levels of wine grapes by spatial-spectral analysis of hyperspectral images.

    Science.gov (United States)

    Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo

    2017-01-01

    Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples

  11. A comparison of methods for three-class mammograms classification.

    Science.gov (United States)

    Milosevic, Marina; Jovanovic, Zeljko; Jankovic, Dragan

    2017-08-09

    Mammography is considered the gold standard for early breast cancer detection, but mammograms are very difficult to interpret for many reasons. Computer aided diagnosis (CAD) is an important development that may help to improve the performance in breast cancer detection. We present a CAD system based on feature extraction techniques for detecting abnormal patterns in digital mammograms. Computed features based on gray-level co-occurrence matrices (GLCM) are used to evaluate the effectiveness of textural information possessed by mass regions. A total of 20 texture features are extracted from each mammogram. The ability of the feature set to differentiate normal, benign and malignant tissue is investigated using a Support Vector Machine (SVM) classifier, a Naive Bayes classifier and a K-Nearest Neighbor (k-NN) classifier. Classification efficiency is estimated using cross-validation. The Support Vector Machine was originally designed for binary classification. We constructed a three-class SVM classifier by combining two binary classifiers and then compared its performance with classifiers intended for multi-class classification. To evaluate the classification performance, confusion matrix and Receiver Operating Characteristic (ROC) analysis were performed. The results indicate that SVM classification results are better than the k-NN and Naive Bayes classification results, with an accuracy of 65% compared with 51.6% and 38.1%, respectively. The unbalanced classification that occurs in all three classification tests is the reason for the unsatisfactory accuracy. The experimental results indicate that the proposed three-class SVM classifier is more suitable for practical use than the other two methods.

  12. Classification of Camellia (Theaceae) Species Using Leaf Architecture Variations and Pattern Recognition Techniques

    Science.gov (United States)

    Lee, Sean; Nitin, Mantri

    2012-01-01

    Leaf characters have been successfully utilized to classify Camellia (Theaceae) species; however, leaf characters combined with supervised pattern recognition techniques have not been previously explored. We present results of using leaf morphological and venation characters of 93 species from five sections of genus Camellia to assess the effectiveness of several supervised pattern recognition techniques for classifications and compare their accuracy. Clustering approach, Learning Vector Quantization neural network (LVQ-ANN), Dynamic Architecture for Artificial Neural Networks (DAN2), and C-support vector machines (SVM) are used to discriminate 93 species from five sections of genus Camellia (11 in sect. Furfuracea, 16 in sect. Paracamellia, 12 in sect. Tuberculata, 34 in sect. Camellia, and 20 in sect. Theopsis). DAN2 and SVM show excellent classification results for genus Camellia with DAN2's accuracy of 97.92% and 91.11% for training and testing data sets respectively. The RBF-SVM results of 97.92% and 97.78% for training and testing offer the best classification accuracy. A hierarchical dendrogram based on leaf architecture data has confirmed the morphological classification of the five sections as previously proposed. The overall results suggest that leaf architecture-based data analysis using supervised pattern recognition techniques, especially DAN2 and SVM discrimination methods, is excellent for identification of Camellia species. PMID:22235330

  13. Progressive Classification Using Support Vector Machines

    Science.gov (United States)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified.
The user
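The two-pass scheme described in this abstract (fast first pass, confidence index, then reclassification in order of lowest confidence) can be sketched as follows. This is only an illustration under stated assumptions: the two callables stand in for the fast and slow SVMs, and `budget`, `fast_model`, and `slow_model` are hypothetical names, not part of the original algorithm's interface.

```python
def progressive_classify(points, fast, slow, budget):
    """Classify with `fast`, then re-check the least confident points with `slow`.

    `fast(x)` returns (label, confidence); `slow(x)` returns a label.
    `budget` caps how many points the slow classifier may revisit.
    """
    first_pass = [fast(x) for x in points]
    labels = [label for label, _ in first_pass]

    # Lowest confidence first: these points are the most likely to be wrong.
    order = sorted(range(len(points)), key=lambda i: first_pass[i][1])
    for i in order[:budget]:
        labels[i] = slow(points[i])
    return labels

# Hypothetical stand-ins for the two SVMs (1-D inputs for brevity).
def fast_model(x):
    # Coarse boundary at 0; confidence is the distance from that boundary.
    return (1 if x > 0 else -1), abs(x)

def slow_model(x):
    # A more accurate boundary, assumed here to sit at 0.3.
    return 1 if x > 0.3 else -1

points = [-2.0, 0.1, 0.2, 1.5]
print(progressive_classify(points, fast_model, slow_model, budget=2))
# → [-1, -1, -1, 1]
```

Raising `budget` trades more computation for a result closer to what the slow classifier alone would produce; with `budget=0` the output is just the fast first pass.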

  14. Tree Species Classification Using Hyperspectral Imagery: A Comparison of Two Classifiers

    Directory of Open Access Journals (Sweden)

    Laurel Ballanti

    2016-05-01

    Full Text Available The identification of tree species can provide a useful and efficient tool for forest managers for planning and monitoring purposes. Hyperspectral data provide sufficient spectral information to classify individual tree species. Two non-parametric classifiers, support vector machines (SVM) and random forest (RF), have resulted in high accuracies in previous classification studies. This research takes a comparative classification approach to examine the SVM and RF classifiers in the complex and heterogeneous forests of Muir Woods National Monument and Kent Creek Canyon in Marin County, California. The influence of object- or pixel-based training samples and segmentation size on the object-oriented classification is also explored. To reduce the data dimensionality, a minimum noise fraction transform was applied to the mosaicked hyperspectral image, resulting in the selection of 27 bands for the final classification. Each classifier was also assessed individually to identify any advantage related to an increase in training sample size or an increase in object segmentation size. All classifications resulted in overall accuracies above 90%. No difference was found between classifiers when using object-based training samples. SVM outperformed RF when additional training samples were used. An increase in training samples was also found to improve the individual performance of the SVM classifier.

  15. Multi-category classification using an Extreme Learning Machine for microarray gene expression cancer diagnosis.

    Science.gov (United States)

    Zhang, Runxuan; Huang, Guang-Bin; Sundararajan, Narasimhan; Saratchandran, P

    2007-01-01

    In this paper, the recently developed Extreme Learning Machine (ELM) is used for direct multicategory classification problems in the cancer diagnosis area. ELM avoids problems like local minima, improper learning rate and overfitting commonly faced by iterative learning methods and completes the training very fast. We have evaluated the multi-category classification performance of ELM on three benchmark microarray datasets for cancer diagnosis, namely, the GCM dataset, the Lung dataset and the Lymphoma dataset. The results indicate that ELM produces comparable or better classification accuracies with reduced training time and implementation complexity compared to artificial neural networks methods like conventional back-propagation ANN, Linder's SANN, and Support Vector Machine methods like SVM-OVO and Ramaswamy's SVM-OVA. ELM also achieves better accuracies for classification of individual categories.

  16. Support vectors machine classification of surface electromyography for non-invasive naturally controlled hand prostheses.

    Science.gov (United States)

    Moura, Karina O A; Favieiro, Gabriela W; Balbinot, Alexandre

    2016-08-01

    Scientific research in human rehabilitation techniques has continually evolved to restore the mobility and freedom lost to disability. Many systems driven by myoelectric signals and intended to mimic the movement of the human arm still achieve only partial results, which makes them the subject of much research. The use of Natural Interfaces Signal Processing methods makes it possible to design systems capable of offering prostheses in a more natural and intuitive way. This paper presents a study investigating the use of forearm surface electromyography (sEMG) signals for classification of specific hand movements using 12 sEMG channels and a support vector machine (SVM). The system acquired the sEMG signal while using a virtual model as a visual stimulus to demonstrate to the volunteer the hand movements to be replicated. The Root Mean Square (RMS) value is extracted from the signal as a feature and serves as input data for classification with the SVM. The classification stage used three types of kernel functions (linear, polynomial, radial basis) for comparison of the results. The average accuracy for the classification of seventeen distinct movements was 83.7% using the linear SVM classifier, 80.8% using the polynomial SVM classifier and 85.1% using the radial basis SVM classifier.
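The RMS feature mentioned in this abstract is simple to compute: for a window of N samples x_1..x_N, RMS = sqrt((x_1^2 + ... + x_N^2) / N), giving one value per channel. The sketch below is illustrative only; the window length, the sample values, and the function names are assumptions, and the abstract does not specify how the signal was segmented.

```python
import math

def rms(samples):
    """Root mean square of one channel's samples within a window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def rms_features(window):
    """Map a multi-channel window (list of per-channel sample lists)
    to a feature vector with one RMS value per channel."""
    return [rms(channel) for channel in window]

# Hypothetical 12-channel window with 4 samples per channel.
window = [[0.1 * (ch + 1), -0.1 * (ch + 1), 0.1 * (ch + 1), -0.1 * (ch + 1)]
          for ch in range(12)]
features = rms_features(window)
print(len(features))  # one feature per sEMG channel
```

A 12-element vector of this kind would then be the input to whichever SVM kernel is being compared.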

  17. Using LS-SVM based motion recognition for smartphone indoor wireless positioning.

    Science.gov (United States)

    Pei, Ling; Liu, Jingbin; Guinness, Robert; Chen, Yuwei; Kuusniemi, Heidi; Chen, Ruizhi

    2012-01-01

    The paper presents an indoor navigation solution by combining physical motion recognition with wireless positioning. Twenty-seven simple features are extracted from the built-in accelerometers and magnetometers in a smartphone. Eight common motion states used during indoor navigation are detected by a Least Square-Support Vector Machines (LS-SVM) classification algorithm, e.g., static, standing with hand swinging, normal walking while holding the phone in hand, normal walking with hand swinging, fast walking, U-turning, going up stairs, and going down stairs. The results indicate that the motion states are recognized with an accuracy of up to 95.53% for the test cases employed in this study. A motion recognition assisted wireless positioning approach is applied to determine the position of a mobile user. Field tests show a 1.22 m mean error in "Static Tests" and a 3.53 m in "Stop-Go Tests".

  18. Using LS-SVM Based Motion Recognition for Smartphone Indoor Wireless Positioning

    Directory of Open Access Journals (Sweden)

    Ruizhi Chen

    2012-05-01

    Full Text Available The paper presents an indoor navigation solution by combining physical motion recognition with wireless positioning. Twenty-seven simple features are extracted from the built-in accelerometers and magnetometers in a smartphone. Eight common motion states used during indoor navigation are detected by a Least Square-Support Vector Machines (LS-SVM) classification algorithm, e.g., static, standing with hand swinging, normal walking while holding the phone in hand, normal walking with hand swinging, fast walking, U-turning, going up stairs, and going down stairs. The results indicate that the motion states are recognized with an accuracy of up to 95.53% for the test cases employed in this study. A motion recognition assisted wireless positioning approach is applied to determine the position of a mobile user. Field tests show a 1.22 m mean error in “Static Tests” and a 3.53 m in “Stop-Go Tests”.

  19. Hybrid Brain–Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review

    Science.gov (United States)

    Hong, Keum-Shik; Khan, Muhammad Jawad

    2017-01-01

    In this article, non-invasive hybrid brain–computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS), electromyography (EMG), electrooculography (EOG), and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features) relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain–computer interface (BCI) accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP) and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided. PMID:28790910

  20. Hybrid Brain-Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review.

    Science.gov (United States)

    Hong, Keum-Shik; Khan, Muhammad Jawad

    2017-01-01

    In this article, non-invasive hybrid brain-computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS), electromyography (EMG), electrooculography (EOG), and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features) relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain-computer interface (BCI) accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP) and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided.

  1. Hybrid Brain–Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review

    Directory of Open Access Journals (Sweden)

    Keum-Shik Hong

    2017-07-01

    Full Text Available In this article, non-invasive hybrid brain–computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS), electromyography (EMG), electrooculography (EOG), and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features) relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain–computer interface (BCI) accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP) and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided.

  2. [Selection of Characteristic Wavelengths Using SPA and Qualitative Discrimination of Mildew Degree of Corn Kernels Based on SVM].

    Science.gov (United States)

    Yuan, Ying; Wang, Wei; Chu, Xuan; Xi, Ming-jie

    2016-01-01

    The feasibility of Fourier transform near infrared (FT-NIR) spectroscopy with spectral range between 833 and 2 500 nm to detect moldy corn kernels with different levels of mildew was verified in this paper. Firstly, to avoid the influence of noise, moving average smoothing was used for spectral data preprocessing after four common pretreatment methods were compared. Then, to improve the prediction performance of the model, SPXY (sample set partitioning based on joint x-y distance) was selected and used for sample set partition. Furthermore, in order to reduce the dimensions of the original spectral data, the successive projection algorithm (SPA) was adopted and ultimately 7 characteristic wavelengths were extracted: 833, 927, 1 208, 1 337, 1 454, 1 861 and 2 280 nm. The experimental results showed that when the spectral data of the 7 characteristic wavelengths were taken as the input of the SVM, with the radial basis function (RBF) used as the kernel function and parameters C = 7 760 469, γ = 0.017 003, the classification accuracies of the established SVM model were 97.78% and 93.33% for the training and testing sets respectively. In addition, an independent validation set was selected by the same standard and used to verify the model, achieving a classification accuracy of 91.11%. The result indicated that it is feasible to identify and classify different degrees of mildew in corn kernels using SPA and SVM, and the characteristic wavelengths selected by SPA in this paper also lay a foundation for the online NIR detection of mildewed corn kernels.

  3. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    Science.gov (United States)

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. 
Despite

  4. Comparison of level I land cover classification accuracy for MSS and AVHRR data. [Advanced Very High Resolution Radiometers]

    Science.gov (United States)

    Gervin, J. C.; Kerber, A. G.; Witt, R. G.; Lu, Y. C.; Sekhon, R.

    1985-01-01

    The capabilities of the Advanced Very-High-Resolution Radiometer (AVHRR) for land-cover mapping were investigated by comparing the accuracy of land-cover information for the Washington, DC area derived from NOAA-7 AVHRR data with that from Landsat Multispectral Scanner Subsystem (MSS) data. Unsupervised level I land-cover classifications were performed for MSS and AVHRR data sets collected on July 11, 1981. A detailed accuracy assessment was conducted based on ground data delineated on 12 U.S. Geological Survey 7.5-min series topographic maps. These results produced overall land-cover classification accuracies of 71.9 and 76.8 per cent for AVHRR and MSS, respectively. While the accuracies for predominant categories were similar for both sensors, land-cover discrimination for less commonly occurring and/or spatially heterogeneous categories was improved with the MSS data set. The AVHRR, however, performed as well as or better than the MSS in classifying large homogeneous areas. The application of AVHRR data with its lower processing cost and more frequent worldwide coverage appears promising for regional land-cover mapping.

  5. Intrusion detection model using fusion of chi-square feature selection and multi class SVM

    Directory of Open Access Journals (Sweden)

    Ikram Sumaiya Thaseen

    2017-10-01

    Full Text Available Intrusion detection is a promising area of research in the domain of security with the rapid development of the internet in everyday life. Many intrusion detection systems (IDS) employ a sole classifier algorithm for classifying network traffic as normal or abnormal. Due to the large amount of data, these sole classifier models fail to achieve a high attack detection rate with a reduced false alarm rate. However, by applying dimensionality reduction, data can be efficiently reduced to an optimal set of attributes without loss of information and then classified accurately using a multi class modeling technique for identifying the different network attacks. In this paper, we propose an intrusion detection model using chi-square feature selection and a multi class support vector machine (SVM). A parameter tuning technique is adopted for optimization of the Radial Basis Function kernel parameter, namely gamma, represented by ‘γ’, and the overfitting constant ‘C’. These are the two important parameters required for the SVM model. The main idea behind this model is to construct a multi class SVM, which has not been adopted for IDS so far, to decrease the training and testing time and increase the individual classification accuracy of the network attacks. The investigational results on the NSL-KDD dataset, which is an enhanced version of the KDDCup 1999 dataset, show that our proposed approach results in a better detection rate and a reduced false alarm rate. An experiment on the computational time required for training and testing is also carried out for usage in time critical applications.
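To make the chi-square feature selection step concrete, the sketch below scores a categorical feature against the class label with the standard contingency-table statistic, the sum over cells of (observed - expected)^2 / expected, and keeps the highest-scoring features. It is a minimal illustration, not the paper's implementation: the feature names, the toy records, and the helper `top_k_features` are hypothetical, and the abstract does not say how features were discretized or how many were kept.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Chi-square statistic of independence between a categorical
    feature and the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat

def top_k_features(columns, labels, k):
    """Return the names of the k features with the largest statistic."""
    scores = {name: chi_square(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy records with hypothetical feature names.
labels = ["normal", "normal", "attack", "attack"]
columns = {
    "flag": ["SF", "SF", "S0", "S0"],         # tracks the label exactly
    "service": ["http", "ftp", "http", "ftp"],  # independent of the label
}
print(top_k_features(columns, labels, k=1))  # → ['flag']
```

Features that survive this filter would then be passed to the multi class SVM, whose RBF parameters are tuned separately.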

  6. Improved classification accuracy in 1- and 2-dimensional NMR metabolomics data using the variance stabilising generalised logarithm transformation

    Directory of Open Access Journals (Sweden)

    Günther Ulrich L

    2007-07-01

    Background: Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results: Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion: We have demonstrated that the glog and extended glog transforms stabilise the technical
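
    The three scaling methods compared in this abstract can be written down in a few lines. The sketch below assumes one commonly used form of the glog, z = ln(y + sqrt(y^2 + lambda)); the abstract itself does not state the formula, and the value of the transformation parameter `lam` and the sample intensities are purely illustrative.

```python
import math
from statistics import mean, stdev


def glog(y, lam):
    """Generalised log (assumed form): behaves like ln(2y) for large y, stays finite near zero."""
    return math.log(y + math.sqrt(y * y + lam))


def autoscale(values):
    """Autoscaling: mean-centre, then divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pareto_scale(values):
    """Pareto scaling: mean-centre, then divide by the square root of the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / math.sqrt(s) for v in values]


# Hypothetical intensities of one spectral bin across three samples.
intensities = [2.0, 4.0, 6.0]
print([glog(v, lam=1.0) for v in intensities])
print(autoscale(intensities))
print(pareto_scale(intensities))
```

    Autoscaling gives every variable unit variance, Pareto scaling shrinks large variances only partially, and the glog acts on each value directly rather than on per-variable statistics.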

  7. Automated, high accuracy classification of Parkinsonian disorders: a pattern recognition approach

    National Research Council Canada - National Science Library

    Marquand, Andre F; Filippone, Maurizio; Ashburner, John; Girolami, Mark; Mourao-Miranda, Janaina; Barker, Gareth J; Williams, Steven C R; Leigh, P Nigel; Blain, Camilla R V

    2013-01-01

    .... In this study, we employed multi-class pattern recognition to assess the value of anatomical patterns derived from a widely available structural neuroimaging sequence for automated classification of these disorders...

  8. Investigating the Effects of Higher Spatial Resolution on Benthic Classification Accuracy at Midway Atoll

    National Research Council Canada - National Science Library

    Arledge, Richard K; Hatcher, Ervin B

    2008-01-01

    ...s. This thesis will compare 2 multispectral systems and investigate the effects of increased spatial resolution on benthic classifications in the highly heterogeneous coral reef environment of Midway Atoll...

  9. Pulmonary subsolid nodules: value of semi-automatic measurement in diagnostic accuracy, diagnostic reproducibility and nodule classification agreement.

    Science.gov (United States)

    Kim, Hyungjin; Park, Chang Min; Hwang, Eui Jin; Ahn, Su Yeon; Goo, Jin Mo

    2017-12-01

    We hypothesized that semi-automatic diameter measurements would improve the accuracy and reproducibility in discriminating preinvasive lesions and minimally invasive adenocarcinomas from invasive pulmonary adenocarcinomas appearing as subsolid nodules (SSNs) and increase the reproducibility in classifying SSNs. Two readers independently performed semi-automatic and manual measurements of the diameters of 102 SSNs and their solid portions. Diagnostic performance in predicting invasive adenocarcinoma based on diameters was tested using logistic regression analysis with subsequent receiver operating characteristic curves. Inter- and intrareader reproducibilities of diagnosis and SSN classification according to Fleischner's guidelines were investigated for each measurement method using Cohen's κ statistics. Semi-automatic effective diameter measurements were superior to manual average diameters for the diagnosis of invasive adenocarcinoma (AUC, 0.905-0.923 for semi-automatic measurement and 0.833-0.864 for manual measurement; p [...] automatic measurement (κ=0.924 for semi-automatic measurement and 0.690 for manual measurement, p=0.012). Inter-reader SSN classification reproducibility was significantly higher with semi-automatic measurement (κ=0.861 for semi-automatic measurement and 0.683 for manual measurement, p=0.022). Semi-automatic effective diameter measurement offers an opportunity to improve diagnostic accuracy and reproducibility as well as the classification reproducibility of SSNs. • Semi-automatic effective diameter measurement improves the diagnostic accuracy for pulmonary subsolid nodules. • Semi-automatic measurement increases the inter-reader agreement on the diagnosis for subsolid nodules. • Semi-automatic measurement augments the inter-reader reproducibility for the classification of subsolid nodules.
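
    Reader agreement in this abstract is reported as Cohen's κ. As a reminder of what that statistic measures, the following sketch computes κ for two readers from the observed agreement and the agreement expected by chance. The reader labels are invented for illustration and are unrelated to the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical calls by two readers on ten nodules.
reader_1 = ["invasive"] * 5 + ["non-invasive"] * 5
reader_2 = ["invasive"] * 4 + ["non-invasive"] * 5 + ["invasive"]
print(round(cohen_kappa(reader_1, reader_2), 3))
```

    Here the readers agree on 8 of 10 nodules, but half of that agreement would be expected by chance, so κ is lower than the raw 80% agreement.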

  10. Meta-analysis of the accuracy of tools used for binary classification when the primary studies employ different references.

    Science.gov (United States)

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2015-09-01

    The quality of tools used in binary classification is evaluated by studies that assess the accuracy of the classification. The empirical evidence is summarized in 2 × 2 contingency tables. These provide the joint frequencies between the true status of a sample and the classification made by the test. The accuracy of the test is better estimated in a meta-analysis that synthesizes the results of a set of primary studies. The true status is determined by a reference that ideally is a gold standard, which means that it is error free. However, in psychology, it is rare that all the primary studies have employed the same reference, and often they have used an imperfect reference with suboptimal accuracy instead of an actual gold standard. An imperfect reference biases both the estimates of the accuracy of the test and the empirical prevalence of the target status in the primary studies. We discuss several strategies for meta-analysis when different references are employed. Special attention is paid to the simplest case, where the meta-analyst has 1 group of primary studies using a reference that can be considered a gold standard and a 2nd group of primary studies using an imperfect reference. A procedure is recommended in which the frequencies from the primary studies with the imperfect reference are corrected prior to the meta-analysis itself. Then, a hierarchical meta-analytic model is fitted. An example with actual data from SCOFF (Sick-Control-One-Fat-Food; Hill, Reid, Morgan, & Lacey, 2010; Morgan, Reid, & Lacey, 1999), a simple but efficient test for detecting eating disorders, is described. (c) 2015 APA, all rights reserved.
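
    The 2 × 2 contingency table this abstract refers to yields the usual accuracy indices directly. The sketch below shows only that basic step for a single primary study; it does not implement the correction for an imperfect reference or the hierarchical model, which the abstract does not specify. The cell counts are hypothetical.

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Accuracy indices from the joint frequencies of true status and test classification."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among those with the target status
        "specificity": tn / (tn + fp),   # true negatives among those without it
        "accuracy": (tp + tn) / total,   # overall proportion classified correctly
        "prevalence": (tp + fn) / total  # empirical prevalence according to the reference
    }


# Hypothetical counts for one primary study.
indices = accuracy_from_2x2(tp=40, fp=10, fn=10, tn=140)
print(indices)
```

    If the reference itself misclassifies some cases, both the sensitivity/specificity estimates and the prevalence computed this way are biased, which is the problem the abstract addresses.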

  11. Sales Growth Rate Forecasting Using Improved PSO and SVM

    Directory of Open Access Journals (Sweden)

    Xibin Wang

    2014-01-01

    Accurate forecasting of the sales growth rate plays a decisive role in determining the amount of advertising investment. In this study, we present a method for sales growth rate forecasting based on preclassification followed by regression and optimized by improved particle swarm optimization (IPSO). We use a support vector machine (SVM) as the classification model. The nonlinear relationship in sales growth rate forecasting is efficiently represented by the SVM, while IPSO optimizes the training parameters of the SVM. IPSO addresses issues of traditional PSO, such as relapsing into local optima, slow convergence speed, and low convergence precision in the later stages of evolution. We performed two experiments. First, three classic benchmark functions are used to verify the validity of the IPSO algorithm against PSO. Having shown that IPSO outperforms PSO in convergence speed, precision, and escaping local optima, we apply IPSO to the proposed model in the second experiment. Sales growth rate forecasting cases are used to test the forecasting performance of the proposed model. According to the requirements and industry knowledge, the sample data were first classified to obtain the types of the test samples. Next, the values of the test samples were forecast using the SVM regression algorithm. The experimental results demonstrate that the proposed model has good forecasting performance.
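
    The abstract does not describe the IPSO modifications, so the sketch below shows only the traditional PSO baseline that IPSO is compared against, applied to the sphere function as a stand-in benchmark. The inertia weight, acceleration coefficients, swarm size and search bounds are illustrative assumptions, not the paper's settings.

```python
import random


def sphere(x):
    """Classic benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim, n_particles=20, iterations=200,
        w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Standard global-best PSO (assumed parameter values); returns (best position, best value)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-bound, bound) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_value = [objective(p) for p in positions]
    best_index = min(range(n_particles), key=personal_best_value.__getitem__)
    global_best = personal_best[best_index][:]
    global_best_value = personal_best_value[best_index]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward own best + pull toward the swarm's best.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            value = objective(positions[i])
            if value < personal_best_value[i]:
                personal_best[i] = positions[i][:]
                personal_best_value[i] = value
                if value < global_best_value:
                    global_best = positions[i][:]
                    global_best_value = value
    return global_best, global_best_value


best_position, best_value = pso(sphere, dim=2)
print(best_position, best_value)
```

    To tune an SVM in the way the abstract describes, the objective would instead map a candidate parameter vector to a validation error; that wiring is left out here because the abstract gives no details.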

  12. SVM-Based CAC System for B-Mode Kidney Ultrasound Images.

    Science.gov (United States)

    Subramanya, M B; Kumar, Vinod; Mukherjee, Shaktidev; Saini, Manju

    2015-08-01

    The present study proposes a computer-aided classification (CAC) system for three kidney classes, viz. normal, medical renal disease (MRD) and cyst using B-mode ultrasound images. Thirty-five B-mode kidney ultrasound images consisting of 11 normal images, 8 MRD images and 16 cyst images have been used. Regions of interest (ROIs) have been marked by the radiologist from the parenchyma region of the kidney in case of normal and MRD cases and from regions inside lesions for cyst cases. To evaluate the contribution of texture features extracted from de-speckled images for the classification task, original images have been pre-processed by eight de-speckling methods. Six categories of texture features are extracted. One-against-one multi-class support vector machine (SVM) classifier has been used for the present work. Based on overall classification accuracy (OCA), features from ROIs of original images are concatenated with the features from ROIs of pre-processed images. On the basis of OCA, few feature sets are considered for feature selection. Differential evolution feature selection (DEFS) has been used to select optimal features for the classification task. DEFS process is repeated 30 times to obtain 30 subsets. Run-length matrix features from ROIs of images pre-processed by Lee's sigma concatenated with that of enhanced Lee method have resulted in an average accuracy (in %) and standard deviation of 86.3 ± 1.6. The results obtained in the study indicate that the performance of the proposed CAC system is promising, and it can be used by the radiologists in routine clinical practice for the classification of renal diseases.
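
    The one-against-one scheme named in this abstract trains one binary classifier per pair of classes and lets them vote. The sketch below shows only the voting logic for the three kidney classes; the pairwise decision function is a nearest-centroid stand-in on a single made-up feature, not an SVM, and the centroid values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(sample, classes, pairwise_decide):
    """Run every pairwise classifier on the sample and return the class with the most votes."""
    votes = Counter()
    for class_a, class_b in combinations(classes, 2):
        votes[pairwise_decide(class_a, class_b, sample)] += 1
    return votes.most_common(1)[0][0]


# Stand-in for trained pairwise SVMs: pick the class whose (hypothetical) centroid is closer.
CENTROIDS = {"normal": 0.2, "MRD": 0.5, "cyst": 0.9}


def centroid_decide(class_a, class_b, sample):
    dist_a = abs(sample - CENTROIDS[class_a])
    dist_b = abs(sample - CENTROIDS[class_b])
    return class_a if dist_a <= dist_b else class_b


classes = ["normal", "MRD", "cyst"]
print(one_vs_one_predict(0.85, classes, centroid_decide))
```

    With three classes there are three pairwise classifiers; in general k classes need k(k-1)/2 of them.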

  13. Image Segmentation Using Support Vector Machine (SVM) and Ellipsoid Region Search Strategy (ERSS) Arimoto Entropy Based on Color and Texture Features

    Directory of Open Access Journals (Sweden)

    Lukman Hakim

    2016-02-01

    Abstract (translated from Indonesian): Image segmentation is an important method in digital image processing that aims to divide an image into several homogeneous regions based on certain similarity criteria. One of the main requirements of an image segmentation method is that it produce an optimal boundary image. To meet this requirement, a segmentation method needs an image pixel classification that can separate pixels linearly and non-linearly. In this study, the authors propose an image segmentation method using SVM and ERSS-based Arimoto entropy that is robust to noise and has low complexity, in order to produce an optimal boundary image. First, color features are extracted using local homogeneity and texture features using the Gray Level Co-occurrence Matrix (GLCM), yielding several features. Second, labeling is performed with ERSS-based Arimoto entropy, and the labels are used as classes in the classification. Third, the extracted features are classified according to the labels using the trained SVM. The experiments show that the segmentation results are less than optimal, with an accuracy of 69%. Feature reduction is needed to produce well-segmented images. Keywords: image segmentation, support vector machine, ERSS Arimoto entropy, feature extraction. Abstract: Image segmentation is an important tool in image processing that divides an image into homogeneous regions based on certain similarity criteria, which ideally should be meaningful for a certain purpose. An optimal boundary is one of the main criteria that an image segmentation method should have. An image segmentation method needs a classification method that can partition pixels linearly or non-linearly. We propose a color image segmentation method using Support Vector Machine (SVM) classification and ERSS Arimoto entropy thresholding to get an optimal boundary of the segmented image that is noise-free and of low complexity

  14. Application of ANFIS and SVM Systems in Order to Estimate Monthly Reference Crop Evapotranspiration in the Northwest of Iran

    Directory of Open Access Journals (Sweden)

    F. Ahmadi

    2016-10-01

    Introduction: Crop evapotranspiration is mainly modeled with empirical, aerodynamic and energy balance methods. In these methods, evapotranspiration is calculated from the average values of meteorological parameters at different time steps. Linear models have not performed well in this field because of the high variability of evapotranspiration, so researchers have turned to nonlinear and intelligent models. Accurate estimation of this hydrologic variable requires much time and money to measure many data (19). Materials and Methods: Recently, new hybrid methods have been developed by combining methods such as artificial neural networks, fuzzy logic and evolutionary computation; these are called Soft Computing and Intelligent Systems. These soft techniques are used in various fields of engineering. A neuro-fuzzy system is a hybrid system that combines the decision ability of fuzzy logic with the computational ability of a neural network, which provides a high capability for modeling and estimation. Basically, the fuzzy part classifies the input data set and determines the degree of membership (a number between 0 and 1), and decisions for the next activity are made based on a set of rules before moving to the next stage. Adaptive Neuro-Fuzzy Inference Systems (ANFIS) include some parts of a typical fuzzy expert system, in which the calculations at each step are performed by the hidden-layer neurons and the learning ability of the neural network is used to increase the system's information (9). SVM is one of the supervised learning methods used for classification and regression. This method was developed by Vapnik (15) based on statistical learning theory. The SVM is a method for binary classification in an arbitrary feature space, so it is suitable for prediction problems (12). The SVM is originally a two-class classifier that separates the classes

  15. Classification of Autism Spectrum Disorder Using Random Support Vector Machine Cluster

    Directory of Open Access Journals (Sweden)

    Xia-an Bi

    2018-02-01

    Autism spectrum disorder (ASD) is a neurodevelopmental disorder that manifests mainly as communication and language barriers and difficulties in social interaction. Most studies have used machine learning methods to classify patients and normal controls, among which support vector machines (SVM) are widely employed. However, the classification accuracy is usually low when a single SVM is used as the classifier. Thus, we used multiple SVMs to classify ASD patients and typical controls (TC). Resting-state functional magnetic resonance imaging (fMRI) data of 46 TC and 61 ASD patients were obtained from the Autism Brain Imaging Data Exchange (ABIDE) database. Only 84 of the 107 subjects were used in the experiments because the translation or rotation of 7 TC and 16 ASD patients exceeded ±2 mm or ±2°. The random SVM cluster was then proposed to distinguish TC from ASD. The results show that this method has excellent classification performance based on all the features. Furthermore, the accuracy based on the optimal feature set reached 96.15%. Abnormal brain regions were also found, such as the inferior frontal gyrus (IFG) (orbital and opercular parts), hippocampus, and precuneus. These results indicate that the random SVM cluster method may be applicable to the auxiliary diagnosis of ASD.

  16. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    Science.gov (United States)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care

  17. Predictive Validity and Classification Accuracy of ActiGraph Energy Expenditure Equations and Cut-Points in Young Children

    OpenAIRE

    Janssen, Xanne; Cliff, Dylan P.; Reilly, John J.; Hinkley, Trina; Jones, Rachel A.; Batterham, Marijka; Ekelund, Ulf; Brage, Søren; Okely, Anthony D.

    2013-01-01

    © 2013 Janssen et al. Objectives: Evaluate the predictive validity of ActiGraph energy expenditure equations and the classification accuracy of physical activity intensity cut-points in preschoolers. Methods: Forty children aged 4–6 years (5.3±1.0 years) completed a ~150-min room calorimeter protocol involving age-appropriate sedentary, light and moderate- to vigorous-intensity physical activities. Children wore an ActiGraph GT3X on the right mid-axillary line of the hip. Energy expendi...

  18. Extreme learning machine-based classification of ADHD using brain structural MRI data.

    Directory of Open Access Journals (Sweden)

    Xiaolong Peng

    BACKGROUND: Effective and accurate diagnosis of attention-deficit/hyperactivity disorder (ADHD) is currently of significant interest. ADHD has been associated with multiple cortical features from structural MRI data. However, most existing learning algorithms for ADHD identification contain obvious defects, such as time-consuming training, parameters selection, etc. The aims of this study were as follows: (1) Propose an ADHD classification model using the extreme learning machine (ELM) algorithm for automatic, efficient and objective clinical ADHD diagnosis. (2) Assess the computational efficiency and the effect of sample size on both ELM and support vector machine (SVM) methods and analyze which brain segments are involved in ADHD. METHODS: High-resolution three-dimensional MR images were acquired from 55 ADHD subjects and 55 healthy controls. Multiple brain measures (cortical thickness, etc.) were calculated using a fully automated procedure in the FreeSurfer software package. In total, 340 cortical features were automatically extracted from 68 brain segments with 5 basic cortical features. F-score and SFS methods were adopted to select the optimal features for ADHD classification. Both ELM and SVM were evaluated for classification accuracy using leave-one-out cross-validation. RESULTS: We achieved ADHD prediction accuracies of 90.18% for ELM using eleven combined features, 84.73% for SVM-Linear and 86.55% for SVM-RBF. Our results show that ELM has better computational efficiency and is more robust as sample size changes than is SVM for ADHD classification. The most pronounced differences between ADHD and healthy subjects were observed in the frontal lobe, temporal lobe, occipital lobe and insular. CONCLUSION: Our ELM-based algorithm for ADHD diagnosis performs considerably better than the traditional SVM algorithm. This result suggests that ELM may be used for the clinical diagnosis of ADHD and the investigation of different brain diseases.

  19. Extreme learning machine-based classification of ADHD using brain structural MRI data.

    Science.gov (United States)

    Peng, Xiaolong; Lin, Pan; Zhang, Tongsheng; Wang, Jue

    2013-01-01

    Effective and accurate diagnosis of attention-deficit/hyperactivity disorder (ADHD) is currently of significant interest. ADHD has been associated with multiple cortical features from structural MRI data. However, most existing learning algorithms for ADHD identification contain obvious defects, such as time-consuming training, parameters selection, etc. The aims of this study were as follows: (1) Propose an ADHD classification model using the extreme learning machine (ELM) algorithm for automatic, efficient and objective clinical ADHD diagnosis. (2) Assess the computational efficiency and the effect of sample size on both ELM and support vector machine (SVM) methods and analyze which brain segments are involved in ADHD. High-resolution three-dimensional MR images were acquired from 55 ADHD subjects and 55 healthy controls. Multiple brain measures (cortical thickness, etc.) were calculated using a fully automated procedure in the FreeSurfer software package. In total, 340 cortical features were automatically extracted from 68 brain segments with 5 basic cortical features. F-score and SFS methods were adopted to select the optimal features for ADHD classification. Both ELM and SVM were evaluated for classification accuracy using leave-one-out cross-validation. We achieved ADHD prediction accuracies of 90.18% for ELM using eleven combined features, 84.73% for SVM-Linear and 86.55% for SVM-RBF. Our results show that ELM has better computational efficiency and is more robust as sample size changes than is SVM for ADHD classification. The most pronounced differences between ADHD and healthy subjects were observed in the frontal lobe, temporal lobe, occipital lobe and insular. Our ELM-based algorithm for ADHD diagnosis performs considerably better than the traditional SVM algorithm. This result suggests that ELM may be used for the clinical diagnosis of ADHD and the investigation of different brain diseases.
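
    Leave-one-out cross-validation, the evaluation protocol named in this abstract, holds out each subject once and trains on all the others. The sketch below illustrates the protocol only; a nearest-centroid rule stands in for the ELM and SVM classifiers, and the two-feature subjects are made up.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier: assign x to the class whose mean feature vector is closest."""
    sums = {}
    for features, label in zip(train_x, train_y):
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([t + f for t, f in zip(total, features)], count + 1)

    def squared_distance(label):
        total, count = sums[label]
        return sum((t / count - v) ** 2 for t, v in zip(total, x))

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, and report the fraction predicted correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        correct += predict(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


# Hypothetical subjects described by two cortical features each.
xs = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
ys = ["ADHD", "ADHD", "ADHD", "control", "control", "control"]
print(leave_one_out_accuracy(xs, ys, nearest_centroid_predict))
```

    With 110 subjects, as in the study, this loop would train 110 models, which is why the training time of each classifier matters for the comparison.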

  20. Improving ECG classification accuracy using an ensemble of neural network modules.

    Directory of Open Access Journals (Sweden)

    Mehrdad Javadi

    This paper illustrates the use of a combined neural network model based on the Stacked Generalization method for classification of electrocardiogram (ECG) beats. In the conventional Stacked Generalization method, the combiner learns to map the base classifiers' outputs to the target data. We claim that adding the input pattern to the base classifiers' outputs helps the combiner obtain knowledge about the input space and, as a result, perform better on the same task. Experimental results support our claim that the additional knowledge about the input space improves the performance of the proposed method, which is called Modified Stacked Generalization. In particular, for classification of 14966 ECG beats that were not previously seen during the training phase, the Modified Stacked Generalization method reduced the error rate by 12.41% in comparison with the best of ten popular classifier fusion methods including Max, Min, Average, Product, Majority Voting, Borda Count, Decision Templates, Weighted Averaging based on Particle Swarm Optimization and Stacked Generalization.
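
    The difference between the conventional and the modified scheme lies in what the combiner receives as input. The sketch below shows only that difference; the function names, the base classifier outputs and the input pattern are hypothetical, and no combiner is trained.

```python
def stacked_input(base_outputs):
    """Conventional stacking: the combiner sees only the base classifiers' outputs."""
    return [p for output in base_outputs for p in output]


def modified_stacked_input(base_outputs, input_pattern):
    """Modified stacking as described in the abstract: base outputs plus the original input pattern."""
    return stacked_input(base_outputs) + list(input_pattern)


# Hypothetical class-probability outputs of two base classifiers for one ECG beat,
# and a made-up three-value feature vector describing that beat.
base_outputs = [[0.7, 0.3], [0.6, 0.4]]
input_pattern = [0.12, -0.05, 0.33]

print(stacked_input(base_outputs))
print(modified_stacked_input(base_outputs, input_pattern))
```

    The combiner's input grows by the dimensionality of the input pattern, which is the price paid for letting it see where in the input space each beat lies.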

  1. Diagnostic accuracy of CSF neurofilament light chain protein in the biomarker-guided classification system for Alzheimer's disease.

    Science.gov (United States)

    Lista, Simone; Toschi, Nicola; Baldacci, Filippo; Zetterberg, Henrik; Blennow, Kaj; Kilimann, Ingo; Teipel, Stefan J; Cavedo, Enrica; Dos Santos, Antonio Melo; Epelbaum, Stéphane; Lamari, Foudil; Dubois, Bruno; Floris, Roberto; Garaci, Francesco; Hampel, Harald

    2017-09-01

    We assessed the diagnostic accuracy of cerebrospinal fluid (CSF) neurofilament light chain (NFL) protein in the classification of patients with Alzheimer's disease (AD) and cognitively healthy control individuals (HCs) and patients with frontotemporal dementia (FTD) as comparisons. Particularly, we tested the performance of CSF NFL concentration in differentiating patient groups stratified by fluid biomarker profiles, independently of the severity of cognitive impairment (mild cognitive impairment (MCI) and AD dementia individuals), using a biomarker-guided descriptive classification system for AD. CSF NFL concentrations were examined in a multicenter cross-sectional study of 108 participants stratified in AD pathophysiology-negative (both CSF tau and the 42-amino acid-long amyloid-beta (Aβ) peptide (Aβ1-42)) (n = 15), tau pathology-positive only (n = 15), Aβ pathology-positive only (n = 13), AD pathophysiology-positive (n = 33), FTD (n = 9) patients, and HCs (n = 23), according to the biomarker-based classification system. The performance of CSF NFL in discriminating AD pathophysiology-positive patients from HCs is fair, whereas the ability in differentiating tau-positive patients from HCs is poor. The classificatory performance in distinguishing AD pathophysiology-positive patients from FTD is unsatisfactory. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. The effects of MMPI-A T-score elevation on classification accuracy for normal and clinical adolescent samples.

    Science.gov (United States)

    Fontaine, J L; Archer, R P; Elkins, D E; Johansen, J

    2001-04-01

    In this investigation we examined the ability of the Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A; Butcher et al., 1992) to classify accurately both clinical and normal adolescents using 2 different T-score elevation ranges, T ≥ 60 and T ≥ 65, and using 2 different clinical base rates for the occurrence of psychopathology. Clinical base rates of 50% and 20%, respectively, were created by comparing a clinical sample of 203 adolescent inpatients with co-occurring substance abuse and psychiatric disorders with 2 subsamples from the MMPI-A normative group. These subsamples consisted of 203 adolescents matched for sex and age, and a larger subsample of 1,015 adolescents proportionately matched for sex and age, with the clinical group. Classification accuracy analyses revealed that although clinical base rate did affect the accurate classification of cases, a T-score cutoff of 65 resulted in higher levels of accurate classification overall while minimizing the misclassification of both clinical and normal cases. Implications of these findings for the recommended use of the MMPI-A "gray zone" are presented, and the relative areas of strength and weakness of the MMPI-A are reviewed in the identification and description of psychopathology.
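
    Why the clinical base rate affects classification accuracy can be seen from a short calculation. The sketch below combines a cutoff's sensitivity and specificity with a base rate to get the overall hit rate and the positive predictive value. The sensitivity and specificity values are invented for illustration and are not taken from the study; only the 50% and 20% base rates come from the abstract.

```python
def overall_hit_rate(sensitivity, specificity, base_rate):
    """Proportion of all cases classified correctly at a given base rate."""
    return sensitivity * base_rate + specificity * (1 - base_rate)


def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a case flagged by the cutoff truly belongs to the clinical group."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point of a T-score cutoff.
SENSITIVITY, SPECIFICITY = 0.7, 0.9

for base_rate in (0.5, 0.2):
    print(base_rate,
          round(overall_hit_rate(SENSITIVITY, SPECIFICITY, base_rate), 3),
          round(positive_predictive_value(SENSITIVITY, SPECIFICITY, base_rate), 3))
```

    With the same cutoff, lowering the base rate raises the overall hit rate when specificity exceeds sensitivity, while the positive predictive value falls.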

  3. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    Science.gov (United States)

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these

  4. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli.

    Science.gov (United States)

    Mandelkow, Hendrik; de Zwart, Jacco A; Duyn, Jeff H

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these

  5. Linear Discriminant Analysis achieves high classification accuracy for the BOLD fMRI response to naturalistic movie stimuli.

    Directory of Open Access Journals (Sweden)

    Hendrik Mandelkow

    2016-03-01

    Full Text Available Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbour (NN), Gaussian Naïve Bayes (GNB), and (regularised) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularised by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these

  6. Overview of existing algorithms for emotion classification. Uncertainties in evaluations of accuracies.

    Science.gov (United States)

    Avetisyan, H.; Bruna, O.; Holub, J.

    2016-11-01

    Numerous techniques and algorithms are dedicated to extracting emotions from input data. Our investigation found that emotion-detection approaches can be classified into the following three types: keyword-based/lexical-based, learning-based, and hybrid. The most commonly used techniques, such as the keyword-spotting method, Support Vector Machines, the Naïve Bayes classifier, Hidden Markov Models and hybrid algorithms, have impressive results in this sphere and can reach a detection accuracy of more than 90%.

  7. Hybrid Optimization of Object-Based Classification in High-Resolution Images Using Continous ANT Colony Algorithm with Emphasis on Building Detection

    Science.gov (United States)

    Tamimi, E.; Ebadi, H.; Kiani, A.

    2017-09-01

    Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features can improve accuracy. However, adding these features increases the probability that dependent features are present, which reduces accuracy. In addition, some parameters must be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. An optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges, such as producing salt-and-pepper results and high computational time in high-dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence of image scene and type, reduced post-processing for building edge reconstruction, and improved accuracy. The proposed method was compared with pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the proposed method improved the quality factor and overall accuracy by 17% and 10%, respectively. In addition, the proposed method improved the Kappa coefficient by 6% compared with RF classification. The processing time of the proposed method was relatively low because the unit of image analysis is the image object. These results show the superiority of the proposed method in terms of time and accuracy.
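
    The overall accuracy and Kappa coefficient named in this abstract are both derived from a confusion matrix. The sketch below shows the standard definitions only; the function name and the 2x2 matrix are made up for illustration and are not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical building / non-building counts, for illustration only.
toy_confusion = [[40, 10],
                 [5, 45]]
oa, kappa = overall_accuracy_and_kappa(toy_confusion)
```

    Kappa discounts the agreement expected by chance, which is why it is usually lower than the overall accuracy computed from the same matrix.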

  8. Large margin classification with indefinite similarities

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2016-01-07

    Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer condition is not satisfied, or the Mercer condition is difficult to verify. Examples of such indefinite similarities in machine learning applications are ample including, for instance, the BLAST similarity score between protein sequences, human-judged similarities between concepts and words, and the tangent distance or the shape matching distance in computer vision. Nevertheless, previous works on classification with indefinite similarities are not fully satisfactory. They have either introduced sources of inconsistency in handling past and future examples using kernel approximation, settled for local-minimum solutions using non-convex optimization, or produced non-sparse solutions by learning in Krein spaces. Despite the large volume of research devoted to this subject lately, we demonstrate in this paper how an old idea, namely the 1-norm support vector machine (SVM) proposed more than 15 years ago, has several advantages over more recent work. In particular, the 1-norm SVM method is conceptually simpler, which makes it easier to implement and maintain. It is competitive with, if not superior to, all other methods in terms of predictive accuracy. Moreover, it produces solutions that are often sparser than more recent methods by several orders of magnitude. In addition, we provide various theoretical justifications by relating 1-norm SVM to well-established learning algorithms such as neural networks, SVM, and nearest neighbor classifiers. Finally, we conduct a thorough experimental evaluation, which reveals that the evidence in favor of 1-norm SVM is statistically significant.

  9. Human Walking Pattern Recognition Based on KPCA and SVM with Ground Reflex Pressure Signal

    Directory of Open Access Journals (Sweden)

    Zhaoqin Peng

    2013-01-01

    Full Text Available Algorithms based on the ground reflex pressure (GRF) signal obtained from a pair of sensing shoes for human walking pattern recognition were investigated. The dimensionality reduction algorithms based on principal component analysis (PCA) and kernel principal component analysis (KPCA) for walking pattern data compression were studied in order to obtain higher recognition speed. Classifiers based on support vector machine (SVM), SVM-PCA, and SVM-KPCA were designed, and the classification performances of these three kinds of algorithms were compared using data collected from a person who was wearing the sensing shoes. Experimental results showed that the algorithm fusing SVM and KPCA had better recognition performance than the other two methods. Experimental outcomes also confirmed that the sensing shoes developed in this paper can be employed for automatically recognizing human walking patterns in unrestricted environments, which demonstrates their potential application in the control of exoskeleton robots.

  10. Multiclass Classification for the Differential Diagnosis on the ADHD Subtypes Using Recursive Feature Elimination and Hierarchical Extreme Learning Machine: Structural MRI Study.

    Directory of Open Access Journals (Sweden)

    Muhammad Naveed Iqbal Qureshi

    Full Text Available The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.

  11. Multiclass Classification for the Differential Diagnosis on the ADHD Subtypes Using Recursive Feature Elimination and Hierarchical Extreme Learning Machine: Structural MRI Study.

    Science.gov (United States)

    Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom

    2016-01-01

    The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.

  12. Rapid classification of Chinese quince (Chaenomeles speciosa Nakai) fruit provenance by near-infrared spectroscopy and multivariate calibration.

    Science.gov (United States)

    Shao, Wenhao; Li, Yanjie; Diao, Songfeng; Jiang, Jingmin; Dong, Ruxiang

    2017-01-01

    The quality of Chinese quince fruit is a significant factor for medicinal materials, influencing the quality of the medicine. However, it is difficult to distinguish different types of Chinese quince fruit. The main objective of this work was to use near-infrared (NIR) spectroscopy, which is a rapid and non-destructive analysis method, to classify the varieties of Chinese quince fruits. Raw spectra in the range of 1000 to 2500 nm were combined with linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machines (SVMs) for classification. The first three principal component analysis (PCA) scores were used as input variables to build LDA, QDA, and SVM discriminant models. The results indicate that all three of these methods are effective for distinguishing the different types of Chinese quince fruit. The classification accuracies for LDA, QDA, and SVM are 94%, 96%, and 98%, respectively. QDA led to high-level classification accuracy of Chinese quince fruit.

  13. An SVM-Based Classifier for Estimating the State of Various Rotating Components in Agro-Industrial Machinery with a Vibration Signal Acquired from a Single Point on the Machine Chassis

    Directory of Open Access Journals (Sweden)

    Ruben Ruiz-Gonzalez

    2014-11-01

    Full Text Available The goal of this article is to assess the feasibility of estimating the state of various rotating components in agro-industrial machinery by employing just one vibration signal acquired from a single point on the machine chassis. To do so, a Support Vector Machine (SVM)-based system is employed. Experimental tests evaluated this system by acquiring vibration data from a single point of an agricultural harvester, while varying several of its working conditions. The whole process included two major steps. Initially, the vibration data were preprocessed through twelve feature extraction algorithms, after which the Exhaustive Search method selected the most suitable features. Secondly, the SVM-based system accuracy was evaluated by using Leave-One-Out cross-validation, with the selected features as the input data. The results of this study provide evidence that (i) accurate estimation of the status of various rotating components in agro-industrial machinery is possible by processing the vibration signal acquired from a single point on the machine structure; (ii) the vibration signal can be acquired with a uniaxial accelerometer, the orientation of which does not significantly affect the classification accuracy; and (iii) when using an SVM classifier, an 85% mean cross-validation accuracy can be reached, which only requires a maximum of seven features as its input, and no significant improvements are noted between the use of either nonlinear or linear kernels.
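
    The Leave-One-Out cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a nearest-centroid rule stands in for the SVM so that the example needs only the standard library, and the function names, feature values, and labels are invented rather than taken from the study.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (not an SVM): pick the class with the closest centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, return the fraction of hits."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


# Made-up two-feature vibration descriptors for two machine states.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
y = ["ok", "ok", "ok", "fault", "fault", "fault"]
loo_acc = leave_one_out_accuracy(X, y)
```

    Any classifier with the same `predict(train_X, train_y, x)` shape could be passed in place of the stand-in; with n samples the loop trains n models, which is affordable only for small data sets.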

  14. Tradeoff between User Experience and BCI Classification Accuracy with Frequency Modulated Steady-State Visual Evoked Potentials.

    Science.gov (United States)

    Dreyer, Alexander M; Herrmann, Christoph S; Rieger, Jochem W

    2017-01-01

    Steady-state visual evoked potentials (SSVEPs) have been widely employed for the control of brain-computer interfaces (BCIs) because they are very robust, lead to high performance, and allow for a high number of commands. However, such flickering stimuli often also cause user discomfort and fatigue, especially when several light sources are used simultaneously. Different variations of SSVEP driving signals have been proposed to increase user comfort. Here, we investigate the suitability of frequency modulation of a high frequency carrier for SSVEP-BCIs. We compared BCI performance and user experience between frequency modulated (FM) and traditional sinusoidal (SIN) SSVEPs in an offline classification paradigm with four independently flickering light-emitting diodes which were overtly attended (fixated). While classification performance was slightly reduced with the FM stimuli, the user comfort was significantly increased. Comparing the SSVEPs for covert attention to the stimuli (without fixation) was not possible, as no reliable SSVEPs were evoked. Our results reveal that several, simultaneously flickering, light emitting diodes can be used to generate FM-SSVEPs with different frequencies and the resulting occipital electroencephalography (EEG) signals can be classified with high accuracy. While the performance we report could be further improved with adjusted stimuli and algorithms, we argue that the increased comfort is an important result and suggest the use of FM stimuli for future SSVEP-BCI applications.

  15. Accuracy of Emergency Severity Index, Version 4 in emergency room patients’ classification

    Directory of Open Access Journals (Sweden)

    Samad EJ Golzari

    2014-02-01

    Full Text Available Introduction: Emergency Severity Index Version 4 (ESI v.4) is a validated triage tool for emergency departments, with an easy training system optimizing the allocation of limited resources to emergency patients. The present study aimed to determine the outcomes of triage with the ESI v.4 method in all five levels of patient triage in emergency departments. Methods: In this retrospective observational-descriptive study, following the training courses and implementation of triage with the ESI v.4 method, the third quarter of 2008 was randomly selected for study. In this period, all patient files with their codes ending in zero were selected, equaling one-tenth of all files. Triage levels and outcomes were extracted and the obtained data from 1309 were expressed using descriptive statistics. Results: The mean age of the patients was 40.73 ± 21.37 years and 59.4% of the subjects were males. Classification of patients by ESI v.4 level was as follows: 1 (4.0%), 2 (11.6%), 3 (52.8%), 4 (25.5%) and 5 (6.1%). Hospitalization rate by ESI v.4 level was as follows: 1 (80.76%), 2 (23.68%), 3 (25.75%), 4 (11.76%) and 5 (14.5%). Conclusion: The rate of hospitalization decreased from ESI level 1 to ESI level 5. Although the findings of this study were in line with the previous reports, some discrepancies indicated the existing inaccuracy in the out-patient hospitalization system in the evening and night shifts and also at stage 5 triage level.

  16. CLASSIFICATION OF SKIN AUTOFLUORESCENCE SPECTRUM USING SUPPORT VECTOR MACHINE IN TYPE 2 DIABETES SCREENING

    Directory of Open Access Journals (Sweden)

    YUANZHI ZHANG

    2013-10-01

    Full Text Available Advanced glycation end products (AGEs) are a complex and heterogeneous group of compounds that have been implicated in diabetes-related complications. Skin autofluorescence was recently introduced as an alternative tool for assessing skin AGE accumulation in diabetes. Successful optical diagnosis of diabetes requires a rapid and accurate classification algorithm. In order to improve the performance of noninvasive optical diagnosis of type 2 diabetes, a support vector machine (SVM) algorithm was implemented for the classification of skin autofluorescence from diabetics and control subjects. Cross-validation and grid-optimization methods were employed to calculate the optimal parameters that maximize classification accuracy. The classification model was set up according to the training set and then verified by the testing set. The results show that the radial basis function is the best choice among the four common kernels in SVM. Moreover, a diagnostic accuracy of 82.61%, a sensitivity of 69.57%, and a specificity of 95.65% for discriminating diabetics from control subjects were achieved using a mixed kernel function, which is based on the linear kernel function and the radial basis function. In comparison with fasting plasma glucose and HbA1c tests, the classification method of skin autofluorescence spectrum based on SVM shows great potential in screening of diabetes.
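
    Accuracy, sensitivity, and specificity as reported here all follow from the four cells of a binary confusion matrix. The sketch below uses hypothetical counts: the abstract does not state the group sizes, and these numbers were chosen only because they happen to reproduce the reported percentages, so they should not be read as the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical test set: 23 diabetics (16 detected) and 23 controls (22 cleared).
metrics = binary_metrics(tp=16, fn=7, tn=22, fp=1)
percent = {name: round(100 * value, 2) for name, value in metrics.items()}
```

    With balanced groups, accuracy is the mean of sensitivity and specificity, which is consistent with the three figures quoted in the abstract.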

  17. Accuracy of classification of invasive lobular carcinoma on needle core biopsy of the breast.

    Science.gov (United States)

    Naidoo, Kalnisha; Beardsley, Brooke; Carder, Pauline J; Deb, Rahul; Fish, David; Girling, Anne; Hales, Sally; Howe, Miles; Wastall, Laura M; Lane, Sally; Lee, Andrew H S; Philippidou, Marianna; Quinn, Cecily; Stephenson, Tim; Pinder, Sarah E

    2016-12-01

    Although the UK National Institute for Health and Care Excellence guidelines recommend that in patients with biopsy-proven invasive lobular carcinoma (ILC), preoperative MRI scan is considered, the accuracy of diagnosis of ILC in core biopsy of the breast has not been previously investigated. Eleven pathology laboratories from the UK and Ireland submitted data on 1112 cases interpreted as showing features of ILC, or mixed ILC and IDC/no special type (NST)/other tumour type, on needle core biopsy through retrieval of histology reports. Of the total 1112 cases, 844 were shown to be pure ILC on surgical excision, 154 were mixed ILC plus another type (invariably ductal/NST) and 113 were shown to be ductal/NST. Of those lesions categorised as pure ILC on core, 93% had an element of ILC correctly identified in the core biopsy sample and could be considered concordant. Of cores diagnosed as mixed ILC plus another type, complete agreement between core and excision was 46%, with 27% being cases of pure ILC, whilst 26% were non-concordant. These data indicate that there is not a large excess of expensive MRIs being performed as a result of histological miscategorisation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  18. One-Dimensional Convolutional Neural Network Land-Cover Classification of Multi-Seasonal Hyperspectral Imagery in the San Francisco Bay Area, California

    Directory of Open Access Journals (Sweden)

    Daniel Guidici

    2017-06-01

    Full Text Available In this study, a 1-D Convolutional Neural Network (CNN) architecture was developed, trained and utilized to classify single (summer) and three seasons (spring, summer, fall) of hyperspectral imagery over the San Francisco Bay Area, California for the year 2015. For comparison, the Random Forests (RF) and Support Vector Machine (SVM) classifiers were trained and tested with the same data. In order to support space-based hyperspectral applications, all analyses were performed with simulated Hyperspectral Infrared Imager (HyspIRI) imagery. Three-season data improved classifier overall accuracy by 2.0% (SVM), 1.9% (CNN) to 3.5% (RF) over single-season data. The three-season CNN provided an overall classification accuracy of 89.9%, which was comparable to overall accuracy of 89.5% for SVM. Both three-season CNN and SVM outperformed RF by over 7% overall accuracy. Analysis and visualization of the inner products for the CNN provided insight to distinctive features within the spectral-temporal domain. A method for CNN kernel tuning was presented to assess the importance of learned features. We concluded that CNN is a promising candidate for hyperspectral remote sensing applications because of the high classification accuracy and interpretability of its inner products.

  19. One-Class Classification of Airborne LiDAR Data in Urban Areas Using a Presence and Background Learning Algorithm

    Directory of Open Access Journals (Sweden)

    Zurui Ao

    2017-09-01

    Full Text Available Automatic classification of light detection and ranging (LiDAR) data in urban areas is of great importance for many applications such as generating three-dimensional (3D) building models and monitoring power lines. Traditional supervised classification methods require training samples of all classes to construct a reliable classifier. However, complete training samples are normally hard and costly to collect, and a common circumstance is that only training samples for a class of interest are available, in which traditional supervised classification methods may be inappropriate. In this study, we investigated the possibility of using a novel one-class classification algorithm, i.e., the presence and background learning (PBL) algorithm, to classify LiDAR data in an urban scenario. The results demonstrated that the PBL algorithm implemented by back propagation (BP) neural network (PBL-BP) could effectively classify a single class (e.g., building, tree, terrain, power line, and others) from airborne LiDAR point cloud with very high accuracy. The mean F-score for all of the classes from the PBL-BP classification results was 0.94, which was higher than those from one-class support vector machine (SVM), biased SVM, and maximum entropy methods (0.68, 0.82 and 0.93, respectively). Moreover, the PBL-BP algorithm yielded a comparable overall accuracy to the multi-class SVM method. Therefore, this method is very promising in the classification of the LiDAR point cloud.
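
    A mean F-score across classes, as quoted in this abstract, can be computed by treating each class in turn as the single class of interest. The sketch below assumes the usual F1 definition (harmonic mean of precision and recall); the abstract does not say which F-score variant was used, and the per-class counts are invented.

```python
from statistics import mean


def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts per class of interest, not from the study.
counts = {"building": (90, 10, 10), "tree": (80, 20, 0)}
per_class = {name: f_score(*c) for name, c in counts.items()}
mean_f = mean(per_class.values())
```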

  20. Static Voltage Stability Analysis by Using SVM and Neural Network

    Directory of Open Access Journals (Sweden)

    Mehdi Hajian

    2013-01-01

    Full Text Available Voltage stability is an important problem in power system networks. In this paper, the application of Neural Networks (NN) and Support Vector Machines (SVM) to estimating the voltage stability margin (VSM) and predicting voltage collapse, in terms of static voltage stability, is investigated. The paper considers voltage stability in power systems in two parts. The first part calculates the static voltage stability margin with a Radial Basis Function Neural Network (RBFNN). The advantage of this method is its high accuracy in online detection of the VSM. In the second part, voltage collapse analysis of the power system is performed with a Probabilistic Neural Network (PNN) and SVM. The results indicate that the training time and the number of training samples required by the SVM are lower than for the NN. A new model of training samples for the detection system, using the normal distribution load curve at each load feeder, is used. Voltage stability is estimated by the well-known L and VSM indices. To demonstrate the validity of the proposed methods, the IEEE 14-bus grid and the actual network of Yazd Province are used.

  1. FUSION OF NON-THERMAL AND THERMAL SATELLITE IMAGES BY BOOSTED SVM CLASSIFIERS FOR CLOUD DETECTION

    Directory of Open Access Journals (Sweden)

    N. Ghasemian

    2017-09-01

    Full Text Available The goal of ensemble learning methods like Bagging and Boosting is to improve the classification results of some weak classifiers gradually. Usually, Boosting algorithms show better results than Bagging. In this article, we have examined the possibility of fusion of non-thermal and thermal bands of Landsat 8 satellite images for cloud detection by using the boosting method. We used SVM as a base learner and the performance of two kinds of Boosting methods including AdaBoost.M1 and σ Boost was compared on remote sensing images of Landsat 8 satellite. We first extracted the co-occurrence matrix features of non-thermal and thermal bands separately and then used PCA method for feature selection. In the next step AdaBoost.M1 and σ Boost algorithms were applied on non-thermal and thermal bands and finally, the classifiers were fused using majority voting. Also, we showed that by changing the regularization parameter (C) the result of σ Boost algorithm can significantly change and achieve overall accuracy and cloud producer accuracy of 74%, and 0.53 kappa coefficient that shows better results in comparison to AdaBoost.M1.

  2. Fusion of Non-Thermal and Thermal Satellite Images by Boosted SVM Classifiers for Cloud Detection

    Science.gov (United States)

    Ghasemian, N.; Akhoondzadeh, M.

    2017-09-01

    The goal of ensemble learning methods like Bagging and Boosting is to improve the classification results of some weak classifiers gradually. Usually, Boosting algorithms show better results than Bagging. In this article, we have examined the possibility of fusion of non-thermal and thermal bands of Landsat 8 satellite images for cloud detection by using the boosting method. We used SVM as a base learner and the performance of two kinds of Boosting methods including AdaBoost.M1 and σ Boost was compared on remote sensing images of Landsat 8 satellite. We first extracted the co-occurrence matrix features of non-thermal and thermal bands separately and then used PCA method for feature selection. In the next step AdaBoost.M1 and σ Boost algorithms were applied on non-thermal and thermal bands and finally, the classifiers were fused using majority voting. Also, we showed that by changing the regularization parameter (C) the result of σ Boost algorithm can significantly change and achieve overall accuracy and cloud producer accuracy of 74%, and 0.53 kappa coefficient that shows better results in comparison to AdaBoost.M1.
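
    The final fusion step described above, majority voting over several classifiers' outputs, can be sketched as follows. This is an illustration only: the function name, the three hypothetical classifiers, and the per-pixel labels are assumptions, and the abstract does not say how ties are broken (an odd number of voters avoids ties for two classes).

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Fuse label sequences from several classifiers, one vote per classifier per pixel."""
    fused = []
    for votes in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the highest vote count.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Made-up per-pixel outputs of three hypothetical classifiers.
predictions = [
    ["cloud", "clear", "cloud", "clear"],
    ["cloud", "cloud", "clear", "clear"],
    ["clear", "cloud", "cloud", "clear"],
]
fused_labels = majority_vote(predictions)
```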

  3. Multimodal analysis of functional and structural disconnection in Alzheimer's disease using multiple kernel SVM.

    Science.gov (United States)

    Dyrba, Martin; Grothe, Michel; Kirste, Thomas; Teipel, Stefan J

    2015-06-01

    Alzheimer's disease (AD) patients exhibit alterations in the functional connectivity between spatially segregated brain regions which may be related to both local gray matter (GM) atrophy as well as a decline in the fiber integrity of the underlying white matter tracts. Machine learning algorithms are able to automatically detect the patterns of the disease in image data, and therefore, constitute a suitable basis for automated image diagnostic systems. The question of which magnetic resonance imaging (MRI) modalities are most useful in a clinical context is as yet unresolved. We examined multimodal MRI data acquired from 28 subjects with clinically probable AD and 25 healthy controls. Specifically, we used fiber tract integrity as measured by diffusion tensor imaging (DTI), GM volume derived from structural MRI, and the graph-theoretical measures 'local clustering coefficient' and 'shortest path length' derived from resting-state functional MRI (rs-fMRI) to evaluate the utility of the three imaging methods in automated multimodal image diagnostics, to assess their individual performance, and the level of concordance between them. We ran the support vector machine (SVM) algorithm and validated the results using leave-one-out cross-validation. For the single imaging modalities, we obtained an area under the curve (AUC) of 80% for rs-fMRI, 87% for DTI, and 86% for GM volume. When it came to the multimodal SVM, we obtained an AUC of 82% using all three modalities, and 89% using only DTI measures and GM volume. Combined multimodal imaging data did not significantly improve classification accuracy compared to the best single measures alone. © 2015 Wiley Periodicals, Inc.
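
    The area under the curve (AUC) reported here can be read as the probability that a randomly chosen patient receives a higher decision value than a randomly chosen control. The sketch below computes it directly from that pairwise definition; the scores and labels are invented, and in a leave-one-out setting each score would come from a model trained without that subject.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out decision values: 1 = patient, 0 = control.
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

    The pairwise form costs O(n_pos * n_neg) comparisons, which is negligible for a cohort of a few dozen subjects.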

  4. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  5. Effects of hardware heterogeneity on the performance of SVM Alzheimer's disease classifier.

    Science.gov (United States)

    Abdulkadir, Ahmed; Mortamet, Bénédicte; Vemuri, Prashanthi; Jack, Clifford R; Krueger, Gunnar; Klöppel, Stefan

    2011-10-01

    Fully automated machine learning methods based on structural magnetic resonance imaging (MRI) data can assist radiologists in the diagnosis of Alzheimer's disease (AD). These algorithms require large data sets to learn the separation of subjects with and without AD. Training and test data may come from heterogeneous hardware settings, which can potentially affect the performance of disease classification. A total of 518 MRI sessions from 226 healthy controls and 191 individuals with probable AD from the multicenter Alzheimer's Disease Neuroimaging Initiative (ADNI) were used to investigate whether grouping data by acquisition hardware (i.e. vendor, field strength, coil system) is beneficial for the performance of a support vector machine (SVM) classifier, compared to the case where data from different hardware is mixed. We compared the change of the SVM decision value resulting from (a) changes in hardware against the effect of disease and (b) changes resulting simply from rescanning the same subject on the same machine. Maximum accuracy of 87% was obtained with a training set of all 417 subjects. Classifiers trained with 95 subjects in each diagnostic group and acquired with heterogeneous scanner settings had an empirical detection accuracy of 84.2±2.4% when tested on an independent set of the same size. These results mirror the accuracy reported in recent studies. Encouragingly, classifiers trained on images acquired with homogenous and heterogeneous hardware settings had equivalent cross-validation performances. Two scans of the same subject acquired on the same machine had very similar decision values and were generally classified into the same group. Higher variation was introduced when two acquisitions of the same subject were performed on two scanners with different field strengths. The variation was unbiased and similar for both diagnostic groups. The findings of the study encourage the pooling of data from different sites to increase the number of

  6. A New Classification Method of Infrasound Events Using Hilbert-Huang Transform and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Xueyong Liu

    2014-01-01

    Infrasound is a type of low frequency signal that occurs in nature and results from man-made events, typically ranging in frequency from 0.01 Hz to 20 Hz. In this paper, a classification method based on Hilbert-Huang transform (HHT) and support vector machine (SVM) is proposed to discriminate between three different natural events. The frequency spectrum characteristics of infrasound signals produced by different events, such as volcanoes, are unique, which lays the foundation for infrasound signal classification. First, the HHT method was used to extract the feature vectors of several kinds of infrasound events from the Hilbert marginal spectrum. Then, the feature vectors were classified by the SVM method. Finally, the classification and identification accuracies are given. The simulation results show that the recognition rate is above 97.7%, and that the approach is effective for classifying event types for small samples.

  7. Classification of different tomato seed cultivars by multispectral visible-near infrared spectroscopy and chemometrics

    DEFF Research Database (Denmark)

    Shrestha, Santosh; Deleuran, Lise Christina; Gislum, René

    2016-01-01

    The feasibility of rapid and non-destructive classification of five different tomato seed cultivars was investigated by using visible and short-wave near infrared (Vis-NIR) spectra combined with chemometric approaches. Vis-NIR spectra containing 19 different wavelengths ranging from 375 nm to 970 nm were extracted from multispectral images of tomato seeds. Principal component analysis (PCA) was used for data exploration, while partial least squares discriminant analysis (PLS-DA) and support vector machine discriminant analysis (SVM-DA) were used to classify the five different tomato cultivars. The results showed very good classification accuracy for two independent test sets ranging from 94% to 100% for all tomato cultivars irrespective of chemometric methods. The overall classification error rates were 3.2% and 0.4% for the PLS-DA and SVM-DA calibration models, respectively. The results indicate...

  8. Diagnosis of periodontal diseases using different classification algorithms: a preliminary study.

    Science.gov (United States)

    Ozden, F O; Özgönenel, O; Özden, B; Aydogdu, A

    2015-01-01

    The purpose of the proposed study was to develop an identification unit for classifying periodontal diseases using support vector machine (SVM), decision tree (DT), and artificial neural networks (ANNs). A total of 150 patients were divided into two groups: training (100) and testing (50). The codes created for risk factors, periodontal data, and radiographic bone loss were formed as a matrix structure and regarded as inputs for the classification unit. Six periodontal conditions were the outputs of the classification unit. The accuracy of the suggested methods was compared according to their resolution and working time. DT and SVM were best to classify the periodontal diseases with a high accuracy according to the clinical research based on 150 patients. The performances of SVM and DT were found to be 98% with total computational times of 19.91 and 7.00 s, respectively. ANN had the worst correlation between input and output variables, and its performance was calculated as 46%. SVM and DT appeared to be sufficiently complex to reflect all the factors associated with the periodontal status, yet simple enough to be understandable and practical as a decision-making aid for prediction of periodontal disease.

  9. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images.

    Science.gov (United States)

    Xu, Yongzheng; Yu, Guizhen; Wang, Yunpeng; Wu, Xinkai; Ma, Yalong

    2016-08-19

    A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles' in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which integrates the V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive compared with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need for image registration or an additional road database, it has great potential for field applications. Future research will focus on extending the current method to detect other transportation modes such as buses, trucks, motors, bicycles, and pedestrians.

  10. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  11. Fusion of Airborne Discrete-Return LiDAR and Hyperspectral Data for Land Cover Classification

    Directory of Open Access Journals (Sweden)

    Shezhou Luo

    2015-12-01

    Accurate land cover classification information is a critical variable for many applications. This study presents a method to classify land cover using the fusion data of airborne discrete return LiDAR (Light Detection and Ranging) and CASI (Compact Airborne Spectrographic Imager) hyperspectral data. Four LiDAR-derived images (DTM, DSM, nDSM, and intensity) and CASI data (48 bands) with 1 m spatial resolution were spatially resampled to 2, 4, 8, 10, 20 and 30 m resolutions using the nearest neighbor resampling method. These data were thereafter fused using the layer stacking and principal components analysis (PCA) methods. Land cover was classified by commonly used supervised classifications in remote sensing images, i.e., the support vector machine (SVM) and maximum likelihood (MLC) classifiers. Each classifier was applied to four types of datasets (at seven different spatial resolutions): (1) the layer stacking fusion data; (2) the PCA fusion data; (3) the LiDAR data alone; and (4) the CASI data alone. In this study, the land cover category was classified into seven classes, i.e., buildings, road, water bodies, forests, grassland, cropland and barren land. A total of 56 classification results were produced, and the classification accuracies were assessed and compared. The results show that the classification accuracies produced from two fused datasets were higher than those of the single LiDAR and CASI data at all seven spatial resolutions. Moreover, we find that the layer stacking method produced higher overall classification accuracies than the PCA fusion method using both the SVM and MLC classifiers. The highest classification accuracy (OA = 97.8%, kappa = 0.964) was obtained using the SVM classifier on the layer stacking fusion data at 1 m spatial resolution. Compared with the best classification results of the CASI and LiDAR data alone, the overall classification accuracies improved by 9.1% and 19.6%, respectively. Our findings also demonstrated that the

  12. PolSAR Land Cover Classification Based on Roll-Invariant and Selected Hidden Polarimetric Features in the Rotation Domain

    Directory of Open Access Journals (Sweden)

    Chensong Tao

    2017-07-01

    Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR). Target polarimetric response is strongly dependent on its orientation. Backscattering responses of the same target with different orientations to the SAR flight path may be quite different. This target orientation diversity effect hinders PolSAR image understanding and interpretation. Roll-invariant polarimetric features such as entropy, anisotropy, mean alpha angle, and total scattering power are independent of the target orientation and are commonly adopted for PolSAR image classification. On the other hand, target orientation diversity also contains rich information which may not be sensed by roll-invariant polarimetric features. In this vein, only using the roll-invariant polarimetric features may limit the final classification accuracy. To address this problem, this work uses the recently reported uniform polarimetric matrix rotation theory and a visualization and characterization tool of polarimetric coherence pattern to investigate hidden polarimetric features in the rotation domain along the radar line of sight. Then, a feature selection scheme is established and a set of hidden polarimetric features are selected in the rotation domain. Finally, a classification method is developed using the complementary information between roll-invariant and selected hidden polarimetric features with a support vector machine (SVM)/decision tree (DT) classifier. Comparison experiments are carried out with NASA/JPL AIRSAR and multi-temporal UAVSAR data. For AIRSAR data, the overall classification accuracy of the proposed classification method is 95.37% (with SVM)/96.38% (with DT), while that of the conventional classification method is 93.87% (with SVM)/94.12% (with DT), respectively. Meanwhile, for multi-temporal UAVSAR data, the mean overall classification accuracy of the proposed method is up to 97.47% (with SVM)/99.39% (with DT), which is also higher

  13. Diagnostic performance of whole brain volume perfusion CT in intra-axial brain tumors: Preoperative classification accuracy and histopathologic correlation

    Energy Technology Data Exchange (ETDEWEB)

    Xyda, Argyro, E-mail: argyro.xyda@med.uni-goettingen.de [Department of Neuroradiology, Georg-August University, University Hospital of Goettingen, Robert-Koch Strasse 40, 37075 Goettingen (Germany); Department of Radiology, University Hospital of Heraklion, Voutes, 71110 Heraklion, Crete (Greece); Haberland, Ulrike, E-mail: ulrike.haberland@siemens.com [Siemens AG Healthcare Sector, Computed Tomography, Siemensstr. 1, 91301 Forchheim (Germany); Klotz, Ernst, E-mail: ernst.klotz@siemens.com [Siemens AG Healthcare Sector, Computed Tomography, Siemensstr. 1, 91301 Forchheim (Germany); Jung, Klaus, E-mail: kjung1@uni-goettingen.de [Department of Medical Statistics, Georg-August University, Humboldtallee 32, 37073 Goettingen (Germany); Bock, Hans Christoph, E-mail: cbock@gmx.de [Department of Neurosurgery, Johannes Gutenberg University Hospital of Mainz, Langenbeckstraße 1, 55101 Mainz (Germany); Schramm, Ramona, E-mail: ramona.schramm@med.uni-goettingen.de [Department of Neuroradiology, Georg-August University, University Hospital of Goettingen, Robert-Koch Strasse 40, 37075 Goettingen (Germany); Knauth, Michael, E-mail: michael.knauth@med.uni-goettingen.de [Department of Neuroradiology, Georg-August University, University Hospital of Goettingen, Robert-Koch Strasse 40, 37075 Goettingen (Germany); Schramm, Peter, E-mail: p.schramm@med.uni-goettingen.de [Department of Neuroradiology, Georg-August University, University Hospital of Goettingen, Robert-Koch Strasse 40, 37075 Goettingen (Germany)

    2012-12-15

    Background: To evaluate the preoperative diagnostic power and classification accuracy of perfusion parameters derived from whole brain volume perfusion CT (VPCT) in patients with cerebral tumors. Methods: Sixty-three patients (31 male, 32 female; mean age 55.6 ± 13.9 years), with MRI findings suspicious for cerebral lesions, underwent VPCT. Two readers independently evaluated VPCT data. Volumes of interest (VOIs) were drawn around the tumor according to maximum intensity projection volumes, and then mapped automatically onto the cerebral blood volume (CBV), flow (CBF) and permeability Ktrans perfusion datasets. A second VOI was placed in the contralateral cortex, as control. Correlations among perfusion values, tumor grade, cerebral hemisphere and VOIs were evaluated. Moreover, the diagnostic power of VPCT parameters, by means of positive and negative predictive value, was analyzed. Results: Our cohort included 32 high-grade gliomas WHO III/IV, 18 low-grade I/II, 6 primary cerebral lymphomas, 4 metastases and 3 tumor-like lesions. Ktrans demonstrated the highest sensitivity, specificity and positive predictive value, with a cut-off point of 2.21 mL/100 mL/min, for both the comparisons between high-grade versus low-grade and low-grade versus primary cerebral lymphomas. However, for the differentiation between high-grade and primary cerebral lymphomas, CBF and CBV proved to have 100% specificity and 100% positive predictive value, identifying preoperatively all the histopathologically proven high-grade gliomas. Conclusion: Volumetric perfusion data enable the hemodynamic assessment of the entire tumor extent and provide a method of preoperative differentiation among intra-axial cerebral tumors with promising diagnostic accuracy.

  14. Extraction of prostatic lumina and automated recognition for prostatic calculus image using PCA-SVM.

    Science.gov (United States)

    Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D Joshua

    2011-01-01

    Identification of prostatic calculi is an important basis for determining the tissue origin. Computer-assisted diagnosis of prostatic calculi may have promising potential but is currently still less studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu thresholding, and recognition used PCA-SVM based on the texture features of prostatic calculi. The SVM classifier showed an average time of 0.1432 seconds, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi.
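The sensitivity, specificity, and accuracy figures quoted in records like this one all derive from the four cells of a binary confusion matrix. A minimal sketch of that arithmetic follows; the function name `binary_metrics` and the counts are hypothetical illustrations and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not the paper's data).
sens, spec, acc = binary_metrics(tp=45, fn=5, tn=90, fp=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Accuracy depends on the class balance of the test set, which is why abstracts usually report sensitivity and specificity alongside it.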

  15. Automatic classification of athletes with residual functional deficits following concussion by means of EEG signal using support vector machine.

    Science.gov (United States)

    Cao, Cheng; Tutwiler, Richard Laurence; Slobounov, Semyon

    2008-08-01

    There is a growing body of knowledge indicating long-lasting residual electroencephalography (EEG) abnormalities in concussed athletes that may persist up to 10 years postinjury. Most often, these abnormalities are initially overlooked using traditional concussion assessment tools. Accordingly, premature return to sport participation may lead to recurrent concussions with more severe consequences. Sixty-one athletes at high risk for concussion (i.e., collegiate rugby and football players) were recruited and underwent EEG baseline assessment. Thirty of these athletes suffered from concussion and were retested at day 30 postinjury. A number of task-related EEG recordings were conducted. A novel classification algorithm, the support vector machine (SVM), was applied as a classifier to identify residual functional abnormalities in athletes suffering from concussion using a multichannel EEG data set. The total accuracy of the classifier using the 10 features was 77.1%. The classifier has a high sensitivity of 96.7% (linear SVM), 80.0% (nonlinear SVM), and a relatively lower but acceptable selectivity of 69.1% (linear SVM) and 75.0% (nonlinear SVM). The major findings of this report are as follows: 1) discriminative features were observed at theta, alpha, and beta frequency bands, 2) the minimal redundancy relevance method was identified as being superior to the univariate t-test method in selecting features for the model calculation, 3) the EEG features selected for the classification model are linked to temporal and occipital areas, and 4) postural parameters influence the EEG data set and can be used as discriminative features for the classification model. Overall, this report provides sufficient evidence that 10 EEG features selected for final analysis and SVM may be potentially used in clinical practice for automatic classification of athletes with residual brain functional abnormalities following a concussion

  16. The Effects of Point or Polygon Based Training Data on RandomForest Classification Accuracy of Wetlands

    Directory of Open Access Journals (Sweden)

    Jennifer Corcoran

    2015-04-01

    Wetlands are dynamic in space and time, providing varying ecosystem services. Field reference data for both training and assessment of wetland inventories in the State of Minnesota are typically collected as GPS points over wide geographical areas and at infrequent intervals. This status quo makes it difficult to keep updated maps of wetlands with adequate accuracy, efficiency, and consistency to monitor change. Furthermore, point reference data may not be representative of the prevailing land cover type for an area, due to point location or heterogeneity within the ecosystem of interest. In this research, we present techniques for training a land cover classification for two study sites in different ecoregions by implementing the RandomForest classifier in three ways: (1) field and photo interpreted points; (2) fixed window surrounding the points; and (3) image objects that intersect the points. Additional assessments are made to identify the key input variables. We conclude that the image object area training method is the most accurate and the most important variables include: compound topographic index, summer season green and blue bands, and grid statistics from LiDAR point cloud data, especially those that relate to the height of the return.

  17. Classification of Sporting Activities Using Smartphone Accelerometers

    Directory of Open Access Journals (Sweden)

    Noel E. O'Connor

    2013-04-01

    In this paper we present a framework that allows for the automatic identification of sporting activities using commonly available smartphones. We extract discriminative informational features from smartphone accelerometers using the Discrete Wavelet Transform (DWT). Despite the poor quality of their accelerometers, smartphones were used as capture devices due to their prevalence in today’s society. Successful classification on this basis potentially makes the technology accessible to both elite and non-elite athletes. Extracted features are used to train different categories of classifiers. No one classifier family has a reportable direct advantage in activity classification problems to date; thus we examine classifiers from each of the most widely used classifier families. We investigate three classification approaches: a commonly used SVM-based approach, an optimized classification model and a fusion of classifiers. We also investigate the effect of changing several of the DWT input parameters, including mother wavelets, window lengths and DWT decomposition levels. During the course of this work we created a challenging sports activity analysis dataset, comprised of soccer and field-hockey activities. The average maximum F-measure accuracy of 87% was achieved using a fusion of classifiers, which was 6% better than a single classifier model and 23% better than a standard SVM approach.
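The DWT input parameters named in this abstract (mother wavelet, window length, decomposition level) can be made concrete with a small sketch. It assumes the Haar wavelet and sub-band energies as features; both are illustrative choices, since the abstract does not say which wavelets or features were used. The helper names `haar_dwt` and `dwt_energy_features` are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_energy_features(window, levels):
    """Detail-band energy at each level, followed by the final approximation energy."""
    features = []
    approx = list(window)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical accelerometer window (one axis, 8 samples).
window = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = dwt_energy_features(window, levels=2)
print(features)
```

Because the Haar transform is orthonormal, the band energies of an even-length window sum to the energy of the original window, which gives a quick sanity check on the decomposition. A feature vector like this would then be passed to the classifier.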

  18. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    Science.gov (United States)

    Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong

    2013-01-01

    Classification of different tree species in semiarid areas can be challenging as a result of the change in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed classification using random forest (RF) and support vector machines (SVM) based on EnMap box. The overall accuracies for classifying the five tree species were 88.75% and 85% for SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful for the vast majority of remote sensing classification process than the kappa coefficient. Overall, high species classification can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.
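The accuracy measures compared in this abstract (overall accuracy, the kappa coefficient, and quantity and allocation disagreement) can all be computed from one confusion matrix. The sketch below uses the usual definitions: quantity disagreement is half the summed absolute difference between map and reference class proportions, and allocation disagreement is the remaining error. The function name, the matrix orientation, and the sample counts are assumptions for illustration, not values from the study.

```python
def agreement_measures(matrix):
    """Return (overall accuracy, kappa, quantity disagreement, allocation disagreement).

    matrix[i][j] = number of samples mapped as class i whose reference class is j.
    """
    k = len(matrix)
    n = float(sum(sum(row) for row in matrix))
    p = [[matrix[i][j] / n for j in range(k)] for i in range(k)]
    row = [sum(p[i]) for i in range(k)]                        # mapped proportions
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]   # reference proportions
    overall = sum(p[i][i] for i in range(k))
    expected = sum(row[i] * col[i] for i in range(k))          # chance agreement
    kappa = (overall - expected) / (1.0 - expected)
    quantity = 0.5 * sum(abs(row[i] - col[i]) for i in range(k))
    allocation = sum(min(row[i] - p[i][i], col[i] - p[i][i]) for i in range(k))
    return overall, kappa, quantity, allocation


# Hypothetical 3-class confusion matrix (counts), for illustration only.
cm = [[40, 5, 5],
      [3, 30, 2],
      [2, 5, 8]]
overall, kappa, quantity, allocation = agreement_measures(cm)
print(overall, kappa, quantity, allocation)
```

Quantity and allocation disagreement always sum to one minus the overall accuracy, so they split the total error into "wrong amounts of each class" and "right amounts in the wrong places". Kappa instead rescales accuracy against chance agreement.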

  19. Large margin distribution machine for hyperspectral image classification

    Science.gov (United States)

    Zhan, Kun; Wang, Haibo; Huang, He; Xie, Yuange

    2016-11-01

    Support vector machine (SVM) classifiers are widely applied to hyperspectral image (HSI) classification and provide significant advantages in terms of accuracy, simplicity, and robustness. SVM is a well-known learning algorithm that maximizes the minimum margin. However, recent theoretical results pointed out that maximizing the minimum margin leads to a lower generalization performance than optimizing the margin distribution, and proved that the margin distribution is more important. In this paper, a large margin distribution machine (LDM) is applied to HSI classification, and optimizing the margin distribution achieves a better generalization performance than SVM. Since the raw HSI feature space is not the most effective space for representing HSI, we adopt factor analysis to learn an effective HSI feature and the learned features are further filtered by a structure-preserved filter to fully exploit the spatial structure information of HSI. The spatial structure information is integrated in the feature learning process to obtain a better HSI feature. Then we propose a multiclass LDM to classify the filtered HSI feature. Experimental results show that the proposed LDM with feature learning method achieves the classification performance of the state-of-the-art methods in terms of visual quality and three quantitative evaluations and indicates that LDM has a high generalization performance.
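The contrast drawn here between the minimum margin and the margin distribution can be illustrated for a linear classifier. The sketch below computes the per-sample margins y_i (w · x_i + b), then reports the minimum alongside the mean and variance. Summarizing the distribution by its mean and variance is an assumption made for illustration, as the abstract does not state the exact objective; the weights and data are hypothetical.

```python
def margins(w, b, X, y):
    """Functional margins y_i * (w . x_i + b) of a linear classifier."""
    return [yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for xi, yi in zip(X, y)]


def margin_statistics(m):
    """Return (minimum margin, margin mean, margin variance)."""
    mean = sum(m) / len(m)
    variance = sum((v - mean) ** 2 for v in m) / len(m)
    return min(m), mean, variance


# Hypothetical linear model and labelled points (labels in {+1, -1}).
w, b = [1.0, -1.0], 0.0
X = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
y = [1, 1, -1, -1]

m = margins(w, b, X, y)
min_margin, mean_margin, var_margin = margin_statistics(m)
print(m, min_margin, mean_margin, var_margin)
```

Two classifiers can share the same minimum margin yet differ in how the remaining margins are spread, which is the extra information a distribution-based criterion takes into account.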

  20. Novel cascade FPGA accelerator for support vector machines classification.

    Science.gov (United States)

    Papadonikolakis, Markos; Bouganis, Christos-Savvas

    2012-07-01

    Support vector machines (SVMs) are a powerful machine learning tool, providing state-of-the-art accuracy to many classification problems. However, SVM classification is a computationally complex task, suffering from linear dependencies on the number of the support vectors and the problem's dimensionality. This paper presents a fully scalable field programmable gate array (FPGA) architecture for the acceleration of SVM classification, which exploits the device heterogeneity and the dynamic range diversities among the dataset attributes. An adaptive and fully-customized processing unit is proposed, which utilizes the available heterogeneous resources of a modern FPGA device in an efficient way with respect to the problem's characteristics. The implementation results demonstrate the efficiency of the heterogeneous architecture, presenting a speed-up factor of 2-3 orders of magnitude, compared to the CPU implementation. The proposed architecture outperforms other proposed FPGA and graphics processing unit approaches by more than seven times. Furthermore, based on the special properties of the heterogeneous architecture, this paper introduces the first FPGA-oriented cascade SVM classifier scheme, which exploits the FPGA reconfigurability and intensifies the custom-arithmetic properties of the heterogeneous architecture. The results show that the proposed cascade scheme is able to increase the heterogeneous classifier throughput even further, without introducing any penalty on the resource utilization.

  1. Classification complexity in myoelectric pattern recognition.

    Science.gov (United States)

    Nilsson, Niclas; Håkansson, Bo; Ortiz-Catalan, Max

    2017-07-10

    Limb prosthetics, exoskeletons, and neurorehabilitation devices can be intuitively controlled using myoelectric pattern recognition (MPR) to decode the subject's intended movement. In conventional MPR, descriptive electromyography (EMG) features representing the intended movement are fed into a classification algorithm. The separability of the different movements in the feature space significantly affects the classification complexity. Classification complexity estimating algorithms (CCEAs) were studied in this work in order to improve feature selection, predict MPR performance, and inform on faulty data acquisition. CCEAs such as nearest neighbor separability (NNS), purity, repeatability index (RI), and separability index (SI) were evaluated based on their correlation with classification accuracy, as well as on their suitability to produce highly performing EMG feature sets. SI was evaluated using Mahalanobis distance, Bhattacharyya distance, Hellinger distance, Kullback-Leibler divergence, and a modified version of Mahalanobis distance. Three commonly used classifiers in MPR were used to compute classification accuracy (linear discriminant analysis (LDA), multi-layer perceptron (MLP), and support vector machine (SVM)). The algorithms and analytic graphical user interfaces produced in this work are freely available in BioPatRec. NNS and SI were found to be highly correlated with classification accuracy (correlations up to 0.98 for both algorithms) and capable of yielding highly descriptive feature sets. Additionally, the experiments revealed how the level of correlation between the inputs of the classifiers influences classification accuracy, and emphasized the classifiers' sensitivity to such redundancy. This study deepens the understanding of the classification complexity in prediction of motor volition based on myoelectric information. It also provides researchers with tools to analyze myoelectric recordings in order to improve classification performance.

  2. Research On The Classification Of High Resolution Image Based On Object-oriented And Class Rule

    Science.gov (United States)

    Li, C. K.; Fang, W.; Dong, X. J.

    2015-06-01

    With the development of remote sensing technology, the spatial, spectral and temporal resolution of remote sensing data has greatly improved. How to efficiently process and interpret massive high resolution remote sensing image data of ground objects, which carry spatial geometry and texture information, has become a focus and a difficulty in remote sensing research. An object-oriented, rule-based classification method for remote sensing data is presented in this paper. By discovering and mining the rich knowledge of spectral and spatial characteristics in high-resolution remote sensing images, a multi-level network structure for image object segmentation and classification is established to achieve accurate and fast ground target classification and accuracy assessment. Using WorldView-2 image data of the Zangnan area as the study object, the object-oriented, rule-based classification method was verified experimentally: a combination of the mean variance method, the maximum area method and accuracy comparison was used to select three optimal segmentation scales, and a multi-level image object network hierarchy was established for the image classification experiments. The results show that classifying high resolution images with the object-oriented rules classification method yields results similar to visual interpretation, with higher classification accuracy. The overall accuracy and Kappa coefficient of the object-oriented rules classification method were 97.38% and 0.9673; these were higher than those of the object-oriented SVM method by 6.23% and 0.078, respectively, and higher than those of the object-oriented KNN method by 7.96% and 0.0996, respectively. The extraction precision and user accuracy for buildings were higher than those of the object-oriented SVM method by 18.39% and 3.98%, respectively, and better than those of the object-oriented KNN method by 21

  3. Land Cover Classification of Landsat Data with Phenological Features Extracted from Time Series MODIS NDVI Data

    Directory of Open Access Journals (Sweden)

    Kun Jia

    2014-11-01

    Temporal-related features are important for improving land cover classification accuracy using remote sensing data. This study investigated the efficacy of phenological features extracted from time series MODIS Normalized Difference Vegetation Index (NDVI) data in improving the land cover classification accuracy of Landsat data. The MODIS NDVI data were first fused with Landsat data via the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) algorithm to obtain NDVI data at the Landsat spatial resolution. Next, phenological features, including the beginning and ending dates of the growing season, the length of the growing season, seasonal amplitude, and the maximum fitted NDVI value, were extracted from the fused time series NDVI data using the TIMESAT tool. The extracted data were integrated with the spectral data of the Landsat data to improve classification accuracy using a maximum likelihood classifier (MLC) and support vector machine (SVM) classifier. The results indicated that phenological features had a statistically significant effect on improving the land cover classification accuracy of single Landsat data (an approximately 3% increase in overall classification accuracy), especially for vegetation type discrimination. However, the phenological features did not improve on statistical measures including the maximum, the minimum, the mean, and the standard deviation values of the time series NDVI dataset, especially for human-managed vegetation types. Regarding different classifiers, SVM could achieve better classification accuracy than the traditional MLC classifier, but the improvement in accuracy obtained using advanced classifiers was inferior to that achieved by involving the temporally derived features for land cover classification.
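The phenological features listed in this abstract (start, end and length of the growing season, seasonal amplitude, and maximum fitted NDVI) can be sketched for a single-season NDVI curve. The sketch assumes a simple amplitude-threshold rule, where the season starts and ends where the curve crosses a fixed fraction of its amplitude. It is an illustrative stand-in and does not reproduce the TIMESAT tool's procedure; the function name, the `threshold_fraction` parameter, and the sample values are hypothetical.

```python
def phenological_features(ndvi, threshold_fraction=0.5):
    """Extract simple season metrics from fitted NDVI values at equal time steps.

    The season is taken to be the span of time steps whose NDVI is at or above
    base + threshold_fraction * amplitude (an assumed rule, for illustration).
    """
    base = min(ndvi)
    peak = max(ndvi)
    amplitude = peak - base
    level = base + threshold_fraction * amplitude
    above = [i for i, v in enumerate(ndvi) if v >= level]
    start, end = above[0], above[-1]
    return {
        "start": start,          # index of the first time step in the season
        "end": end,              # index of the last time step in the season
        "length": end - start,   # season length in time steps
        "amplitude": amplitude,
        "max": peak,
    }


# Hypothetical fitted NDVI series for one pixel (one value per time step).
series = [0.2, 0.25, 0.4, 0.6, 0.8, 0.75, 0.55, 0.3, 0.2]
feats = phenological_features(series)
print(feats)
```

Metrics like these would be stacked with the spectral bands as extra input features for the classifier.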

  4. SVM classifier to predict genes important for self-renewal and pluripotency of mouse embryonic stem cells

    Directory of Open Access Journals (Sweden)

    Xu Huilei

    2010-12-01

    Full Text Available Background: Mouse embryonic stem cells (mESCs) are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro. Their distinct features are their ability to self-renew and to differentiate to all adult cell types. Genes that maintain mESCs self-renewal and pluripotency identity are of interest to stem cell biologists. Although significant steps have been made toward the identification and characterization of such genes, the list is still incomplete and controversial. For example, the overlap among candidate self-renewal and pluripotency genes across different RNAi screens is surprisingly small. Meanwhile, machine learning approaches have been used to analyze multi-dimensional experimental data and integrate results from many studies, yet they have not been applied to specifically tackle the task of predicting and classifying self-renewal and pluripotency gene membership. Results: For this study we developed a classifier, a supervised machine learning framework for predicting self-renewal and pluripotency mESCs stemness membership genes (MSMG) using support vector machines (SVM). The data used to train the classifier was derived from mESCs-related studies using mRNA microarrays, measuring gene expression in various stages of early differentiation, as well as ChIP-seq studies applied to mESCs profiling genome-wide binding of key transcription factors, such as Nanog, Oct4, and Sox2, to the regulatory regions of other genes. Comparison to other classification methods using the leave-one-out cross-validation method was employed to evaluate the accuracy and generality of the classification. Finally, two sets of candidate genes from genome-wide RNA interference screens are used to test the generality and potential application of the classifier. Conclusions: Our results reveal that an SVM approach can be useful for prioritizing genes for functional validation experiments and complement the analyses of high

  5. Influence of multi-source and multi-temporal remotely sensed and ancillary data on the accuracy of random forest classification of wetlands in northern Minnesota

    Science.gov (United States)

    Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.

    2013-01-01

    Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.

  6. Influence of Multi-Source and Multi-Temporal Remotely Sensed and Ancillary Data on the Accuracy of Random Forest Classification of Wetlands in Northern Minnesota

    Directory of Open Access Journals (Sweden)

    Alisa L. Gallant

    2013-07-01

    Full Text Available Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.

  7. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons

    Directory of Open Access Journals (Sweden)

    Yi Long

    2016-09-01

    Full Text Available Locomotion mode identification is essential for the control of robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM) optimized by particle swarm optimization (PSO) to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS) attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with a sampling frequency of 40 Hz), a three-layer wavelet packet analysis (WPA) is used for feature extraction, after which the kernel principal component analysis (kPCA) is utilized to reduce the dimension of the feature set and thus the computation cost of the SVM. Since the signals are from two different types of sensors, normalization is conducted to scale the input into the interval [0, 1]. Five-fold cross validation is adopted to train the classifier, which prevents the classifier from over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, a majority vote algorithm (MVA) is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.
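
    The majority-vote post-processing mentioned in this abstract can be sketched as a sliding-window vote over the classifier's recent outputs. The abstract does not give the authors' MVA details, so the window length, the tie rule (the label seen earliest in the window wins), and the mode names below are assumptions for illustration.

```python
from collections import Counter, deque


def majority_vote_stream(predictions, window=3):
    """Replace each raw label by the most frequent label among the
    last `window` raw labels (including the current one)."""
    recent = deque(maxlen=window)  # keeps only the newest `window` labels
    smoothed = []
    for label in predictions:
        recent.append(label)
        # most_common(1) returns the label with the highest count; on a
        # tie, the label that entered the window first is returned.
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# Hypothetical raw SVM outputs with one spurious "stairs" at index 2.
raw = ["walk", "walk", "stairs", "walk", "walk", "stairs", "stairs", "stairs"]
smoothed = majority_vote_stream(raw, window=3)
```

    The vote removes the isolated misclassification at the cost of delaying the genuine transition by one sample. For scale, a 200 ms window at 40 Hz corresponds to 0.2 × 40 = 8 samples per feature window.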

  8. A linear support higher-order tensor machine for classification.

    Science.gov (United States)

    Hao, Zhifeng; He, Lifang; Chen, Bingqian; Yang, Xiaowei

    2013-07-01

    There has been growing interest in developing more effective learning machines for tensor classification. At present, most of the existing learning machines, such as support tensor machine (STM), involve nonconvex optimization problems and need to resort to iterative techniques. Such techniques are time-consuming and may suffer from local minima. In order to overcome these two shortcomings, in this paper, we present a novel linear support higher-order tensor machine (SHTM) which integrates the merits of linear C-support vector machine (C-SVM) and tensor rank-one decomposition. Theoretically, SHTM is an extension of the linear C-SVM to tensor patterns. When the input patterns are vectors, SHTM degenerates into the standard C-SVM. A set of experiments is conducted on nine second-order face recognition datasets and three third-order gait recognition datasets to illustrate the performance of the proposed SHTM. The statistical test shows that compared with STM and C-SVM with the RBF kernel, SHTM provides significant performance gain in terms of test accuracy and training speed, especially in the case of higher-order tensors.

  9. A DWT and SVM based method for rolling element bearing fault diagnosis and its comparison with Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Sunil Tyagi

    2017-04-01

    Full Text Available A classification technique using a Support Vector Machine (SVM) classifier for detection of rolling element bearing faults is presented here. The SVM was fed features extracted from vibration signals obtained from an experimental setup consisting of a rotating driveline mounted on rolling element bearings, which were run under normal conditions and with artificially induced faults. The time-domain vibration signals were divided into 40 segments and simple features such as peaks in time domain and spectrum along with statistical features such as standard deviation, skewness, kurtosis etc. were extracted. Effectiveness of the SVM classifier was compared with the performance of an Artificial Neural Network (ANN) classifier and it was found that the performance of the SVM classifier is superior to that of ANN. The effect of pre-processing of the vibration signal by Discrete Wavelet Transform (DWT) prior to feature extraction is also studied and it is shown that pre-processing of the vibration signal with DWT enhances the effectiveness of both ANN and SVM classifiers. It has been demonstrated from experimental results that the performance of the SVM classifier is better than ANN in detection of bearing condition and pre-processing the vibration signal with DWT improves the performance of the SVM classifier.

  10. Prediction of nuclear proteins using SVM and HMM models

    Directory of Open Access Journals (Sweden)

    Raghava Gajendra PS

    2009-01-01

    Full Text Available Background: The nucleus, a highly organized organelle, plays an important role in cellular homeostasis. The nuclear proteins are crucial for chromosomal maintenance/segregation, gene expression, RNA processing/export, and many other processes. Several methods have been developed for predicting the nuclear proteins in the past. The aim of the present study is to develop a new method for predicting nuclear proteins with higher accuracy. Results: All modules were trained and tested on a non-redundant dataset and evaluated using a five-fold cross-validation technique. Firstly, Support Vector Machine (SVM)-based modules have been developed using amino acid and dipeptide compositions and achieved a Matthews correlation coefficient (MCC) of 0.59 and 0.61 respectively. Secondly, we have developed SVM modules using split amino acid compositions (SAAC) and achieved the maximum MCC of 0.66. Thirdly, a hidden Markov model (HMM)-based module/profile was developed for searching exclusively nuclear and non-nuclear domains in a protein. Finally, a hybrid module was developed by combining the SVM module and HMM profile and achieved an MCC of 0.87 with an accuracy of 94.61%. This method performs better than the existing methods when evaluated on blind/independent datasets. Our method estimated 31.51%, 21.89%, 26.31%, 25.72% and 24.95% of the proteins as nuclear proteins in Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, mouse and human proteomes respectively. Based on the above modules, we have developed a web server NpPred for predicting nuclear proteins http://www.imtech.res.in/raghava/nppred/. Conclusion: This study describes a highly accurate method for predicting nuclear proteins. An SVM module has been developed for the first time using SAAC for predicting nuclear proteins, where amino acid composition of the N-terminus and the remaining protein were computed separately. In addition, our study is a first documentation where exclusively nuclear
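
    Since this record reports both accuracy and the Matthews correlation coefficient (MCC), a short sketch of how the two are computed from a binary confusion matrix may help. The counts used below are made up for illustration and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts: 100 nuclear and 100 non-nuclear test proteins.
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
acc = accuracy(tp=90, tn=85, fp=15, fn=10)
```

    Unlike accuracy, MCC stays near zero for a classifier that simply predicts the majority class, which is one reason it is often reported alongside accuracy on unbalanced data.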

  11. A study of computer-aided diagnosis for pulmonary nodule: comparison between classification accuracies using calculated image features and imaging findings annotated by radiologists.

    Science.gov (United States)

    Kawagishi, Masami; Chen, Bin; Furukawa, Daisuke; Sekiguchi, Hiroyuki; Sakai, Koji; Kubo, Takeshi; Yakami, Masahiro; Fujimoto, Koji; Sakamoto, Ryo; Emoto, Yutaka; Aoyama, Gakuto; Iizuka, Yoshio; Nakagomi, Keita; Yamamoto, Hiroyuki; Togashi, Kaori

    2017-05-01

    In our previous study, we developed a computer-aided diagnosis (CADx) system using imaging findings annotated by radiologists. The system, however, requires radiologists to input many imaging findings. In order to reduce such an interaction of radiologists, we further developed a CADx system using derived imaging findings based on calculated image features, in which the system requires only a few user operations. The purpose of this study is to check whether calculated image features (CFT) or derived imaging findings (DFD) can represent information in imaging findings annotated by radiologists (AFD). We calculate 2282 image features and derive 39 imaging findings by using information on a nodule position and its type (solid or ground-glass). These image features are categorized into shape features, texture features and imaging findings-specific features. Each imaging finding is derived based on each corresponding classifier using random forest. To check whether CFT or DFD can represent information in AFD, under an assumption that the accuracies of classifiers are the same if information included in input is the same, we constructed classifiers by using various types of information (CFT, DFD and AFD) and compared accuracies on an inferred diagnosis of a nodule. We employ SVM with RBF kernel as classifier to infer a diagnosis name. Accuracies of classifiers using DFD, CFT, AFD and CFT [Formula: see text] AFD were 0.613, 0.577, 0.773 and 0.790, respectively. Concordance rates between DFD and AFD of shape findings, texture findings and surrounding findings were 0.644, 0.871 and 0.768, respectively. The results suggest that CFT and AFD are similar information and CFT represent only a portion of AFD. Particularly, CFT did not contain shape information in AFD. In order to decrease an interaction of radiologists, a development of a method which overcomes these problems is necessary.

  12. A Method for Aileron Actuator Fault Diagnosis Based on PCA and PGC-SVM

    Directory of Open Access Journals (Sweden)

    Wei-Li Qin

    2016-01-01

    Full Text Available Aileron actuators are pivotal components of aircraft flight control systems. Thus, the fault diagnosis of aileron actuators is vital in the enhancement of the reliability and fault tolerant capability. This paper presents an aileron actuator fault diagnosis approach combining principal component analysis (PCA), grid search (GS), 10-fold cross validation (CV), and one-versus-one support vector machine (SVM). This method is referred to as PGC-SVM and utilizes the direct drive valve input, force motor current, and displacement feedback signal to realize fault detection and location. First, several common faults of aileron actuators, which include force motor coil break, sensor coil break, cylinder leakage, and amplifier gain reduction, are extracted from the fault quadrantal diagram; the corresponding fault mechanisms are analyzed. Second, the data feature extraction is performed with dimension reduction using PCA. Finally, the GS and CV algorithms are employed to train a one-versus-one SVM for fault classification, thus obtaining the optimal model parameters and assuring the generalization of the trained SVM, respectively. To verify the effectiveness of the proposed approach, four types of faults are introduced into the simulation model established by AMESim and Simulink. The results demonstrate its desirable diagnostic performance which outperforms that of the traditional SVM by comparison.
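
    The combination of grid search and k-fold cross validation described here can be sketched without any particular SVM library. In the sketch below, `fit_and_score` is a hypothetical callback that would train a model on the training indices with the given parameters and return its accuracy on the held-out fold; `toy_score` is a made-up stand-in so the example runs, and the parameter names `C` and `gamma` are assumed, not taken from the paper.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def grid_search_cv(param_grid, n_samples, k, fit_and_score):
    """Return the parameter combination with the best mean fold score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, train, test)
                  for train, test in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, test_idx):
    """Stand-in for 'train an SVM, return fold accuracy' (illustrative only)."""
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)


grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]}
best_params, best_score = grid_search_cv(grid, n_samples=25, k=10,
                                         fit_and_score=toy_score)
```

    Averaging over folds is what ties the chosen parameters to generalization rather than to a single train/test split; in practice the data would usually be shuffled (and stratified by fault class) before folding.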

  13. Towards multilevel mental stress assessment using SVM with ECOC: an EEG approach.

    Science.gov (United States)

    Al-Shargie, Fares; Tang, Tong Boon; Badruddin, Nasreen; Kiguchi, Masashi

    2017-10-18

    Mental stress has been identified as one of the major contributing factors that lead to various diseases such as heart attack, depression, and stroke. To avoid this, stress quantification is important for clinical intervention and disease prevention. This study aims to investigate the feasibility of exploiting electroencephalography (EEG) signals to discriminate between different stress levels. We propose a new assessment protocol whereby the stress level is represented by the complexity of a mental arithmetic (MA) task (for example, at three levels of difficulty), and the stressors are time pressure and negative feedback. Using 18 male subjects, the experimental results showed that there were significant differences in EEG response between the control and stress conditions at different levels of MA task with p values < 0.001. Furthermore, we found a significant reduction in alpha rhythm power from one stress level to another level, p values < 0.05. In comparison, results from the self-reporting questionnaire NASA-TLX approach showed no significant differences between stress levels. In addition, we developed a discriminant analysis method based on multiclass support vector machine (SVM) with error-correcting output code (ECOC). Different stress levels were detected with an average classification accuracy of 94.79%. The lateral index (LI) results further showed dominant right prefrontal cortex (PFC) to mental stress (reduced alpha rhythm). The study demonstrated the feasibility of using EEG in classifying multilevel mental stress and reported alpha rhythm power at right prefrontal cortex as a suitable index.
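
    The ECOC idea behind the multiclass SVM can be sketched as follows: each class is assigned a binary codeword, one binary classifier is trained per bit, and a test sample is assigned to the class whose codeword is nearest in Hamming distance to the predicted bits. The abstract does not state the code matrix used, so the four class names and the 7-bit exhaustive code below are assumptions chosen for illustration.

```python
# Hypothetical 7-bit exhaustive code for four classes; every pair of
# codewords differs in 4 positions, so any single bit error is corrected.
CODEBOOK = {
    "control": (1, 1, 1, 1, 1, 1, 1),
    "level1":  (0, 0, 0, 0, 1, 1, 1),
    "level2":  (0, 0, 1, 1, 0, 0, 1),
    "level3":  (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(codebook[label], bits))


# Outputs of the 7 binary classifiers for one sample, with the first
# bit flipped relative to the "level2" codeword.
noisy_bits = (1, 0, 1, 1, 0, 0, 1)
decoded = ecoc_decode(noisy_bits)
```

    With a minimum pairwise distance of 4, one wrong binary classifier per sample still yields the right class, which is the error-correcting property the method's name refers to.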

  14. Intelligent Agent-Based Intrusion Detection System Using Enhanced Multiclass SVM

    Directory of Open Access Journals (Sweden)

    S. Ganapathy

    2012-01-01

    Full Text Available Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect the intruders only with high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm are proposed for detecting the intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with low false alarm rate and high-detection rate when tested with KDD Cup 99 data set.

  15. Intelligent agent-based intrusion detection system using enhanced multiclass SVM.

    Science.gov (United States)

    Ganapathy, S; Yogesh, P; Kannan, A

    2012-01-01

    Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect the intruders only with high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm are proposed for detecting the intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with low false alarm rate and high-detection rate when tested with KDD Cup 99 data set.

  16. Intelligent Agent-Based Intrusion Detection System Using Enhanced Multiclass SVM

    Science.gov (United States)

    Ganapathy, S.; Yogesh, P.; Kannan, A.

    2012-01-01

    Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect the intruders only with high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm are proposed for detecting the intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with low false alarm rate and high-detection rate when tested with KDD Cup 99 data set. PMID:23056036

  17. Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment

    Directory of Open Access Journals (Sweden)

    Rashmi Mukherjee

    2014-01-01

    Full Text Available The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images captured with an ordinary digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with the highest kappa statistic value (0.793).
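
    The kappa statistic reported next to overall accuracy corrects the observed agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa from a confusion matrix follows; the 3×3 matrix is hypothetical and does not reproduce the paper's results.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: true class,
    columns: predicted class)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of row share times column share.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n)
    ) / total ** 2
    if expected == 1:
        return 1.0  # degenerate case: only one class present
    return (observed - expected) / (1 - expected)


# Hypothetical counts for granulation, slough and necrotic tissue.
matrix = [
    [40, 5, 5],
    [4, 30, 6],
    [6, 4, 20],
]
kappa = cohen_kappa(matrix)
```

    Here the overall accuracy is 90/120 = 0.75 while kappa is about 0.62, showing how kappa discounts the share of correct predictions that chance alone would produce.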

  18. Intrusion detection using rough set classification.

    Science.gov (United States)

    Zhang, Lian-hua; Zhang, Guan-hua; Zhang, Jie; Bai, Ying-cai

    2004-09-01

    Recently, machine learning-based intrusion detection approaches have been the subject of extensive research because they can detect both misuse and anomaly. In this paper, rough set classification (RSC), a modern learning algorithm, is used to rank the features extracted for detecting intrusions and generate intrusion detection models. Feature ranking is a very critical step when building the model. RSC performs feature ranking before generating rules, and converts the feature ranking to a minimal hitting set problem that is solved using a genetic algorithm (GA). This is done in classical approaches using Support Vector Machine (SVM) by executing many iterations, each of which removes one useless feature. Compared with those methods, our method can avoid many iterations. In addition, a hybrid genetic algorithm is proposed to increase the convergence speed and decrease the training time of RSC. The models generated by RSC take the form of "IF-THEN" rules, which have the advantage of being easy to interpret. Tests and comparison of RSC with SVM on DARPA benchmark data showed that for Probe and DoS attacks both RSC and SVM yielded highly accurate results (greater than 99% accuracy on testing set).

  19. Positioning Errors Predicting Method of Strapdown Inertial Navigation Systems Based on PSO-SVM

    Directory of Open Access Journals (Sweden)

    Xunyuan Yin

    2013-01-01

    Full Text Available The strapdown inertial navigation systems (SINS) have been widely used for many vehicles, such as commercial airplanes, Unmanned Aerial Vehicles (UAVs), and other types of aircraft. In order to evaluate the navigation errors precisely and efficiently, a prediction method based on support vector machine (SVM) is proposed for positioning error assessment. Firstly, SINS error models that are used for error calculation are established considering several error sources with respect to inertial units. Secondly, flight paths for simulation are designed. Thirdly, the -SVR based prediction method is proposed to predict the positioning errors of navigation systems, and particle swarm optimization (PSO) is used for the SVM parameters optimization. Finally, 600 sets of error parameters of SINS are utilized to train the SVM model, which is used for the performance prediction of new navigation systems. By comparing the predicting results with the real errors, the latitudinal predicting accuracy is 92.73%, while the longitudinal predicting accuracy is 91.64%, and PSO is effective to increase the prediction accuracy compared with traditional SVM with fixed parameters. This method is also demonstrated to be effective for error prediction for an entire flight process. Moreover, the prediction method can save 75% of calculation time compared with analyses based on error models.
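
    Particle swarm optimization of SVM parameters, as used in this record, can be sketched with a plain global-best PSO. The abstract gives no swarm settings, so the inertia weight, acceleration constants, swarm size, and search bounds below are assumptions. The quadratic `toy_error` is a made-up stand-in for a cross-validated prediction error as a function of two log-scaled parameters; a real objective would train and evaluate the SVM at each position.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the coordinate to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(p):
    """Stand-in objective with its minimum at (log10 C, log10 gamma) = (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_val = pso_minimize(toy_error, bounds=[(-3.0, 3.0), (-4.0, 1.0)])
```

    Compared with fixed parameters or an exhaustive grid, the swarm concentrates evaluations near promising regions, which matters when each evaluation means retraining the model.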

  20. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages

    Science.gov (United States)

    Yao, Yiqing; Xu, Xiaosu

    2017-01-01

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics. PMID:28245549

  1. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages

    Directory of Open Access Journals (Sweden)

    Yiqing Yao

    2017-02-01

    Full Text Available In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  2. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Statistical Fractal Models Based on GND-PCA and Its Application on Classification of Liver Diseases

    Directory of Open Access Journals (Sweden)

    Huiyan Jiang

    2013-01-01

    A new method is proposed to establish a statistical fractal model for the classification of liver diseases. First, fractal theory is used to construct a high-order tensor; then Generalized N-dimensional Principal Component Analysis (GND-PCA) is used to establish the statistical fractal model and select features from the liver region, with different features assigned different weights. Finally, a support vector machine optimized by an ant colony algorithm (ACO-SVM) is used to build the classifier for liver disease recognition. To verify the effectiveness of the proposed method, the PCA eigenface method and a standard SVM are chosen as comparison methods. The experimental results show that the proposed method reconstructs the liver volume better and improves the classification accuracy of liver diseases.

  4. FINGERPRINT CLASSIFICATION BASED ON RECURSIVE NEURAL NETWORK WITH SUPPORT VECTOR MACHINE

    Directory of Open Access Journals (Sweden)

    T. Chakravarthy

    2011-01-01

    This paper presents fingerprint classification based on a combined statistical and structural (RNN and SVM) approach. Recursive neural networks (RNNs) are trained on a structured representation of the fingerprint image. They are also used to extract a set of distributed features of the fingerprint, which can be integrated into the support vector machine (SVM). The SVMs are combined with a new error-correcting codes scheme. This approach has two main advantages: (a) it can tolerate the presence of ambiguous fingerprint images in the training set, and (b) it can effectively identify the most difficult fingerprint images in the test set. In experiments on the NIST-4 fingerprint database (National Institute of Standards and Technology), the best classification accuracy of 94.7% is obtained by training the SVM on both FingerCode and RNN-extracted features, using a segmentation algorithm based on a sophisticated “region growing” process.

  5. A Spectral-Texture Kernel-Based Classification Method for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Yi Wang

    2016-11-01

    Classification of hyperspectral images always suffers from high dimensionality and very limited labeled samples. Recently, spectral-spatial classification has attracted considerable attention and can achieve higher classification accuracy and smoother classification maps. In this paper, a novel spectral-spatial classification method for hyperspectral images using kernel methods is investigated. For a given hyperspectral image, the principal component analysis (PCA) transform is first performed. Then, the first principal component of the input image is segmented into non-overlapping homogeneous regions using the entropy rate superpixel (ERS) algorithm. Next, the local spectral histogram model is applied to each homogeneous region to obtain the corresponding texture features. Because this step is performed within each homogeneous region, instead of within a fixed-size image window, the obtained local texture features are more accurate, which benefits classification accuracy. In the following step, a contextual spectral-texture kernel is constructed by combining the spectral information in the image with the extracted texture information, using the linearity property of kernel methods. Finally, the classification map is produced by a support vector machine (SVM) classifier using the proposed spectral-texture kernel. Experiments on two benchmark airborne hyperspectral datasets demonstrate that our method can effectively improve classification accuracy, even when only a very limited training sample is available. Specifically, our method achieves an overall accuracy 8.26% to 15.1% higher than the traditional SVM classifier. The performance of our method was further compared to several state-of-the-art classification methods for hyperspectral images using objective quantitative measures and a visual qualitative evaluation.

  6. SVM versus MAP on accelerometer data to distinguish among locomotor activities executed at different speeds.

    Science.gov (United States)

    Schmid, Maurizio; Riganti-Fulginei, Francesco; Bernabucci, Ivan; Laudani, Antonino; Bibbo, Daniele; Muscillo, Rossana; Salvini, Alessandro; Conforto, Silvia

    2013-01-01

    Two approaches to the classification of different locomotor activities performed at various speeds are here presented and evaluated: a maximum a posteriori (MAP) Bayes' classification scheme and a Support Vector Machine (SVM) are applied on a 2D projection of 16 features extracted from accelerometer data. The locomotor activities (level walking, stair climbing, and stair descending) were recorded by an inertial sensor placed on the shank (preferred leg), performed in a natural indoor-outdoor scenario by 10 healthy young adults (age 25-35 yrs.). From each segmented activity epoch, sixteen features were chosen in the frequency and time domain. Dimension reduction was then performed through 2D Sammon's mapping. An Artificial Neural Network (ANN) was trained to mimic Sammon's mapping on the whole dataset. In the Bayes' approach, the two features were then fed to a Bayes' classifier that incorporates an update rule, while, in the SVM scheme, the ANN was considered as the kernel function of the classifier. Bayes' approach performed slightly better than SVM on both the training set (91.4% versus 90.7%) and the testing set (84.2% versus 76.0%), favoring the proposed Bayes' scheme as more suitable than the proposed SVM in distinguishing among the different monitored activities.

  7. Towards understanding the influence of SVM hyperparameters

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2010-11-01

    ...-consuming and resource-intensive. On large datasets, 10-fold cross-validation grid searches can become intractable without supercomputers or high performance computing clusters. The authors present theoretical and empirical arguments as to how SVM hyperparameters scale with N...

  8. Detecting suboptimal cognitive effort: classification accuracy of the Conner's Continuous Performance Test-II, Brief Test Of Attention, and Trail Making Test.

    Science.gov (United States)

    Busse, Michelle; Whiteside, Douglas

    2012-01-01

    Many cognitive measures have been studied for their ability to detect suboptimal cognitive effort; however, attention measures have not been extensively researched. The current study evaluated the classification accuracy of commonly used attention/concentration measures, the Brief Test of Attention (BTA), Trail Making Test (TMT), and the Conners' Continuous Performance Test (CPT-II). Participants included 413 consecutive patients who completed a comprehensive neuropsychological evaluation. Participants were separated into two groups, identified as either unbiased responders or biased responders as determined by performance on the TOMM. Based on Mann-Whitney U results, the two groups differed significantly on all attentional measures. Classification accuracy of the BTA (.83), CPT-II omission errors (OE; .76) and TMT B (.75) were acceptable; however, classification accuracy of CPT-II commission errors (CE; .64) and TMT A (.62) were poor. When variables were combined in different combinations, sensitivity did not significantly increase. Results indicated for optimal cut-off scores, sensitivity ranged from 48% to 64% when specificity was at least 85%. Given that sensitivity rates were not adequate, there remains a need to utilize highly sensitive measures in addition to these embedded measures. Results were discussed within the context of research promoting the need for multiple measures of cognitive effort.
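
    The cut-off analysis described above (maximizing sensitivity while holding specificity at 85% or more) can be sketched in a few lines of Python. This is only an illustration: the scores are hypothetical, the rule that lower scores indicate biased responding is an assumption, and the function names are not taken from the study.

```python
def sensitivity_specificity(biased, unbiased, cutoff):
    """Flag a score as 'biased responding' when it is at or below the cutoff."""
    true_pos = sum(score <= cutoff for score in biased)
    true_neg = sum(score > cutoff for score in unbiased)
    return true_pos / len(biased), true_neg / len(unbiased)


def best_cutoff(biased, unbiased, min_specificity=0.85):
    """Among cutoffs meeting the specificity floor, return the most sensitive one."""
    best = None
    for cutoff in sorted(set(biased) | set(unbiased)):
        sens, spec = sensitivity_specificity(biased, unbiased, cutoff)
        if spec >= min_specificity and (best is None or sens > best[1]):
            best = (cutoff, sens, spec)
    return best


# Hypothetical test scores (lower = worse performance); not data from the study.
biased_scores = [8, 10, 11, 13, 15, 16]
unbiased_scores = [12, 14, 16, 17, 17, 18, 18, 19, 19, 20]

cutoff, sens, spec = best_cutoff(biased_scores, unbiased_scores)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Fixing the specificity floor first and then maximizing sensitivity mirrors how the abstract reports its results; a different trade-off would need a different selection rule.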

  9. Research on the classification result and accuracy of building windows in high resolution satellite images: take the typical rural buildings in Guangxi, China, as an example

    Science.gov (United States)

    Li, Baishou; Gao, Yujiu

    2015-12-01

    Information extracted from high spatial resolution remote sensing images has become an important data source for updating large-scale GIS spatial databases. Because of the large volume of regional high spatial resolution satellite image data, monitoring building information with high resolution remote sensing, extracting small-scale building information, and analyzing its quality have become important preconditions for applying high-resolution satellite image information. In this paper, a clustering-based segmentation and classification evaluation method for high resolution satellite images of typical rural buildings is proposed, based on the traditional KMeans clustering algorithm. Separability and building density factors were used to describe the image classification characteristics of the clustering window. The sensitivity of the clustering result to these factors was studied from the perspective of the spectral separability between target and background within the image. The study showed that the number of samples is an important factor influencing clustering accuracy and performance; that the pixel ratio of the objects in the image and the separation factor can be used to determine the specific impact of cluster-window subsets on clustering accuracy; and that the count of window target pixels (Nw) does not by itself affect clustering accuracy. The results can provide an effective reference for quality assessment of the segmentation and classification of high spatial resolution remote sensing images.

  10. Penerapan Support Vector Machine (SVM) untuk Pengkategorian Penelitian

    Directory of Open Access Journals (Sweden)

    Fithri Selva Jumeilah

    2017-07-01

    Research output at every college continues to grow and is stored in both softcopy and hardcopy. This research should be categorized to make searching easier for people who need references. Categorizing research requires a text mining method, one of which is the Support Vector Machine (SVM). To learn the characteristics of each category, secondary data consisting of a collection of research abstracts is used. The data are pre-processed in several stages: case folding (converting all letters to lowercase), stop-word removal (removing very common words), tokenizing (discarding punctuation), and stemming (finding root words by removing prefixes and suffixes). The pre-processed data are then converted into numerical form in the term weighting stage, which weights the contribution of each word. The result of term weighting is data that can be used as training data and test data. Training is done by providing text data whose class or category is known as input. The Support Vector Machine algorithm then transforms the input data into a rule, function, or knowledge model that can be used for prediction. The results of this study show that the categorization of research produced by SVM is very good, as shown by a test accuracy of 90%.
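
    The pre-processing and term weighting stages described in this abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions: the stop-word list, the suffix-only stemmer, the English sample texts, and the tf * log(N / df) weighting are all hypothetical choices, since the abstract does not specify the ones the author used.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and suffix list, chosen only for illustration.
STOP_WORDS = {"the", "of", "and", "a", "is", "for", "in", "to"}
SUFFIXES = ("ing", "es", "s")


def preprocess(text):
    """Case folding, tokenizing (dropping punctuation), stop-word removal, crude stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    stemmed = []
    for t in tokens:
        for suffix in SUFFIXES:
            # Strip at most one suffix, and only if a stem of 3+ letters remains.
            if t.endswith(suffix) and len(t) - len(suffix) >= 3:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical two-document corpus standing in for research abstracts.
abstracts = [
    "Mining the text of research abstracts.",
    "Classification of networks and text.",
]
vectors = tf_idf(abstracts)
```

    A term that occurs in every document (here "text") receives weight zero, which is why term weighting helps a classifier focus on category-specific vocabulary. The resulting vectors would then be passed to an SVM trainer, which is outside the scope of this sketch.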

  11. Forecasting Dry Bulk Freight Index with Improved SVM

    Directory of Open Access Journals (Sweden)

    Qianqian Han

    2014-01-01

    An improved SVM model is presented in this paper to forecast the dry bulk freight index (BDI), which is a powerful tool for operators and investors to manage market trends and avoid price risk in the shipping industry. The BDI is influenced by many factors, especially random incidents in the dry bulk market, which make it difficult to forecast. Therefore, to eliminate the impact of these random incidents, the wavelet transform is adopted to denoise the BDI data series, and a combined model of wavelet transform and support vector machine is developed to forecast the BDI. The BDI data from 2005 to 2012 are used to test the proposed model. The first 84 consecutive monthly BDI values are the inputs of the model, and the last 12 monthly BDI values are the outputs of the model. The parameters of the model are optimized by a genetic algorithm, and the final model is obtained through SVM training. The forecasting results of the proposed method are compared with those of three other forecasting methods. The results show that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.
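
    The two data-handling steps in this abstract, wavelet denoising and an 84/12 chronological split of 96 monthly values, can be sketched in Python. The abstract does not name the wavelet, the decomposition depth, or the thresholding rule, so the one-level Haar transform with soft thresholding below is an assumption, as is the synthetic series; the SVM and genetic algorithm stages are not shown.

```python
import math


def haar_denoise(series, threshold):
    """One-level Haar transform, soft-threshold the detail coefficients, invert."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    root2 = math.sqrt(2.0)
    out = []
    for i in range(0, len(series), 2):
        approx = (series[i] + series[i + 1]) / root2
        detail = (series[i] - series[i + 1]) / root2
        # Soft thresholding shrinks small fluctuations toward zero.
        shrunk = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out.append((approx + shrunk) / root2)
        out.append((approx - shrunk) / root2)
    return out


def chronological_split(series, n_test):
    """Keep time order: everything before the last n_test points is training data."""
    return series[:-n_test], series[-n_test:]


# Hypothetical stand-in for 96 monthly index values (2005 to 2012).
monthly = [1000.0 + 50.0 * math.sin(m / 6.0) + (30.0 if m % 2 else -30.0) for m in range(96)]

smoothed = haar_denoise(monthly, threshold=20.0)
train, test = chronological_split(smoothed, n_test=12)
```

    Splitting by position rather than at random keeps the last 12 months strictly in the future of the training data, which is what a forecasting evaluation requires.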

  12. Hyperspectral Image Enhancement and Mixture Deep-Learning Classification of Corneal Epithelium Injuries

    Science.gov (United States)

    Md Noor, Siti Salwa; Michael, Kaleena; Marshall, Stephen; Ren, Jinchang

    2017-01-01

    In our preliminary study, the reflectance signatures obtained from hyperspectral imaging (HSI) of normal and abnormal corneal epithelium tissues of porcine show similar morphology with subtle differences. Here we present image enhancement algorithms that can be used to improve the interpretability of data into clinically relevant information to facilitate diagnostics. A total of 25 corneal epithelium images without the application of eye staining were used. Three image feature extraction approaches were applied for image classification: (i) image feature classification from histogram using a support vector machine with a Gaussian radial basis function (SVM-GRBF); (ii) physical image feature classification using deep-learning Convolutional Neural Networks (CNNs) only; and (iii) the combined classification of CNNs and SVM-Linear. The performance results indicate that our chosen image features from the histogram and length-scale parameter were able to classify with up to 100% accuracy; particularly, at CNNs and CNNs-SVM, by employing 80% of the data sample for training and 20% for testing. Thus, in the assessment of corneal epithelium injuries, HSI has high potential as a method that could surpass current technologies regarding speed, objectivity, and reliability. PMID:29144388

  13. DETERMINATION OF OPTIMUM CLASSIFICATION SYSTEM FOR HYPERSPECTRAL IMAGERY AND LIDAR DATA BASED ON BEES ALGORITHM

    Directory of Open Access Journals (Sweden)

    F. Samadzadega

    2015-12-01

    Hyperspectral imagery is a rich source of spectral information and plays a very important role in discriminating similar land-cover classes. In the past, several approaches have been investigated to improve hyperspectral imagery classification. Recently, interest in the joint use of LiDAR data and hyperspectral imagery has increased remarkably, because LiDAR can provide structural information about a scene while hyperspectral imagery provides spectral and spatial information. The complementary information of LiDAR and hyperspectral data may greatly improve classification performance, especially in complex urban areas. In this paper, feature-level fusion of hyperspectral and LiDAR data is proposed, where spectral and structural features are extracted from both datasets and a hybrid feature space is then generated by feature stacking. A Support Vector Machine (SVM) classifier is applied to the hybrid feature space to classify the urban area. To optimize classification performance, two issues should be considered: determination of the SVM parameter values and feature subset selection. The Bees Algorithm (BA) is a powerful meta-heuristic optimization algorithm, which is applied to determine the optimum SVM parameters and select the optimum feature subset simultaneously. The results show that the proposed method can improve classification accuracy while significantly reducing the dimension of the feature space.

  14. A fuzzy based feature selection from independent component subspace for machine learning classification of microarray data

    Directory of Open Access Journals (Sweden)

    Rabia Aziz

    2016-06-01

    Feature (gene) selection and classification of microarray data are two of the most interesting machine learning challenges. In the present work, two existing feature selection/extraction algorithms, namely independent component analysis (ICA) and fuzzy backward feature elimination (FBFE), are used together, which is a new combination of selection and extraction. The main objective of this paper is to select the independent components of the DNA microarray data using FBFE to improve the performance of support vector machine (SVM) and Naïve Bayes (NB) classifiers, while keeping the computational expense affordable. To show the validity of the proposed method, it is applied to reduce the number of genes for five DNA microarray datasets, namely colon cancer, acute leukemia, prostate cancer, lung cancer II, and high-grade glioma. These datasets are then classified using SVM and NB classifiers. Experimental results on these five microarray datasets demonstrate that the genes selected by the proposed approach effectively improve the performance of SVM and NB classifiers in terms of classification accuracy. We compare our proposed method with principal component analysis (PCA) as a standard extraction algorithm and find that the proposed method obtains better classification accuracy, using SVM and NB classifiers, with a smaller number of selected genes than PCA. For each dataset, the curve of average error rate against number of genes indicates the number of genes required for the highest accuracy with our proposed method for both classifiers. ROC analysis shows the best subset of genes for both classifiers on the different datasets with the proposed method.

  15. [Epileptic EEG signal classification based on wavelet packet transform and multivariate multiscale entropy].

    Science.gov (United States)

    Xu, Yonghong; Li, Xingxing; Zhao, Yong

    2013-10-01

    In this paper, a new method combining wavelet packet transform and multivariate multiscale entropy for the classification of epilepsy EEG signals is introduced. Firstly, the original EEG signals are decomposed at multi-scales with the wavelet packet transform, and the wavelet packet coefficients of the required frequency bands are extracted. Secondly, the wavelet packet coefficients are processed with multivariate multiscale entropy algorithm. Finally, the EEG data are classified by support vector machines (SVM). The experimental results on the international public Bonn epilepsy EEG dataset show that the proposed method can efficiently extract epileptic features and the accuracy of classification result is satisfactory.

  16. Towards a multimodal brain-computer interface: combining fNIRS and fTCD measurements to enable higher classification accuracy.

    Science.gov (United States)

    Faress, Ahmed; Chau, Tom

    2013-08-15

    Previous brain-computer interface (BCI) research has largely focused on single neuroimaging modalities such as near-infrared spectroscopy (NIRS) or transcranial Doppler ultrasonography (TCD). However, multimodal brain-computer interfaces, which combine signals from different brain modalities, have been suggested as a potential means of improving the accuracy of BCI systems. In this paper, we compare the classification accuracies attainable using NIRS signals alone, TCD signals alone, and a combination of NIRS and TCD signals. Nine able-bodied subjects (mean age=25.7) were recruited and simultaneous measurements were made with NIRS and TCD instruments while participants were prompted to perform a verbal fluency task or to remain at rest, within the context of a block-stimulus paradigm. Using Linear Discriminant Analysis, the verbal fluency task was classified at mean accuracies of 76.1±9.9%, 79.4±10.3%, and 86.5±6.0% using NIRS, TCD, and NIRS-TCD systems respectively. In five of nine participants, classification accuracies with the NIRS-TCD system were significantly higher (p ... brain-computer interfaces. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Automatic schizophrenic discrimination on fNIRS by using complex brain network analysis and SVM.

    Science.gov (United States)

    Song, Hong; Chen, Lei; Gao, RuiQi; Bogdan, Iordachescu Ilie Mihaita; Yang, Jian; Wang, Shuliang; Dong, Wentian; Quan, Wenxiang; Dang, Weimin; Yu, Xin

    2017-12-20

    Schizophrenia is a serious mental illness. Owing to the lack of supporting objective physiological data and of a unified data analysis method, doctors can only rely on subjective experience to distinguish healthy people from patients, which easily leads to misdiagnosis. In recent years, functional Near-Infrared Spectroscopy (fNIRS) has been widely used in clinical diagnosis; it obtains the hemoglobin concentration from variations in optical intensity. First, prefrontal brain networks were constructed based on oxy-Hb signals from 52-channel fNIRS data of schizophrenia patients and healthy controls. Then, Complex Brain Network Analysis (CBNA) was used to extract features from the prefrontal brain networks. Finally, a classifier based on Support Vector Machine (SVM) was designed and trained to discriminate schizophrenia patients from healthy controls. We recruited a sample of 34 healthy controls and 42 schizophrenia patients to perform a one-back memory task. The hemoglobin response was measured in the prefrontal cortex during the task using a 52-channel fNIRS system. The experimental results indicate that the proposed method achieves satisfactory classification, with an overall accuracy of 85.5% (92.8% for schizophrenia samples and 76.5% for healthy controls). Our results suggest that, using an appropriate classification method, fNIRS has the potential to be an effective objective biomarker for the diagnosis of schizophrenia.
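
    The three accuracy figures in this abstract are related by simple arithmetic: overall accuracy is the sample-weighted mean of the per-class accuracies. The short Python check below uses only the group sizes and percentages reported in the abstract; the function name is illustrative, and the balanced (unweighted) accuracy is added here for comparison and is not a figure from the paper.

```python
# Group sizes and per-class accuracies as reported in the abstract.
n_patients, n_controls = 42, 34
acc_patients, acc_controls = 0.928, 0.765


def overall_accuracy(sizes, accuracies):
    """Sample-weighted mean of per-class accuracies."""
    correct = sum(n * a for n, a in zip(sizes, accuracies))
    return correct / sum(sizes)


overall = overall_accuracy([n_patients, n_controls], [acc_patients, acc_controls])
balanced = (acc_patients + acc_controls) / 2  # unweighted mean, for comparison
print(f"overall = {overall:.3f}, balanced = {balanced:.3f}")
```

    Because the two groups differ in size, the overall figure sits closer to the accuracy of the larger (patient) group than the unweighted mean does.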

  18. Robust automated detection of microstructural white matter degeneration in Alzheimer's disease using machine learning classification of multicenter DTI data.

    Directory of Open Access Journals (Sweden)

    Martin Dyrba

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that

  19. Robust automated detection of microstructural white matter degeneration in Alzheimer's disease using machine learning classification of multicenter DTI data.

    Science.gov (United States)

    Dyrba, Martin; Ewers, Michael; Wegrzyn, Martin; Kilimann, Ingo; Plant, Claudia; Oswald, Annahita; Meindl, Thomas; Pievani, Michela; Bokde, Arun L W; Fellgiebel, Andreas; Filippi, Massimo; Hampel, Harald; Klöppel, Stefan; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J

    2013-01-01

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was
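
    The scanner-specific cross-validation described in this abstract holds out all data from one scanner per fold. A minimal Python sketch of that splitting scheme follows. The toy data and function names are hypothetical, and the per-scanner mean subtraction is only one plausible reading of "mean correction"; the abstract does not define the procedure the authors used.

```python
from collections import defaultdict


def leave_one_scanner_out(scanner_ids):
    """Yield (held_out_scanner, train_indices, test_indices) for each scanner."""
    by_scanner = defaultdict(list)
    for index, scanner in enumerate(scanner_ids):
        by_scanner[scanner].append(index)
    for scanner, test_indices in by_scanner.items():
        train_indices = [i for i, s in enumerate(scanner_ids) if s != scanner]
        yield scanner, train_indices, test_indices


def mean_correct(values, scanner_ids):
    """Subtract each scanner's own mean from its measurements (assumed correction)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, scanner in zip(values, scanner_ids):
        sums[scanner] += value
        counts[scanner] += 1
    return [v - sums[s] / counts[s] for v, s in zip(values, scanner_ids)]


# Hypothetical toy data: one feature value per subject and the scanner used.
scanners = ["A", "A", "B", "B", "C"]
feature = [0.50, 0.40, 0.70, 0.60, 0.55]

folds = list(leave_one_scanner_out(scanners))
corrected = mean_correct(feature, scanners)
```

    Each fold tests on a scanner the classifier never saw during training, which is the situation a new clinical site would face; pooled cross-validation cannot measure that.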

  20. Assessing the accuracy of acoustic seabed classification for mapping coral reef environments in South Florida (Broward County, USA)

    Directory of Open Access Journals (Sweden)

    Ryan P Moyer

    2005-05-01

    The Atlantic coast of Broward County, Florida (USA) is paralleled by a series of progressively deeper, shore-parallel coral reef communities. Two of these reef systems are drowned early Holocene coral reefs of 5 ky and 7 ky uncorrected radiocarbon age. Despite the ease of access to these reefs, and their major contribution to the local economy, accurate benthic habitat maps of the area are not available. Ecological studies have shown that different benthic communities (i.e. communities composed of different biological taxa) exist along several spatial gradients on all reefs. Since these studies are limited by time and spatial extent, acoustic surveys with the QTCView V bottom classification system based on a 50 kHz transducer were used as an alternative method of producing habitat maps. From the acoustic data of a 3.1 km² survey area, spatial prediction maps were created for the area. These were compared with habitat maps interpreted from in situ data and Laser Airborne Depth Sounder (LADS) bathymetry, in order to ground-truth the remotely sensed data. An error matrix was used to quantitatively determine the accuracy of the acoustically derived spatial prediction model against the maps derived from the in situ and LADS data sets. Confusion analysis of 100 random points showed that the system was able to distinguish areas of reef from areas of rubble and sand with an overall accuracy of 61%. When asked to detect more subtle spatial differences, for example, those between distinct reef communities, the classification was only about 40% accurate. We discuss to what degree a synthesis of acoustic and in situ techniques can provide accurate habitat maps in coral reef environments, and conclude that acoustic methods were able to reflect the spatial extent and composition of at least three different biological communities.

  1. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    Science.gov (United States)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such an SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments show that the proposed SVM tree achieves good performance in sports video classification.
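
    The binary tree structure described in this abstract, with one binary learner per internal node and one output class per external node, can be sketched in Python. This is an illustration only: the linear decision functions stand in for trained SVMs, and the features, thresholds, and sport labels are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    label: str  # external node: the classified output type


@dataclass
class Node:
    # Internal node: a binary decision function standing in for one trained SVM.
    # A non-negative score routes the sample to the right child, otherwise left.
    decide: Callable[[Sequence[float]], float]
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree, features):
    """Walk from the root to a leaf, making one binary decision per internal node."""
    while isinstance(tree, Node):
        tree = tree.right if tree.decide(features) >= 0 else tree.left
    return tree.label


# Hypothetical linear decision functions w.x + b over two made-up features.
root = Node(
    decide=lambda x: 1.0 * x[0] - 0.5,
    left=Leaf("swimming"),
    right=Node(
        decide=lambda x: 1.0 * x[1] - 0.5,
        left=Leaf("football"),
        right=Leaf("tennis"),
    ),
)

print(classify(root, [0.8, 0.9]))
```

    Because `decide` is just a callable, any node could be swapped for a different learner, which reflects the flexibility the abstract lists as its third advantage. A sample reaches a leaf after at most as many evaluations as the tree is deep, rather than one evaluation per class pair.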

  2. A critical appraisal of the accuracy of the RIFLE and AKIN classifications in defining "acute kidney insufficiency" in critically ill patients.

    Science.gov (United States)

    Valette, Xavier; du Cheyron, Damien

    2013-04-01

    The lack of a consensus definition for acute kidney injury (AKI) has led to a great deal of discrepancies and confusion in the literature in this field. Thus, the RIFLE (Risk of renal dysfunction, Injury to the kidney, Failure of kidney function, Loss of kidney function and End-stage renal disease) and Acute Kidney Injury Network (AKIN) classifications were developed by multidisciplinary collaborative groups and were validated by experts in an international consensus conference in 2007 under an umbrella "acute kidney insufficiency" definition. Search in the MEDLINE and PUBMED databases for relevant literature from January 2000 to June 2011 was performed to assess the accuracy of the novel consensus definitions for AKI. Both systems are based on serum creatinine level and urine output criteria and are staged in 3 severity levels. A major difference between these 2 classifications is that smaller and more rapid changes in serum creatinine are considered in the AKIN stage 1. Each AKI classification has demonstrated its ability to stratify patients according to their AKI severity and to predict outcomes. No classification system has been shown to be superior over the others. Their application in clinical studies would benefit from standardization and the new Kidney Disease Improving Global Outcomes definition of AKI was recently proposed to achieve this aim. Because these classifications do not allow earlier AKI diagnosis and do not optimize the timing of RRT initiation, they remain of moderate utility from the patient's point of view. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. Accuracy assessment of land cover/land use classifiers in dry and humid areas of Iran.

    Science.gov (United States)

    Yousefi, Saleh; Khatami, Reza; Mountrakis, Giorgos; Mirzaee, Somayeh; Pourghasemi, Hamid Reza; Tazeh, Mehdi

    2015-10-01

    Land cover/land use (LCLU) maps are essential inputs for environmental analysis. Remote sensing provides an opportunity to construct LCLU maps of large geographic areas in a timely fashion. Environmental managers need to know which classification method produces the most accurate LCLU maps for given site characteristics. The aim of this research is to examine the performance of various classification algorithms for LCLU mapping in dry and humid climates (from June to August). Testing is performed in three case studies from each of the two climates in Iran. The reference dataset of each image was randomly selected from the entire image and was randomly divided into training and validation sets. Training sets included 400 pixels, and validation sets included 200 pixels, of each LCLU class. Results indicate that the support vector machine (SVM) and neural network methods can achieve higher overall accuracy (86.7 and 86.6%) than the other examined algorithms, with a slight advantage for the SVM. Dry areas exhibit higher classification difficulty as man-made features often have spectral responses that overlap with those of soil. A further observation is that spatial segregation and lower mixture of LCLU classes can increase overall classification accuracy.
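
The per-class random split described in the abstract above (400 training and 200 validation pixels for each LCLU class) could look like the following sketch. The class names, the pixel identifiers and the fixed seed are hypothetical; the abstract does not say how the sampling was implemented.

```python
import random


def split_per_class(pixels_by_class, n_train=400, n_valid=200, seed=0):
    """Randomly draw disjoint training and validation pixels for every class."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    train, valid = {}, {}
    for label, pixels in pixels_by_class.items():
        needed = n_train + n_valid
        if len(pixels) < needed:
            raise ValueError(f"class {label!r} has fewer than {needed} reference pixels")
        chosen = rng.sample(pixels, needed)  # sampling without replacement
        train[label] = chosen[:n_train]
        valid[label] = chosen[n_train:]
    return train, valid


# Hypothetical reference data: pixel ids grouped by class label.
reference = {"water": list(range(1000)), "urban": list(range(1000, 2000))}
train, valid = split_per_class(reference)
print(len(train["water"]), len(valid["water"]))
```

Drawing both subsets from a single sample without replacement guarantees that no pixel is used for both training and validation.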

  4. Automatic vehicle classification using linked visual words

    Science.gov (United States)

    Watcharapinchai, Nattachai; Aramvith, Supavadee; Siddhichai, Supakorn

    2017-07-01

    An improvement in the method of automatic vehicle classification is investigated. The challenges are to correctly classify vehicles regardless of changes in illumination, differences in the camera's point of view, and variations in the types of vehicles. Our proposed appearance-based feature extraction algorithm is called linked visual words (LVWs) and is based on the existing bag-of-visual-words (BoVW) technique with the addition of spatial information to improve classification accuracy. In addition, to prevent over-fitting due to a large number of LVWs, four common sampling techniques with LVWs are investigated. Our results suggest that sampling LVWs using TF-IDF with grouping improves classification accuracy on the test dataset. In summary, the proposed system is able to classify nine types of vehicles and to work with surveillance cameras in real-world scenarios. The classification accuracy of the proposed system is 5.58% and 4.27% higher on average for three datasets when compared with BoVW + SVM and Lenet-5, respectively.

  5. A Classification Method for Seed Viability Assessment with Infrared Thermography

    Directory of Open Access Journals (Sweden)

    Sen Men

    2017-04-01

    Full Text Available This paper presents a viability assessment method for Pisum sativum L. seeds based on the infrared thermography technique. In this work, different artificial treatments were conducted to prepare seed samples with different viability. Thermal images and visible images were recorded every five minutes during the standard five-day germination test. After the test, the root length of each sample was measured and used as the viability index of that seed. Each individual seed area in the visible images was segmented with an edge detection method, and the average temperature of the corresponding area in the infrared images was calculated as the representative temperature for this seed at that time. The temperature curve of each seed during germination was plotted. Thirteen characteristic parameters extracted from the temperature curve were analyzed to show the difference in temperature fluctuations between seed samples with different viability. With these parameters, a support vector machine (SVM) was used to classify the seed samples into three categories (viable, aged and dead) according to the root length; the classification accuracy rate was 95%. On this basis, using the temperature data of only the first three hours of germination, another SVM model was proposed to classify the seed samples, and the accuracy rate was about 91.67%. These experimental results show that infrared thermography, combined with the SVM algorithm, can be applied to the prediction of seed viability.

  6. Pre-cancer risk assessment in habitual smokers from DIC images of oral exfoliative cells using active contour and SVM analysis.

    Science.gov (United States)

    Dey, Susmita; Sarkar, Ripon; Chatterjee, Kabita; Datta, Pallab; Barui, Ananya; Maity, Santi P

    2017-04-01

    Habitual smokers are known to be at higher risk of developing oral cancer, which is increasing at an alarming rate globally. Conventionally, oral cancer is associated with high mortality rates, although recent reports show improved survival outcomes with early diagnosis of the disease. An effective prediction system that can identify the probability of cancer development among habitual smokers is thus expected to benefit a sizable population. The present work describes a non-invasive, integrated method for early detection of cellular abnormalities based on analysis of different cyto-morphological features of exfoliative oral epithelial cells. Differential interference contrast (DIC) microscopy is a potential optical tool, as this mode provides a pseudo three-dimensional (3-D) image with detailed morphological and textural features obtained from non-invasive, label-free epithelial cells. For segmentation of DIC images, a gradient vector flow snake model active contour process has been adopted. To evaluate cellular abnormalities among habitual smokers, the selected morphological and textural features of epithelial cells are compared with those of the non-smoker group (-ve control group) and of clinically diagnosed pre-cancer patients (+ve control group) using a support vector machine (SVM) classifier. The accuracy of the developed SVM-based classification has been found to be 86%, with 80% sensitivity and 89% specificity, in classifying the features from the volunteers having a smoking habit. Copyright © 2017 Elsevier Ltd. All rights reserved.
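
Accuracy, sensitivity and specificity, as reported in the abstract above, are all derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and were chosen only so that the ratios come out at 86%, 80% and 89%; they are not the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts: 50 positive and 100 negative samples.
acc, sens, spec = binary_metrics(tp=40, fn=10, tn=89, fp=11)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Accuracy depends on the class balance, so it should be read together with sensitivity and specificity rather than on its own.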

  7. Abnormal Gait Behavior Detection for Elderly Based on Enhanced Wigner-Ville Analysis and Cloud Incremental SVM Learning

    Directory of Open Access Journals (Sweden)

    Jian Luo

    2016-01-01

    Full Text Available A cloud-based health care system for the elderly is proposed in this paper, providing abnormal gait behavior detection, classification, online diagnosis, and remote aid service. Intelligent mobile terminals with embedded triaxial acceleration sensors are used to capture the movement and ambulation information of the elderly. The collected signals are first enhanced by a Kalman filter. The signal vector magnitude features are then extracted and decomposed into a linear combination of enhanced Gabor atoms. The Wigner-Ville analysis method is introduced and the problem is studied by joint time-frequency analysis. To address the lack of large-scale abnormal behavior data in the training process, a cloud-based incremental SVM (CI-SVM) learning method is proposed. The original abnormal behavior data are first used to obtain the initial SVM classifier. The larger set of abnormal behavior data collected from the elderly by mobile devices is then gathered on the cloud platform to conduct incremental training and obtain the new SVM classifier. With the CI-SVM learning method, the knowledge of the SVM classifier can be accumulated through dynamic incremental learning. Experimental results demonstrate that the proposed method is feasible and can be applied to aged care, emergency aid, and related fields.

  8. Application of texture analysis method for mammogram density classification

    Science.gov (United States)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  9. Segmentation-Based PolSAR Image Classification Using Visual Features: RHLBP and Color Features

    Directory of Open Access Journals (Sweden)

    Jian Cheng

    2015-05-01

    Full Text Available A segmentation-based fully-polarimetric synthetic aperture radar (PolSAR) image classification method that incorporates texture features and color features is designed and implemented. This method is based on a framework that jointly uses statistical region merging (SRM) for segmentation and support vector machine (SVM) for classification. In the segmentation step, we propose an improved local binary pattern (LBP) operator named the regional homogeneity local binary pattern (RHLBP) to guarantee the regional homogeneity in PolSAR images. In the classification step, the color features extracted from false color images are applied to improve the classification accuracy. The RHLBP operator and color features can provide discriminative information to separate those pixels and regions with similar polarimetric features, which are from different classes. Extensive experimental comparison results with conventional methods on L-band PolSAR data demonstrate the effectiveness of our proposed method for PolSAR image classification.

  10. A SVM framework for fault detection of the braking system in a high speed train

    Science.gov (United States)

    Liu, Jie; Li, Yan-Fu; Zio, Enrico

    2017-03-01

    By April 2015, the number of operating High Speed Trains (HSTs) in the world had reached 3603. An efficient, effective and very reliable braking system is evidently critical for trains running at a speed of around 300 km/h. Failure of a highly reliable braking system is a rare event and, consequently, informative recorded data on fault conditions are scarce. This renders the fault detection problem a classification problem with highly unbalanced data. In this paper, a Support Vector Machine (SVM) framework, including feature selection, feature vector selection, model construction and decision boundary optimization, is proposed for tackling this problem. Feature vector selection can largely reduce the data size and, thus, the computational burden. The constructed model is a modified version of the least squares SVM, in which a higher cost is assigned to classification errors on faulty conditions than to classification errors on normal conditions. The proposed framework is successfully validated on a number of public unbalanced datasets. Then, it is applied to the fault detection of braking systems in HSTs: in comparison with several SVM approaches for unbalanced datasets, the proposed framework gives better results.
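
The idea of assigning a higher cost to errors on the rare faulty class can be illustrated with a class-weighted squared-error term, the kind of term a least squares SVM minimizes. This is a minimal sketch, not the paper's formulation: the cost values, the label coding (+1 faulty, -1 normal) and the function name are assumptions.

```python
def weighted_squared_error(errors, labels, cost_fault=10.0, cost_normal=1.0):
    """Sum of squared errors, weighted more heavily for the faulty class.

    errors: per-sample residuals e_i of the classifier output
    labels: +1 for faulty conditions, -1 for normal conditions (assumed coding)
    """
    total = 0.0
    for e, y in zip(errors, labels):
        cost = cost_fault if y == 1 else cost_normal
        total += cost * e * e
    return total


# The same residual costs ten times more on a faulty sample than on a normal one.
print(weighted_squared_error([0.5, 0.5], [1, -1]))
```

With unbalanced data, an unweighted error term is dominated by the majority class. Raising the cost on the minority class pushes the decision boundary away from the faulty samples.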

  11. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images

    Directory of Open Access Journals (Sweden)

    Yongzheng Xu

    2016-08-01

    Full Text Available A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles’ in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which integrates the V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need for image registration or an additional road database, it has great potential for field applications. Future research will focus on extending the current method to detect other transportation modes such as buses, trucks, motors, bicycles, and pedestrians.

  12. DTI based diagnostic prediction of a disease via pattern classification.

    Science.gov (United States)

    Ingalhalikar, Madhura; Kanterakis, Stathis; Gur, Ruben; Roberts, Timothy P L; Verma, Ragini

    2010-01-01

    The paper presents a method of creating abnormality classifiers learned from Diffusion Tensor Imaging (DTI) data of a population of patients and controls. The score produced by the classifier can be used to aid in diagnosis as it quantifies the degree of pathology. Using anatomically meaningful features computed from the DTI data we train a non-linear support vector machine (SVM) pattern classifier. The method begins with high dimensional elastic registration of DT images followed by a feature extraction step that involves creating a feature by concatenating average anisotropy and diffusivity values in anatomically meaningful regions. Feature selection is performed via a mutual information based technique followed by sequential elimination of the features. A non-linear SVM classifier is then constructed by training on the selected features. The classifier assigns each test subject with a probabilistic abnormality score that indicates the extent of pathology. In this study, abnormality classifiers were created for two populations; one consisting of schizophrenia patients (SCZ) and the other with individuals with autism spectrum disorder (ASD). A clear distinction between the SCZ patients and controls was achieved with 90.62% accuracy while for individuals with ASD, 89.58% classification accuracy was obtained. The abnormality scores clearly separate the groups and the high classification accuracy indicates the prospect of using the scores as a diagnostic and prognostic marker.

  13. Development and evaluation of an automated histology classification system for veterinary pathology.

    Science.gov (United States)

    Hattel, Arthur; Monga, Vishal; Srinivas, Umamahesh; Gillespie, Jim; Brooks, Jason; Fisher, Jenny; Jayarao, Bhushan

    2013-11-01

    A 2-stage algorithmic framework was developed to automatically classify digitized photomicrographs of tissues obtained from bovine liver, lung, spleen, and kidney into different histologic categories. The categories included normal tissue, acute necrosis, and inflammation (acute suppurative; chronic). In the current study, a total of 60 images per category (normal; acute necrosis; acute suppurative inflammation) were obtained from liver samples, 60 images per category (normal; acute suppurative inflammation) were obtained from spleen and lung samples, and 60 images per category (normal; chronic inflammation) were obtained from kidney samples. An automated support vector machine (SVM) classifier was trained to assign each test image to a specific category. Using 10 training images/category/organ, 40 test images/category/organ were examined. Employing confusion matrices to represent category-specific classification accuracy, the classifier-attained accuracies were found to be in the 74-90% range. The same set of test images was evaluated using a SVM classifier trained on 20 images/category/organ. The average classification accuracies were noted to be in the 84-95% range. The accuracy in correctly identifying normal tissue and specific tissue lesions was markedly improved by a small increase in the number of training images. The preliminary results from the study indicate the importance and potential use of automated image classification systems in the histologic identification of normal tissues and specific tissue lesions.

  14. The Effects of Q-Matrix Design on Classification Accuracy in the Log-Linear Cognitive Diagnosis Model

    Science.gov (United States)

    Madison, Matthew J.; Bradshaw, Laine P.

    2015-01-01

    Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…

  15. The Effects of Q-Matrix Misspecification on Parameter Estimates and Classification Accuracy in the DINA Model

    Science.gov (United States)

    Rupp, Andre A.; Templin, Jonathan

    2008-01-01

    This article reports a study that investigated the effects of Q-matrix misspecifications on parameter estimates and misclassification rates for the deterministic-input, noisy "and" gate (DINA) model, which is a restricted latent class model for multiple classifications of respondents that can be useful for cognitively motivated diagnostic…

  16. Analyzing the diagnostic accuracy of the causes of spinal pain at neurology hospital in accordance with the International Classification of Diseases

    Directory of Open Access Journals (Sweden)

    I. G. Mikhailyuk

    2014-01-01

    Full Text Available Spinal pain is of great socioeconomic significance as it is widely prevalent and a common cause of disability. However, the diagnosis of its true causes frequently leads to problems. A study has been conducted to evaluate the accuracy of a clinical diagnosis and its coding in conformity with the International Classification of Diseases. The diagnosis of vertebral osteochondrosis and the hypodiagnosis of nonspecific and nonvertebrogenic pain syndromes have been found to be unreasonably widely used. Ways to solve these problems have been proposed, by applying approaches to diagnosing the causes of spinal pain in accordance with international practice.

  17. Real-Time Subject-Independent Pattern Classification of Overt and Covert Movements from fNIRS Signals.

    Directory of Open Access Journals (Sweden)

    Neethu Robinson

    Full Text Available Recently, studies have reported the use of Near Infrared Spectroscopy (NIRS) for developing Brain-Computer Interfaces (BCI) by applying online pattern classification of brain states from subject-specific fNIRS signals. The purpose of the present study was to develop and test a real-time method for subject-specific and subject-independent classification of multi-channel fNIRS signals using support-vector machines (SVM), so as to determine its feasibility as an online neurofeedback system. Towards this goal, we used left versus right hand movement execution and movement imagery as study paradigms in a series of experiments. In the first two experiments, activations in the motor cortex during movement execution and movement imagery were used to develop subject-dependent models that obtained high classification accuracies, thereby indicating the robustness of our classification method. In the third experiment, a generalized classifier model was developed from the data of the first two experiments and was then applied for subject-independent neurofeedback training. Application of this method in new participants showed a mean classification accuracy of 63% for movement imagery tasks and 80% for movement execution tasks. These results, and the corresponding offline analysis reported in this study, demonstrate that SVM-based real-time subject-independent classification of fNIRS signals is feasible. This method has important applications in the field of hemodynamic BCIs and in neuro-rehabilitation, where patients can be trained to learn spatio-temporal patterns of healthy brain activity.

  18. CLASSIFICATION OF CERVICAL CANCER CELLS IN PAP SMEAR SCREENING TEST

    Directory of Open Access Journals (Sweden)

    S. Athinarayanan

    2016-05-01

    Full Text Available Cervical cancer is the second most common cancer among women, but it is also curable. Regular smear tests can reveal signs of precancerous cells so that the patient can be treated according to the result. However, detection errors can be caused by smear thickness, cell overlapping, unwanted particles in the smear, or faulty diagnosis by cytotechnologists. For this reason, automatic cancer detection was developed. It helps to increase awareness of cancer cells and diagnostic accuracy at low cost. The detection process consists of image preprocessing techniques, namely segmentation and effective texture feature extraction, followed by SVM classification. The final classification results of the proposed technique were compared with those of the earlier KNN and ANN classification techniques, and the results should be very useful to cytotechnologists for further analysis.

  19. Emotion classification based on gamma-band EEG.

    Science.gov (United States)

    Li, Mu; Lu, Bao-Liang

    2009-01-01

    In this paper, we use EEG signals to classify two emotions: happiness and sadness. These emotions are evoked by showing subjects pictures of smiling and crying facial expressions. We propose a frequency band searching method to choose an optimal band into which the recorded EEG signal is filtered. We use common spatial patterns (CSP) and a linear SVM to classify these two emotions. To investigate the time resolution of classification, we explore two kinds of trials with lengths of 3 s and 1 s. Classification accuracies of 93.5% +/- 6.7% and 93.0% +/- 6.2% are achieved on 10 subjects for 3 s trials and 1 s trials, respectively. Our experimental results indicate that the gamma band (roughly 30-100 Hz) is suitable for EEG-based emotion classification.

  20. TESTING THE POTENTIAL OF VEGETATION INDICES FOR LAND USE/COVER CLASSIFICATION USING HIGH RESOLUTION DATA

    Directory of Open Access Journals (Sweden)

    A. Karakacan Kuzucu

    2017-11-01

    Full Text Available Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. Obtaining this information remains a challenge, especially in heterogeneous landscapes covering urban and rural areas, because of spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of combining spectral vegetation indices with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO) and the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and the support vector machine (SVM) algorithm were applied to classify the SPOT 7 image. The selected study region, Catalca, is located in the northwest of Istanbul, Turkey, and has a complex landscape covering artificial surfaces, forest and natural areas, agricultural fields, quarry/mining areas, pasture/scrubland and water bodies. Accuracy assessment of all classified images was performed through overall accuracy and the kappa coefficient. The results indicated that the incorporation of these three vegetation indices decreased the classification accuracy for both the MLC and the SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification in both overall accuracy and kappa statistics.
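
The three vegetation indices named in the abstract above have standard textbook definitions in terms of near-infrared (NIR) and red reflectance, sketched below. The reflectance values are hypothetical, and the soil adjustment factor L = 0.5 is the commonly used default; the abstract does not state which value the study used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def ratio_vi(nir, red):
    """Ratio Vegetation Index (simple ratio)."""
    return nir / red


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor is the L term (assumed 0.5)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical surface reflectances for one vegetated pixel.
nir, red = 0.5, 0.1
print(ndvi(nir, red), ratio_vi(nir, red), savi(nir, red))
```

All three indices are functions of the same two bands, so they are strongly correlated with each other and with the original bands, which may be one reason why adding them as extra bands does not necessarily improve a classifier.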

  1. Testing the Potential of Vegetation Indices for Land Use/cover Classification Using High Resolution Data

    Science.gov (United States)

    Karakacan Kuzucu, A.; Bektas Balcik, F.

    2017-11-01

    Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. Obtaining this information remains a challenge, especially in heterogeneous landscapes covering urban and rural areas, because of spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of combining spectral vegetation indices with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO) and the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and the support vector machine (SVM) algorithm were applied to classify the SPOT 7 image. The selected study region, Catalca, is located in the northwest of Istanbul, Turkey, and has a complex landscape covering artificial surfaces, forest and natural areas, agricultural fields, quarry/mining areas, pasture/scrubland and water bodies. Accuracy assessment of all classified images was performed through overall accuracy and the kappa coefficient. The results indicated that the incorporation of these three vegetation indices decreased the classification accuracy for both the MLC and the SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification in both overall accuracy and kappa statistics.
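
Overall accuracy and the kappa coefficient, the two measures used in the abstract above, are both computed from a confusion matrix. The following sketch uses the standard definition of Cohen's kappa. The 2 x 2 matrix is hypothetical and serves only to make the arithmetic easy to follow.

```python
def overall_accuracy_and_kappa(cm):
    """cm[i][j] = number of samples of reference class i assigned to class j."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n  # overall accuracy
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix for two classes, 100 validation samples.
cm = [[45, 5],
      [10, 40]]
oa, kappa = overall_accuracy_and_kappa(cm)
print(oa, kappa)
```

Kappa corrects the overall accuracy for chance agreement. In this example 85% of the samples are correct, half of which would be expected by chance, which gives a kappa of about 0.7.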

  2. Using support vector machines with tract-based spatial statistics for automated classification of Tourette syndrome children

    Science.gov (United States)

    Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang

    2016-03-01

    Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics, which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood, and the long-term prognosis of TS is difficult to estimate accurately. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms to automate the classification of healthy children and TS children. Here we apply the Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age- and gender-matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) is used to select the most salient voxels for subsequent classification with a support vector machine (SVM). We use nested cross-validation to yield an unbiased assessment of the classification method and prevent overestimation. The SVM classifier reached its peak performance with the axial diffusion (AD) metric, achieving an accuracy of 88.04%, a sensitivity of 88.64% and a specificity of 87.50%, which demonstrates the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results suggest that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
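
Nested cross-validation, mentioned in the abstract above, keeps the data used for tuning (for example feature selection and SVM parameters) separate from the data used for the final performance estimate. The sketch below only generates the index splits. The fold counts (5 outer, 3 inner) and the seed are assumptions, since the abstract does not report them; the sample count of 92 corresponds to the 44 + 48 children.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle the indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    inner_splits is a list of (inner_train, inner_validation) pairs drawn
    only from outer_train, so the outer test fold never influences tuning.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, rng)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for j, fold in enumerate(inner_folds) if j != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


splits = list(nested_cv_splits(92))
print(len(splits), len(splits[0][1]))
```

In use, model selection would run over the inner splits of each outer fold, and the selected model would then be scored once on that fold's outer test set.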

  3. Classification of high-resolution multispectral satellite remote sensing images using extended morphological attribute profiles and independent component analysis

    Science.gov (United States)

    Wu, Yu; Zheng, Lijuan; Xie, Donghai; Zhong, Ruofei

    2017-07-01

    In this study, extended morphological attribute profiles (EAPs) and independent component analysis (ICA) were combined for feature extraction from high-resolution multispectral satellite remote sensing images, and the regularized least squares (RLS) approach with the radial basis function (RBF) kernel was then applied for classification. Based on the two major independent components, the geometrical features were extracted using the EAPs method. Three morphological attributes were calculated and extracted for each independent component: area, standard deviation, and moment of inertia. The extracted geometrical features were classified using the RLS approach and the commonly used LIB-SVM library of support vector machines. Worldview-3 and Chinese GF-2 multispectral images were tested, and the results showed that the features extracted by EAPs and ICA can effectively improve the accuracy of high-resolution multispectral image classification, which was 2% higher than with EAPs and the principal component analysis (PCA) method, and 6% higher than with APs and the original high-resolution multispectral data. Moreover, the results suggest that both the GURLS and LIB-SVM libraries are well suited for multispectral remote sensing image classification. The GURLS library is easy to use, with automatic parameter selection, but its computation time may be longer than that of the LIB-SVM library. This study should be helpful for classification applications of high-resolution multispectral satellite remote sensing images.

  4. A modular spectrum sensing system based on PSO-SVM.

    Science.gov (United States)

    Cai, Zhuoran; Zhao, Honglin; Yang, Zhutian; Mo, Yun

    2012-11-08

    In the cognitive radio system, spectrum sensing for detecting the presence of primary users in a licensed spectrum is a fundamental problem. Energy detection is the most popular spectrum sensing scheme used to differentiate the case where the primary user’s signal is present from the case where there is only noise. In fact, the nature of spectrum sensing can be taken as a binary classification problem, and energy detection is a linear classifier. If the signal-to-noise ratio (SNR) of the received signal is low, and the number of received signal samples for sensing is small, the binary classification problem is linearly inseparable. In this situation the performance of energy detection will decrease seriously. In this paper, a novel approach for obtaining a nonlinear threshold based on support vector machine with particle swarm optimization (PSO-SVM) to replace the linear threshold used in traditional energy detection is proposed. Simulations demonstrate that the performance of the proposed algorithm is much better than that of traditional energy detection.
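
The traditional energy detector that the abstract above takes as its baseline compares the average energy of the received samples with a fixed threshold. A minimal sketch follows. The sample values and the threshold are hypothetical; in practice the threshold would be derived from the noise power and a target false-alarm rate.

```python
def energy_statistic(samples):
    """Average energy of the received (real or complex) samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def primary_user_present(samples, threshold):
    """Linear decision rule: declare the signal present if energy exceeds threshold."""
    return energy_statistic(samples) > threshold


# Hypothetical received samples and threshold.
noise_only = [0.1, -0.2, 0.15, -0.05]
with_signal = [1.0, -1.0, 1.0, -1.0]
print(primary_user_present(noise_only, threshold=0.5))
print(primary_user_present(with_signal, threshold=0.5))
```

The decision depends on a single scalar compared with a constant, which is why the abstract describes energy detection as a linear classifier. The proposed PSO-SVM approach replaces this fixed threshold with a learned nonlinear boundary.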

  5. Feature extraction and wall motion classification of 2D stress echocardiography with support vector machines

    Science.gov (United States)

    Chykeyuk, Kiryl; Clifton, David A.; Noble, J. Alison

    2011-03-01

    Stress echocardiography is a common clinical procedure for diagnosing heart disease. Clinically, diagnosis of the heart wall motion depends mostly on visual assessment, which is highly subjective and operator-dependent. Introduction of automated methods for heart function assessment has the potential to minimise the variance in operator assessment. Automated wall motion analysis consists of two main steps: (i) segmentation of heart wall borders, and (ii) classification of heart function as either "normal" or "abnormal" based on the segmentation. This paper considers automated classification of rest and stress echocardiography. Most previous approaches to the classification of heart function have considered rest or stress data separately, and have only considered using features extracted from the two main frames (corresponding to the end-of-diastole and end-of-systole). One previous attempt [1] has been made to combine information from rest and stress sequences utilising a Hidden Markov Model (HMM), which has proven to be the best performing approach to date. Here, we propose a novel alternative feature selection approach using combined information from rest and stress sequences for motion classification of stress echocardiography, utilising a Support Vector Machines (SVM) classifier. We describe how the proposed SVM-based method overcomes difficulties that occur with HMM classification. Overall accuracy with the new method for global wall motion classification using datasets from 173 patients is 92.47%, and the accuracy of local wall motion classification is 87.20%, showing that the proposed method outperforms the current state-of-the-art HMM-based approach (for which global and local classification accuracy is 82.15% and 78.33%, respectively).

  6. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Mustafa Serter Uzer

    2013-01-01

    Full Text Available This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines for classification. The purpose of this paper is to test the effect of eliminating the unimportant and obsolete features of the datasets on the success of the classification, using the SVM classifier. The approach was developed for the diagnosis of liver diseases and diabetes, which are commonly observed and reduce the quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. For these datasets, the classification accuracies were obtained using the 10-fold cross-validation method. The results show that the method performs very well compared to other reported results and seems very promising for pattern recognition applications.
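    The 10-fold cross-validation mentioned in the abstract can be sketched as an index split. This is a generic illustration, not the authors' code: the function name is hypothetical, and the sketch assumes the samples have already been shuffled, since it splits them in order.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Every sample appears in exactly one test fold; fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(25, k=10))
print(len(folds), [len(test) for _, test in folds])
```

    The reported accuracy would then be the mean of the accuracies measured on the ten test folds, each obtained from a classifier trained on the other nine.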

  7. Robust Automated Detection of Microstructural White Matter Degeneration in Alzheimer’s Disease Using Machine Learning Classification of Multicenter DTI Data

    Science.gov (United States)

    Dyrba, Martin; Ewers, Michael; Wegrzyn, Martin; Kilimann, Ingo; Plant, Claudia; Oswald, Annahita; Meindl, Thomas; Pievani, Michela; Bokde, Arun L. W.; Fellgiebel, Andreas; Filippi, Massimo; Hampel, Harald; Klöppel, Stefan; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J.

    2013-01-01

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer’s disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was

  8. Classification of toxicity effects of biotransformed hepatic drugs using whale optimized support vector machines.

    Science.gov (United States)

    Tharwat, Alaa; Moemen, Yasmine S; Hassanien, Aboul Ella

    2017-04-01

    Measuring toxicity is an important step in drug development. Nevertheless, the current experimental methods used to estimate drug toxicity are expensive and time-consuming, indicating that they are not suitable for large-scale evaluation of drug toxicity in the early stage of drug development. Hence, there is a high demand for computational models that can predict drug toxicity risks. In this study, we used a dataset that consists of 553 drugs that are biotransformed in the liver. Four toxic effects were calculated for the data: mutagenic, tumorigenic, irritant and reproductive effects. Each drug is represented by 31 chemical descriptors (features). The proposed model consists of three phases. In the first phase, the most discriminative subset of features is selected using rough set-based methods to reduce the classification time while improving the classification performance. In the second phase, different sampling methods such as Random Under-Sampling, Random Over-Sampling and Synthetic Minority Oversampling Technique (SMOTE), BorderLine SMOTE and Safe Level SMOTE are used to address the problem of the imbalanced dataset. In the third phase, the Support Vector Machines (SVM) classifier is used to classify an unknown drug as toxic or non-toxic. SVM parameters such as the penalty parameter and kernel parameter have a great impact on the classification accuracy of the model. In this paper, the Whale Optimization Algorithm (WOA) is proposed to optimize the parameters of SVM, so that the classification error can be reduced. The experimental results showed that the proposed model achieved high sensitivity to all toxic effects. Overall, the high sensitivity of the WOA+SVM model indicates that it could be used for the prediction of drug toxicity in the early stage of drug development. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Classification of crops across heterogeneous agricultural landscape in Kenya using AisaEAGLE imaging spectroscopy data

    Science.gov (United States)

    Piiroinen, Rami; Heiskanen, Janne; Mõttus, Matti; Pellikka, Petri

    2015-07-01

    Land use practices are changing at a fast pace in the tropics. In sub-Saharan Africa, forests, woodlands and bushlands are being transformed for agricultural use to produce food for the rapidly growing population. The objective of this study was to assess the prospects of mapping the common agricultural crops in a highly heterogeneous study area in south-eastern Kenya using high spatial and spectral resolution AisaEAGLE imaging spectroscopy data. Minimum noise fraction transformation was used to pack the coherent information into a smaller set of bands, and the data were classified with a support vector machine (SVM) algorithm. A total of 35 plant species were mapped in the field, and the seven most dominant ones were used as classification targets. Five of the targets were agricultural crops. The overall accuracy (OA) for the classification was 90.8%. To assess the possibility of excluding the remaining 28 plant species from the classification results, 10 different probability thresholds (PT) were tried with SVM. The impact of PT was assessed with validation polygons of all 35 mapped plant species. The results showed that as PT was increased, more pixels were excluded from non-target polygons than from the polygons of the seven classification targets. This increased the OA and reduced salt-and-pepper effects in the classification results. Very high spatial resolution imagery and a pixel-based classification approach worked well with small targets such as maize, while there was mixing of classes on the sides of the tree crowns.
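    The probability-threshold step can be illustrated with a short sketch. It assumes the classifier yields a probability per target class for each pixel; the function name, the class names other than maize, and the threshold value are hypothetical and do not come from the study.

```python
def classify_with_threshold(class_probabilities, probability_threshold):
    """Return the most probable class, or None when its probability is below the threshold."""
    best_class = max(class_probabilities, key=class_probabilities.get)
    if class_probabilities[best_class] < probability_threshold:
        return None  # the pixel is left unclassified (excluded from the map)
    return best_class


# Hypothetical per-pixel probabilities over three of the target classes.
confident_pixel = {"maize": 0.90, "mango": 0.06, "cassava": 0.04}
uncertain_pixel = {"maize": 0.40, "mango": 0.35, "cassava": 0.25}
print(classify_with_threshold(confident_pixel, 0.6))
print(classify_with_threshold(uncertain_pixel, 0.6))
```

    Raising the threshold excludes more pixels; the abstract reports that the excluded pixels came disproportionately from non-target polygons, which is why the overall accuracy increased.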

  10. Pressure Model of Control Valve Based on LS-SVM with the Fruit Fly Algorithm

    Directory of Open Access Journals (Sweden)

    Huang Aiqin

    2014-07-01

    Full Text Available The control valve is an essential terminal control component that is hard to model with traditional methodologies because of its complexity and nonlinearity. This paper proposes a new modeling method for the upstream pressure of a control valve using the least squares support vector machine (LS-SVM), which has been successfully used to identify nonlinear systems. In order to improve the modeling performance, the fruit fly optimization algorithm (FOA) is used to optimize two critical parameters of LS-SVM. As an example, a set of actual production data from a chlorine control system in the salt chemical industry is used. The validity of the LS-SVM modeling method using FOA is verified by comparing the predicted results with the actual data, giving an MSE of 2.474 × 10^-3. Moreover, it is demonstrated that the initial position of FOA does not affect its optimization ability. For comparison, simulation experiments based on the PSO algorithm and the grid search method are also carried out. The results show that LS-SVM based on FOA performs equally well in prediction accuracy. However, in terms of calculation time, FOA has a significant advantage and is more suitable for online prediction.

  11. Wind Power Prediction Based on LS-SVM Model with Error Correction

    Directory of Open Access Journals (Sweden)

    ZHANG, Y.

    2017-02-01

    Full Text Available As conventional energy sources are non-renewable, the world's major countries are investing heavily in renewable energy research. Wind power represents the development trend of future energy, but the intermittency and volatility of wind energy are the main reasons for the poor accuracy of wind power prediction. However, analyzing the error level at different time points shows that the errors at adjacent times are often approximately the same. Based on this, a least squares support vector machine (LS-SVM) model with error correction is used to predict the wind power in this paper. Simulations with wind power data from two wind farms show that the proposed method can effectively improve the prediction accuracy of wind power, and that the error distribution is concentrated almost without deviation. The improved method proposed in this paper takes into account the error correction process of the model, which improves on the prediction accuracy of the traditional models (RBF, Elman, LS-SVM). Compared with the single LS-SVM prediction model in this paper, the mean absolute error of the proposed method decreased by 52 percent. The research work in this paper will be helpful for the reasonable arrangement of dispatching operation plans, the normal operation of wind farms, and the large-scale development and full utilization of renewable energy resources.
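    The abstract does not specify the correction scheme, so the following is only a sketch of the simplest scheme consistent with the stated observation that errors at adjacent times are approximately equal: each raw prediction is shifted by the error observed at the previous time step. The function name and the data are hypothetical, and the raw predictions stand in for the output of an LS-SVM model that is not implemented here.

```python
def correct_predictions(raw_predictions, actual_values):
    """Add the previous step's observed error to each raw prediction.

    Assumes the actual value at time t-1 is known when predicting time t.
    The first prediction has no previous error and is left unchanged.
    """
    corrected = []
    for t, prediction in enumerate(raw_predictions):
        if t == 0:
            corrected.append(prediction)
        else:
            previous_error = actual_values[t - 1] - raw_predictions[t - 1]
            corrected.append(prediction + previous_error)
    return corrected


# Hypothetical data: the raw model underestimates by a constant 1.0.
raw = [10.0, 12.0, 14.0]
actual = [11.0, 13.0, 15.0]
print(correct_predictions(raw, actual))
```

    When the error is persistent, as in this toy data, the corrected series matches the actual values from the second step onward; when errors change sign rapidly, this kind of correction could make predictions worse.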

  12. Spectral Reconstruction Based on Svm for Cross Calibration

    Science.gov (United States)

    Gao, H.; Ma, Y.; Liu, W.; He, H.

    2017-05-01

    Chinese HY-1C/1D satellites will use a 5 nm/10 nm resolution visible-near infrared (VNIR) hyperspectral sensor with the solar calibrator to cross-calibrate with other sensors. Because the hyperspectral radiance data are composed of average radiance in the sensor's passbands and bear a spectral smoothing effect, a transform from the hyperspectral radiance data to the 1-nm-resolution apparent spectral radiance by spectral reconstruction needs to be implemented. To solve the problem of noise accumulation and deterioration after several iterations of the iterative algorithm, a novel regression method based on SVM is proposed, which can closely approximate arbitrarily complex non-linear relationships and provides better generalization capability through learning. From a system point of view, the relationship between the apparent radiance and the equivalent radiance is a nonlinear mapping introduced by the spectral response function (SRF); SVM transforms the low-dimensional non-linear problem into a high-dimensional linear problem through a kernel function, obtaining a globally optimal solution by virtue of its quadratic form. The experiment is performed using 6S-simulated spectra that account for the SRF and SNR of the hyperspectral sensor, measured reflectance spectra of water bodies, and different atmospheric conditions. The comparison shows that, firstly, the proposed method has higher reconstruction accuracy, especially for high-frequency signals; secondly, as the spectral resolution of the hyperspectral sensor decreases, the proposed method performs better than the iterative method; finally, the root mean square relative error (RMSRE), which is used to evaluate the difference between the reconstructed spectrum and the real spectrum over the whole spectral range, is reduced by at least one-fold by the proposed method.

  13. SPECTRAL RECONSTRUCTION BASED ON SVM FOR CROSS CALIBRATION

    Directory of Open Access Journals (Sweden)

    H. Gao

    2017-05-01

    Full Text Available Chinese HY-1C/1D satellites will use a 5 nm/10 nm resolution visible-near infrared (VNIR) hyperspectral sensor with the solar calibrator to cross-calibrate with other sensors. Because the hyperspectral radiance data are composed of average radiance in the sensor’s passbands and bear a spectral smoothing effect, a transform from the hyperspectral radiance data to the 1-nm-resolution apparent spectral radiance by spectral reconstruction needs to be implemented. To solve the problem of noise accumulation and deterioration after several iterations of the iterative algorithm, a novel regression method based on SVM is proposed, which can closely approximate arbitrarily complex non-linear relationships and provides better generalization capability through learning. From a system point of view, the relationship between the apparent radiance and the equivalent radiance is a nonlinear mapping introduced by the spectral response function (SRF); SVM transforms the low-dimensional non-linear problem into a high-dimensional linear problem through a kernel function, obtaining a globally optimal solution by virtue of its quadratic form. The experiment is performed using 6S-simulated spectra that account for the SRF and SNR of the hyperspectral sensor, measured reflectance spectra of water bodies, and different atmospheric conditions. The comparison shows that, firstly, the proposed method has higher reconstruction accuracy, especially for high-frequency signals; secondly, as the spectral resolution of the hyperspectral sensor decreases, the proposed method performs better than the iterative method; finally, the root mean square relative error (RMSRE), which is used to evaluate the difference between the reconstructed spectrum and the real spectrum over the whole spectral range, is reduced by at least one-fold by the proposed method.

  14. Pixel-Based Land Cover Classification by Fusing Hyperspectral and LIDAR Data

    Science.gov (United States)

    Jahan, F.; Awrangjeb, M.

    2017-09-01

    Land cover classification has many applications like forest management, urban planning, land use change identification and environment change analysis. The passive sensing of hyperspectral systems can be effective in describing the phenomenology of the observed area over hundreds of (narrow) spectral bands. On the other hand, the active sensing of LiDAR (Light Detection and Ranging) systems can be exploited for characterising topographical information of the area. As a result, the joint use of hyperspectral and LiDAR data provides a source of complementary information, which can greatly assist in the classification of complex classes. In this study, we fuse hyperspectral and LiDAR data for land cover classification. We do a pixel-wise classification on a disjoint set of training and testing samples for five different classes. We propose a new feature combination by fusing features from both hyperspectral and LiDAR, which achieves competent classification accuracy with low feature dimension, while the existing method requires high dimensional feature vector to achieve similar classification result. Also, for the reduction of the dimension of the feature vector, Principal Component Analysis (PCA) is used as it captures the variance of the samples with a limited number of Principal Components (PCs). We tested our classification method using PCA applied on hyperspectral bands only and combined hyperspectral and LiDAR features. Classification with support vector machine (SVM) and decision tree shows that our feature combination achieves better classification accuracy compared to the existing feature combination, while keeping the similar number of PCs. The experimental results also show that decision tree performs better than SVM and requires less execution time.

  15. PIXEL-BASED LAND COVER CLASSIFICATION BY FUSING HYPERSPECTRAL AND LIDAR DATA

    Directory of Open Access Journals (Sweden)

    F. Jahan

    2017-09-01

    Full Text Available Land cover classification has many applications like forest management, urban planning, land use change identification and environment change analysis. The passive sensing of hyperspectral systems can be effective in describing the phenomenology of the observed area over hundreds of (narrow) spectral bands. On the other hand, the active sensing of LiDAR (Light Detection and Ranging) systems can be exploited for characterising topographical information of the area. As a result, the joint use of hyperspectral and LiDAR data provides a source of complementary information, which can greatly assist in the classification of complex classes. In this study, we fuse hyperspectral and LiDAR data for land cover classification. We do a pixel-wise classification on a disjoint set of training and testing samples for five different classes. We propose a new feature combination by fusing features from both hyperspectral and LiDAR, which achieves competent classification accuracy with low feature dimension, while the existing method requires high dimensional feature vector to achieve similar classification result. Also, for the reduction of the dimension of the feature vector, Principal Component Analysis (PCA) is used as it captures the variance of the samples with a limited number of Principal Components (PCs). We tested our classification method using PCA applied on hyperspectral bands only and combined hyperspectral and LiDAR features. Classification with support vector machine (SVM) and decision tree shows that our feature combination achieves better classification accuracy compared to the existing feature combination, while keeping the similar number of PCs. The experimental results also show that decision tree performs better than SVM and requires less execution time.

  16. Hybrid SVM-HMM based recognition algorithm for pen-based tutoring system

    Science.gov (United States)

    Yuan, Zhenming; Pan, Hong

    2007-11-01

    Pen-based computing takes advantage of human skill with the pen, which is more than a substitute for the mouse. A hybrid SVM-HMM based recognition algorithm is presented for pen-based single-stroke diagrams. The algorithm includes five steps: sampling and pre-processing, segmentation, formal feature computing, SVM-based feature classification, and HMM-based symbol recognition. The formal feature of a stroke is composed of five static features and one dynamic feature. A group of one-to-one combinations of binary SVMs is used as feature classifiers to produce fixed-length feature vectors, each of which is produced from the probability output with a sigmoid function and acts as the posterior probability of an observation for the HMM. Finally, HMMs are employed as the final recognizer to recognize the unknown stroke. Based on this algorithm, a tutoring system is designed to identify sketches of flowchart diagrams. Experimental results show that the hybrid algorithm has good learning and recognition ability, which benefits from combining the SVM's ability to classify static properties with the HMM's ability to recognize dynamic properties.

  17. Can we improve accuracy and reliability of MRI interpretation in children with optic pathway glioma? Proposal for a reproducible imaging classification

    Energy Technology Data Exchange (ETDEWEB)

    Lambron, Julien; Frampas, Eric; Toulgoat, Frederique [University Hospital, Department of Radiology, Nantes (France); Rakotonjanahary, Josue [University Hospital, Department of Pediatric Oncology, Angers (France); University Paris Diderot, INSERM CIE5 Robert Debre Hospital, Assistance Publique-Hopitaux de Paris (AP-HP), Paris (France); Loisel, Didier [University Hospital, Department of Radiology, Angers (France); Carli, Emilie de; Rialland, Xavier [University Hospital, Department of Pediatric Oncology, Angers (France); Delion, Matthieu [University Hospital, Department of Neurosurgery, Angers (France)

    2016-02-15

    Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm³ (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach. (orig.)
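    The abstract does not say which kappa variant was used. As an illustration only, the sketch below computes Cohen's kappa for one pair of raters, which is one common way to obtain a range of pairwise agreement values among three readers. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical binary ratings of four images by two readers.
reader_1 = [0, 0, 1, 1]
reader_2 = [0, 1, 1, 1]
print(cohen_kappa(reader_1, reader_2))
```

    A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which gives a scale for the reported ranges of 0.11 to 1 and 0.56 to 1.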

  18. Fast Gaussian Naïve Bayes for searchlight classification analysis.

    Science.gov (United States)

    Ontivero-Ortega, Marlis; Lage-Castellanos, Agustin; Valente, Giancarlo; Goebel, Rainer; Valdes-Sosa, Mitchell

    2017-12-01

    The searchlight technique is a variant of multivariate pattern analysis (MVPA) that examines neural activity across large sets of small regions, exhaustively covering the whole brain. This usually involves application of classifier algorithms across all searchlights, which entails large computational costs especially when testing the statistical significance of the accuracies with permutation methods. In this article, a new implementation of the Gaussian Naive Bayes classifier is presented (henceforth massive-GNB). This approach allows classification in all searchlights simultaneously, and is faster than previously published searchlight GNB implementations, as well as other more complex classifiers including support vector machines (SVM). To ensure that the gain in speed for GNB would be useful in searchlight analysis, we compared the accuracies of massive-GNB and SVM in detecting the lateral occipital complex (LOC) in an fMRI localizer experiment (26 subjects). Moreover, this region as defined in a meta-analysis of many activation studies was used as a gold standard to compare error rates for both classifiers. In individual searchlights, SVM was somewhat more accurate than massive-GNB and more selective in detecting the meta-analytic LOC. However, with multiple comparison correction at the cluster-level the two classifiers performed equivalently. Thus for cluster-level analysis, massive-GNB produces an accuracy similar to more sophisticated classifiers but with a substantial gain in speed. Massive-GNB (available as a public Matlab toolbox) could facilitate the more widespread use of searchlight analysis. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. The role of the continuous wavelet transform in mineral identification using hyperspectral imaging in the long-wave infrared by using SVM classifier

    Science.gov (United States)

    Sojasi, Saeed; Yousefi, Bardia; Liaigre, Kévin; Ibarra-Castanedo, Clemente; Beaudoin, Georges; Maldague, Xavier P. V.; Huot, François; Chamberland, Martin

    2017-05-01

    Hyperspectral imaging (HSI) in the long-wave infrared spectrum (LWIR) provides spectral and spatial information concerning the emissivity of the surface of materials, which can be used for mineral identification. For this, an endmember, which is the purest form of a mineral, is used as reference. All pure minerals have specific spectral profiles across the electromagnetic wavelengths, which can be thought of as the mineral's fingerprint. The main goal of this paper is the identification of minerals by LWIR hyperspectral imaging using a machine learning scheme. The hyperspectral imaging information is recorded from the energy emitted from the mineral's surface. Solar energy is the source of energy in remote sensing, while a heating element is the energy source employed in laboratory experiments. Our work contains three main steps. The first step involves obtaining the spectral signatures of pure (single) minerals with a hyperspectral camera in the long-wave infrared (7.7 to 11.8 μm), which measures the emitted radiance from the minerals' surface. The second step concerns feature extraction by applying the continuous wavelet transform (CWT). Finally, we use a support vector machine classifier with radial basis functions (SVM-RBF) for classification/identification of minerals. The overall accuracy of classification in our work is 90.23 ± 2.66%. In conclusion, CWT's ability to capture signal information makes it a good marker for the classification and identification of minerals.

  20. Keratin protein property based classification of mammals and non-mammals using machine learning techniques.

    Science.gov (United States)

    Banerjee, Amit Kumar; Ravi, Vadlamani; Murty, U S N; Shanbhag, Anirudh P; Prasanna, V Lakshmi

    2013-08-01

    Keratin protein is ubiquitous in most vertebrates and invertebrates, and has several important cellular and extracellular functions that are related to survival and protection. Keratin function has played a significant role in the natural selection of an organism. Hence, it acts as a marker of evolution. Much information about an organism and its evolution can therefore be obtained by investigating this important protein. In the present study, Keratin sequences were extracted from public data repositories and various important sequential, structural and physicochemical properties were computed and used for preparing the dataset. The dataset containing two classes, namely mammals (Class-1) and non-mammals (Class-0), was prepared, and rigorous classification analysis was performed. To reduce the complexity of the dataset containing 56 parameters and to achieve improved accuracy, feature selection was done using the t-statistic. The 20 best features (parameters) were selected for further classification analysis using computational algorithms which included SVM, KNN, Neural Network, Logistic regression, Meta-modeling, Tree Induction, Rule Induction, Discriminant analysis and Bayesian Modeling. Statistical methods were used to evaluate the output. Logistic regression was found to be the most effective algorithm for classification, with greater than 96% accuracy using a 10-fold cross validation analysis. KNN, SVM and Rule Induction algorithms also were found to be efficacious for classification. Copyright © 2013 Elsevier Ltd. All rights reserved.
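    Feature selection by t-statistic can be sketched as ranking features by how strongly their means differ between the two classes. The abstract does not state which form of the t-statistic was used, so the Welch (unequal variance) form below is an assumption, as are the function names and the toy data. Each class needs at least two samples for the variance to be defined.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(values_a, values_b):
    """Welch's t-statistic for the difference between two sample means."""
    difference = mean(values_a) - mean(values_b)
    standard_error = sqrt(variance(values_a) / len(values_a)
                          + variance(values_b) / len(values_b))
    if standard_error == 0:
        # Both samples are constant: no difference, or perfect separation.
        return 0.0 if difference == 0 else copysign(inf, difference)
    return difference / standard_error


def rank_features(samples, labels, top_k):
    """Return the indices of the top_k features with the largest |t| between classes 1 and 0."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        class_1 = [row[j] for row, y in zip(samples, labels) if y == 1]
        class_0 = [row[j] for row, y in zip(samples, labels) if y == 0]
        scores.append(abs(welch_t(class_1, class_0)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:top_k]


# Toy data: feature 0 separates the classes, feature 1 does not.
samples = [[1, 1], [2, 5], [3, 3], [7, 2], [8, 4], [9, 3]]
labels = [1, 1, 1, 0, 0, 0]
print(rank_features(samples, labels, top_k=1))
```

    In the study's setting, the same ranking would be applied to the 56 computed parameters with `top_k=20`.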

  1. Classification of EEG Signals Using a Multiple Kernel Learning Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Xiaoou Li

    2014-07-01

    Full Text Available In this study, a multiple kernel learning support vector machine algorithm is proposed for the identification of EEG signals including mental and cognitive tasks, which is a key component in EEG-based brain computer interface (BCI) systems. The presented BCI approach included three stages: (1) a pre-processing step was performed to improve the general signal quality of the EEG; (2) the features were chosen, including wavelet packet entropy and Granger causality, respectively; (3) a multiple kernel learning support vector machine (MKL-SVM) based on a gradient descent optimization algorithm was investigated to classify EEG signals, in which the kernel was defined as a linear combination of polynomial kernels and radial basis function kernels. Experimental results showed that the proposed method provided better classification performance compared with the SVM based on a single kernel. For mental tasks, the average accuracies for 2-class, 3-class, 4-class, and 5-class classifications were 99.20%, 81.25%, 76.76%, and 75.25% respectively. Comparing stroke patients with healthy controls using the proposed algorithm, we achieved the average classification accuracies of 89.24% and 80.33% for 0-back and 1-back tasks respectively. Our results indicate that the proposed approach is promising for implementing human-computer interaction (HCI), especially for mental task classification and identifying suitable brain impairment candidates.
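    A kernel defined as a linear combination of a polynomial kernel and a radial basis function kernel can be sketched as follows. The weights and hyperparameters here are fixed, arbitrary illustrative values; in the paper they would be learned by gradient descent, which this sketch does not implement. All function and parameter names are assumptions.

```python
from math import exp


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(x, y, weight_poly=0.5, weight_rbf=0.5):
    """Weighted sum of the two kernels; non-negative weights keep it a valid kernel."""
    return weight_poly * polynomial_kernel(x, y) + weight_rbf * rbf_kernel(x, y)


# Hypothetical two-dimensional feature vectors.
u = [1.0, 0.0]
v = [0.0, 1.0]
print(combined_kernel(u, u), combined_kernel(u, v))
```

    Combining kernels this way lets one classifier draw on both the global behaviour of the polynomial kernel and the local behaviour of the RBF kernel.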

  2. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  3. Retrospective assessment of interobserver agreement and accuracy in classifications and measurements in subsolid nodules with solid components less than 8mm: which window setting is better?

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, Roh-Eul [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Goo, Jin Mo; Park, Chang Min [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University College of Medicine, Cancer Research Institute, Seoul (Korea, Republic of); Hwang, Eui Jin; Yoon, Soon Ho; Lee, Chang Hyun [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Ahn, Soyeon [Seoul National University Bundang Hospital, Medical Research Collaborating Center, Seongnam-si (Korea, Republic of)

    2017-04-15

    To compare interobserver agreements among multiple readers and accuracy for the assessment of solid components in subsolid nodules between the lung and mediastinal window settings. Seventy-seven surgically resected nodules with solid components smaller than 8 mm were included in this study. In both lung and mediastinal windows, five readers independently assessed the presence and size of solid component. Bootstrapping was used to compare the interobserver agreement between the two window settings. Imaging-pathology correlation was performed to evaluate the accuracy. There were no significant differences in the interobserver agreements between the two windows for both identification (lung windows, k = 0.51; mediastinal windows, k = 0.57) and measurements (lung windows, ICC = 0.70; mediastinal windows, ICC = 0.69) of solid components. The incidence of false negative results for the presence of invasive components and the median absolute difference between the solid component size and the invasive component size were significantly higher on mediastinal windows than on lung windows (P < 0.001 and P < 0.001, respectively). The lung window setting had a comparable reproducibility but a higher accuracy than the mediastinal window setting for nodule classifications and solid component measurements in subsolid nodules. (orig.)

  4. Bag-of-visual-words based feature extraction for SAR target classification

    Science.gov (United States)

    Amrani, Moussa; Chaib, Souleyman; Omara, Ibrahim; Jiang, Feng

    2017-07-01

    Feature extraction plays a key role in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). Choosing appropriate features is a prerequisite for training a classifier. Inspired by the great success of Bag-of-Visual-Words (BoVW), we address the problem of feature extraction by proposing a novel feature extraction method for SAR target classification. First, Gabor based features are adopted to extract features from the training SAR images. Second, a discriminative codebook is generated using the K-means clustering algorithm. Third, after feature encoding by computing the closest Euclidean distance, the targets are represented by a new robust bag of features. Finally, for target classification, a support vector machine (SVM) is used as a baseline classifier. Experiments on the Moving and Stationary Target Acquisition and Recognition (MSTAR) public release dataset are conducted, and the classification accuracy and time complexity results demonstrate that the proposed method outperforms the state-of-the-art methods.
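
    The encoding step of this abstract (assigning each local descriptor to its closest codeword and representing the target by a bag of features) can be sketched as below. This is a minimal sketch under assumptions: the codebook is taken as given (the abstract builds it with K-means), and the function names and toy descriptors are hypothetical.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor in Euclidean distance."""
    distances = [math.dist(descriptor, word) for word in codebook]
    return distances.index(min(distances))


def bovw_histogram(descriptors, codebook):
    """Normalised histogram of codeword assignments for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    # An image without descriptors keeps an all-zero histogram.
    return [c / total for c in counts] if total else counts


# Toy example: a two-word codebook and four 2-D descriptors.
codebook = [(0, 0), (10, 10)]
histogram = bovw_histogram([(1, 1), (0, 2), (2, 0), (9, 9)], codebook)
```

    The resulting fixed-length histogram is what would be passed to the SVM, regardless of how many local descriptors each image produced.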

  5. Hyperspectral Image Classification Based on the Combination of Spatial-spectral Feature and Sparse Representation

    Directory of Open Access Journals (Sweden)

    YANG Zhaoxia

    2015-07-01

    In order to avoid the problem of being over-dependent on high-dimensional spectral features in traditional hyperspectral image classification, a novel approach based on the combination of spatial-spectral features and sparse representation is proposed in this paper. Firstly, we extract the spatial-spectral feature by reorganizing the local image patch with the first d principal components (PCs) into a vector representation, followed by a sorting scheme to make the vector invariant to local image rotation. Secondly, we learn the dictionary through a supervised method, and use it to code the features from test samples afterwards. Finally, we embed the resulting sparse feature coding into the support vector machine (SVM) for hyperspectral image classification. Experiments using three hyperspectral data sets show that the proposed method can effectively improve the classification accuracy compared with traditional classification methods.

  6. Structural Health Monitoring of Tall Buildings with Numerical Integrator and Convex-Concave Hull Classification

    Directory of Open Access Journals (Sweden)

    Suresh Thenozhi

    2012-01-01

    An important objective of health monitoring systems for tall buildings is to diagnose the state of the building and to evaluate its possible damage. In this paper, we use our prototype to evaluate our data-mining approach for fault monitoring. The offset cancellation and high-pass filtering techniques are combined effectively to solve common problems in numerical integration of acceleration signals in real-time applications. The integration accuracy is improved compared with other numerical integrators. Then we introduce a novel method for support vector machine (SVM) classification, called convex-concave hull. We use the Jarvis march method to decide the concave (nonconvex) hull for the inseparable points. Finally the vertices of the convex-concave hull are applied for SVM training.
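
    The Jarvis march (gift wrapping) procedure mentioned in this abstract can be sketched as follows for the plain convex hull of 2-D points. This is only the textbook convex step; the concave (nonconvex) extension described by the authors is not reproduced here, and the function names are assumptions.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def jarvis_march(points):
    """Convex hull vertices in counter-clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # the leftmost point (lowest on ties) is always on the hull
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Wrap further when p lies to the right of current -> candidate,
            # or is collinear with it but farther away.
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            break
    return hull


hull = jarvis_march([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
```

    Only the hull vertices are returned, so the interior point (1, 1) is dropped; this reduction of the training set to boundary points is the idea the abstract builds on.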

  7. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection.

    Science.gov (United States)

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-06-01

    Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer, sleep problems and so on. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the selected features are forwarded to a least square support vector machine (LS_SVM) classifier to classify the EEG signals. The LS_SVM classifier classified the features which are extracted and selected from the SRS and the SFS. The experimental results show that the method achieves 99.90, 99.80 and 100 % for classification accuracy, sensitivity and specificity, respectively.
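
    The three figures reported in this abstract (classification accuracy, sensitivity and specificity) are all derived from the binary confusion matrix. A minimal sketch, with hypothetical labels and a function name that is not taken from the paper:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of true positives among all actual positives.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: share of true negatives among all actual negatives.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Made-up labels for illustration: 1 = seizure, 0 = seizure-free.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

    In this toy example the accuracy is 6/8, the sensitivity 2/3 and the specificity 4/5, which shows that the three values can differ noticeably on the same predictions.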

  8. Patient-specific ECG beat classification technique.

    Science.gov (United States)

    Das, Manab K; Ari, Samit

    2014-09-01

    Electrocardiogram (ECG) beat classification plays an important role in the timely diagnosis of the critical heart condition. An automated diagnostic system is proposed to classify five types of ECG classes, namely normal (N), ventricular ectopic beat (V), supra ventricular ectopic beat (S), fusion (F) and unknown (Q) as recommended by the Association for the Advancement of Medical Instrumentation (AAMI). The proposed method integrates the Stockwell transform (ST), a bacteria foraging optimisation (BFO) algorithm and a least mean square (LMS)-based multiclass support vector machine (SVM) classifier. The ST is utilised to extract the important morphological features which are concatenated with four timing features. The resultant combined feature vector is optimised by removing the redundant and irrelevant features using the BFO algorithm. The optimised feature vector is applied to the LMS-based multiclass SVM classifier for automated diagnosis. In the proposed technique, the LMS algorithm is used to modify the Lagrange multiplier, which in turn modifies the weight vector to minimise the classification error. The updated weights are used during the testing phase to classify ECG beats. The classification performances are evaluated using the MIT-BIH arrhythmia database. Average accuracy and sensitivity performances of the proposed system for V detection are 98.6% and 91.7%, respectively, and for S detections, 98.2% and 74.7%, respectively over the entire database. To generalise the capability, the classification performance is also evaluated using the St. Petersburg Institute of Cardiological Technics (INCART) database. The proposed technique performs better than other reported heartbeat techniques, with results suggesting better generalisation capability.

  9. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors.

    Science.gov (United States)

    Zhang, Wenbin; Wang, Qi; Suo, Chunguang

    2008-11-05

    This paper presents a new vehicle classification and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The basic principle of this approach is based on measuring the dynamic strain caused by vehicles across pavement to obtain the corresponding vehicle parameters - wheelbase and number of axles - to then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. According to the special arrangement of the sensors and the different time a vehicle arrived at the sensors one can estimate the vehicle's speed accurately, corresponding to the estimated vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Using the machine learning pattern recognition method to deal with this problem is believed as one of the most effective tools. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments will be introduced in this paper. Two support vector machines classification algorithms (one-against-all, one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves the results of a single sensor data, which is trained on the whole

  10. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    Directory of Open Access Journals (Sweden)

    Qi Wang

    2008-11-01

    This paper presents a new vehicle classification and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The basic principle of this approach is based on measuring the dynamic strain caused by vehicles across pavement to obtain the corresponding vehicle parameters – wheelbase and number of axles – to then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. According to the special arrangement of the sensors and the different time a vehicle arrived at the sensors one can estimate the vehicle’s speed accurately, corresponding to the estimated vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Using the machine learning pattern recognition method to deal with this problem is believed as one of the most effective tools. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments will be introduced in this paper. Two support vector machines classification algorithms (one-against-all, one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves
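
    The two multiclass strategies named in this abstract differ in how a five-class problem is split into binary SVM problems. The sketch below only enumerates the binary tasks and shows a majority vote for the one-against-one case; the class labels and function names are hypothetical, and no SVM is trained here.

```python
from collections import Counter
from itertools import combinations


def one_against_all_tasks(classes):
    """One binary problem per class: that class versus all remaining classes."""
    return [(c, "rest") for c in classes]


def one_against_one_tasks(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def one_against_one_vote(pairwise_winners):
    """Majority vote over the winners reported by the pairwise classifiers."""
    return Counter(pairwise_winners).most_common(1)[0][0]


# Hypothetical labels for the five vehicle types.
classes = ["type1", "type2", "type3", "type4", "type5"]
ova = one_against_all_tasks(classes)   # 5 binary classifiers
ovo = one_against_one_tasks(classes)   # 5 * 4 / 2 = 10 binary classifiers
```

    For k classes, one-against-all needs k classifiers and one-against-one needs k(k-1)/2, each trained on a smaller subset of the data.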

  11. MAPPING OF HIGH VALUE CROPS THROUGH AN OBJECT-BASED SVM MODEL USING LIDAR DATA AND ORTHOPHOTO IN AGUSAN DEL NORTE PHILIPPINES

    Directory of Open Access Journals (Sweden)

    R. J. Candare

    2016-06-01

    This research describes the methods involved in the mapping of different high value crops in Agusan del Norte Philippines using LiDAR. This project is part of the Phil-LiDAR 2 Program which aims to conduct a nationwide resource assessment using LiDAR. Because of the high resolution data involved, the methodology described here utilizes object-based image analysis and the use of optimal features from LiDAR data and Orthophoto. Object-based classification was primarily done by developing rule-sets in eCognition. Several features from the LiDAR data and Orthophotos were used in the development of rule-sets for classification. Generally, classes of objects can't be separated by simple thresholds from different features making it difficult to develop a rule-set. To resolve this problem, the image-objects were subjected to Support Vector Machine learning. SVMs have gained popularity because of their ability to generalize well given a limited number of training samples. However, SVMs also suffer from parameter assignment issues that can significantly affect the classification results. More specifically, the regularization parameter C in linear SVM has to be optimized through cross validation to increase the overall accuracy. After performing the segmentation in eCognition, the optimization procedure as well as the extraction of the equations of the hyper-planes was done in Matlab. The learned hyper-planes separating one class from another in the multi-dimensional feature-space can be thought of as super-features which were then used in developing the classifier rule set in eCognition. In this study, we report an overall classification accuracy of greater than 90% in different areas.
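
    The idea of treating a learned hyper-plane as a "super-feature" can be illustrated with the linear decision value w · x + b, which collapses several object features into one number that a rule set can threshold at zero. The weights, bias, feature values and class names below are made up for illustration and are not taken from the study.

```python
def hyperplane_score(features, weights, bias):
    """Signed decision value w . x + b of a linear SVM (the 'super-feature')."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def rule_from_hyperplane(features, weights, bias, labels=("class_a", "class_b")):
    """Single-threshold rule: non-negative scores map to the first label."""
    return labels[0] if hyperplane_score(features, weights, bias) >= 0 else labels[1]


# Hypothetical hyper-plane over two object features.
weights, bias = (1.0, -2.0), 0.5
score = hyperplane_score((3.0, 1.0), weights, bias)
label = rule_from_hyperplane((0.0, 1.0), weights, bias)
```

    A rule-based environment then needs only this arithmetic expression and a threshold, not the SVM training machinery.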

  12. Detecting brain structural changes as biomarker from magnetic resonance images using a local feature based SVM approach.

    Science.gov (United States)

    Chen, Ye; Storrs, Judd; Tan, Lirong; Mazlack, Lawrence J; Lee, Jing-Huei; Lu, Long J

    2014-01-15

    Detecting brain structural changes from magnetic resonance (MR) images can facilitate early diagnosis and treatment of neurological and psychiatric diseases. Many existing methods require an accurate deformation registration, which is difficult to achieve and therefore prevents them from obtaining high accuracy. We develop a novel local feature based support vector machine (SVM) approach to detect brain structural changes as potential biomarkers. This approach does not require deformation registration and thus is less influenced by artifacts such as image distortion. We represent the anatomical structures based on scale invariant feature transform (SIFT). Likelihood scores calculated using feature-based morphometry is used as the criterion to categorize image features into three classes (healthy, patient and noise). Regional SVMs are trained to classify the three types of image features in different brain regions. Only healthy and patient features are used to predict the disease status of new brain images. An ensemble classifier is built from the regional SVMs to obtain better prediction accuracy. We apply this approach to 3D MR images of Alzheimer's disease, Parkinson's disease and bipolar disorder. The classification accuracy ranges between 70% and 87%. The highly predictive disease-related regions, which represent significant anatomical differences between the healthy and diseased, are shown in heat maps. The common and disease-specific brain regions are identified by comparing the highly predictive regions in each disease. All of the top-ranked regions are supported by literature. Thus, this approach will be a promising tool for assisting automatic diagnosis and advancing mechanism studies of neurological and psychiatric diseases. Copyright © 2013 Elsevier B.V. All rights reserved.

  13. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    Science.gov (United States)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  14. Machine Learning Classification of Buildings for Map Generalization

    Directory of Open Access Journals (Sweden)

    Jaeeun Lee

    2017-10-01

    A critical problem in mapping data is the frequent updating of large data sets. To solve this problem, the updating of small-scale data based on large-scale data is very effective. Various map generalization techniques, such as simplification, displacement, typification, elimination, and aggregation, must therefore be applied. In this study, we focused on the elimination and aggregation of the building layer, for which each building in a large scale was classified as “0-eliminated,” “1-retained,” or “2-aggregated.” Machine-learning classification algorithms were then used for classifying the buildings. The data of 1:1000 scale and 1:25,000 scale digital maps obtained from the National Geographic Information Institute were used. We applied to these data various machine-learning classification algorithms, including naive Bayes (NB), decision tree (DT), k-nearest neighbor (k-NN), and support vector machine (SVM). The overall accuracies of each algorithm were satisfactory: DT, 88.96%; k-NN, 88.27%; SVM, 87.57%; and NB, 79.50%. Although elimination is a direct part of the proposed process, generalization operations, such as simplification and aggregation of polygons, must still be performed for buildings classified as retained and aggregated. Thus, these algorithms can be used for building classification and can serve as preparatory steps for building generalization.

  15. Support Vector Machine Classification of Drunk Driving Behaviour

    Directory of Open Access Journals (Sweden)

    Huiqin Chen

    2017-01-01

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN), the root mean square value of the difference of the adjacent R–R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.
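
    Two of the physiological features named in this abstract, SDNN and RMSSD, have simple definitions over a series of R–R intervals. The sketch below assumes intervals in milliseconds and uses the sample standard deviation for SDNN; the study's exact conventions are not given in the abstract, and the values are made up.

```python
import math
import statistics


def sdnn(rr_ms):
    """Standard deviation of the R-R intervals (sample standard deviation assumed)."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of the differences between adjacent R-R intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-R intervals in milliseconds.
rr = [800, 810, 790, 800]
sdnn_value = sdnn(rr)     # sqrt(200 / 3), about 8.16 ms
rmssd_value = rmssd(rr)   # sqrt(200), about 14.14 ms
```

    SDNN reflects overall variability, while RMSSD emphasises beat-to-beat changes, so the two can rank quite differently as features.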

  16. Geographical classification of apple based on hyperspectral imaging

    Science.gov (United States)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    The geographical origin of apples is often recognized and appreciated by consumers. It is usually an important factor in determining the price of a commercial product. In this work, hyperspectral imaging technology and supervised pattern recognition were used to discriminate apples according to geographical origin. Hyperspectral images of 207 Fuji apple samples were collected by a hyperspectral camera (400-1000 nm). Principal component analysis (PCA) was performed on the hyperspectral imaging data to determine the main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on the gray level co-occurrence matrix (GLCM) from the dominant waveband image. All characteristic variables were obtained by fusing the data of images in the efficient spectra. A support vector machine (SVM) was used to construct the classification model, and showed excellent performance in the classification results. The total classification accuracy was 92.75% in the training set and 89.86% in the prediction set. The overall results demonstrated that the hyperspectral imaging technique coupled with an SVM classifier can be efficiently utilized to discriminate Fuji apples according to geographical origin.
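
    Texture features based on the gray level co-occurrence matrix (GLCM) can be sketched as follows. The offset (one pixel to the right), the number of gray levels, and the two descriptors shown (contrast and energy) are illustrative choices; the abstract does not state which GLCM settings or descriptors were used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts) or 1  # avoid division by zero
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2: large when neighbouring gray levels differ."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def energy(p):
    """Sum of squared entries: large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Toy 2 x 3 image with two gray levels.
p = glcm([[0, 0, 1], [1, 1, 0]], levels=2)
```

    Each descriptor reduces the matrix to one number, so a small set of them per waveband image yields a compact feature vector for the classifier.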

  17. CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

    Directory of Open Access Journals (Sweden)

    V. A. Ayma

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA’s machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  18. Support Vector Machine Classification of Drunk Driving Behaviour.

    Science.gov (United States)

    Chen, Huiqin; Chen, Lei

    2017-01-23

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R-R intervals (SDNN), the root mean square value of the difference of the adjacent R-R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  19. Object-Based Urban Tree Species Classification Using Bi-Temporal WorldView-2 and WorldView-3 Images

    Directory of Open Access Journals (Sweden)

    Dan Li

    2015-12-01

    Urban tree species mapping is an important prerequisite to understanding the value of urban vegetation in ecological services. In this study, we explored the potential of bi-temporal WorldView-2 (WV2, acquired on 14 September 2012) and WorldView-3 images (WV3, acquired on 18 October 2014) for identifying five dominant urban tree species with the object-based Support Vector Machine (SVM) and Random Forest (RF) methods. Two study areas in Beijing, China, Capital Normal University (CNU) and Beijing Normal University (BNU), representing the typical urban environment, were evaluated. Three classification schemes (classification based solely on WV2, solely on WV3, and on bi-temporal WV2 and WV3 images) were examined. Our study showed that the single-date image did not produce satisfying classification results as both producer and user accuracies of tree species were relatively low (44.7%–82.5%), whereas those derived from bi-temporal images were on average 10.7% higher. In addition, the overall accuracy increased substantially (9.7%–20.2% for the CNU area and 4.7%–12% for BNU). A thorough analysis concluded that near-infrared 2, red-edge and green bands are always more important than the other bands to classification, and spectral features always contribute more than textural features. Our results also showed that the scattered distribution of trees and a more complex surrounding environment reduced classification accuracy. Comparisons between SVM and RF classifiers suggested that SVM is more effective for urban tree species classification as it outperforms RF when working with a smaller amount and imbalanced distribution of samples.
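
    Producer, user and overall accuracies, as reported in this abstract, are read off a confusion matrix. The sketch below assumes rows hold the reference classes and columns the predicted classes; the matrix values and the function name are made up for illustration.

```python
def map_accuracies(matrix):
    """Producer, user and overall accuracy from a square confusion matrix.

    Rows are reference (ground truth) classes, columns are predicted classes.
    """
    n = len(matrix)
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    correct = [matrix[i][i] for i in range(n)]
    # Producer accuracy: share of reference samples of a class mapped correctly.
    producer = [correct[i] / row_totals[i] for i in range(n)]
    # User accuracy: share of samples mapped to a class that truly belong to it.
    user = [correct[i] / col_totals[i] for i in range(n)]
    overall = sum(correct) / sum(row_totals)
    return producer, user, overall


# Hypothetical two-species confusion matrix.
producer, user, overall = map_accuracies([[8, 2], [1, 9]])
```

    In this toy matrix the first species has a producer accuracy of 8/10 but a user accuracy of 8/9, which is why both are usually reported per class alongside the overall accuracy.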

  20. An up-to-date comparison of state-of-the-art classification algorithms

    KAUST Repository

    Zhang, Chongsheng

    2017-04-05

    Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

  1. Feature selection and classification methodology for the detection of knee-joint disorders.

    Science.gov (United States)

    Nalband, Saif; Sundar, Aditya; Prince, A Amalin; Agarwal, Anita

    2016-04-01

    Vibroarthrographic (VAG) signals emitted from the knee joint provide an early diagnostic tool for knee joint disorders. The nonstationary and nonlinear nature of the VAG signal makes feature extraction an important aspect. In this work, we investigate VAG signals by proposing a wavelet based decomposition. The VAG signals are decomposed into sub-band signals of different frequencies. Nonlinear features such as recurrence quantification analysis (RQA), approximate entropy (ApEn) and sample entropy (SampEn) are extracted as features of the VAG signal. A total of twenty-four features form a vector to characterize a VAG signal. Two feature selection (FS) techniques, the apriori algorithm and the genetic algorithm (GA), select six and four features, respectively, as the most significant features. Least square support vector machines (LS-SVM) and random forest are proposed as classifiers to evaluate the performance of the FS techniques. Results indicate that the classification accuracy was higher with features selected by the FS algorithms. LS-SVM using the apriori algorithm gives the highest accuracy of 94.31% with a false discovery rate (FDR) of 0.0892. The proposed work also provided better classification accuracy than that reported in previous studies, which gave an accuracy of 88%. This work can enhance the performance of existing technology for accurately distinguishing normal and abnormal VAG signals, and the proposed methodology could provide an effective non-invasive diagnostic tool for knee joint disorders. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
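
    Sample entropy (SampEn), one of the nonlinear features listed in this abstract, can be sketched as below. This follows the common textbook definition with the Chebyshev distance and self-matches excluded; the default embedding dimension m = 2 and the absolute tolerance r = 0.2 are assumptions (r is often set relative to the signal's standard deviation), and the paper's settings are not given in the abstract.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B) with Chebyshev distance, self-matches excluded."""
    n = len(signal)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths,
        # so the two counts are directly comparable.
        templates = [signal[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    matches += 1
        return matches

    b = count_matches(m)       # matching template pairs of length m
    a = count_matches(m + 1)   # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined when no matches are found
    return -math.log(a / b)
```

    Perfectly regular signals, such as a constant or a strictly alternating series, give a sample entropy of zero, while less predictable signals give larger values.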

  2. Fault diagnosis method based on FFT-RPCA-SVM for Cascaded-Multilevel Inverter.

    Science.gov (United States)

    Wang, Tianzhen; Qi, Jie; Xu, Hao; Wang, Yide; Liu, Lei; Gao, Diju

    2016-01-01

    Thanks to reduced switch stress, high quality of load wave, easy packaging and good extensibility, the cascaded H-bridge multilevel inverter is widely used in wind power systems. To guarantee stable operation of the system, a new fault diagnosis method, based on Fast Fourier Transform (FFT), Relative Principle Component Analysis (RPCA) and Support Vector Machine (SVM), is proposed for the H-bridge multilevel inverter. To avoid the influence of load variation on fault diagnosis, the output voltages of the inverter are chosen as the fault characteristic signals. To shorten the time of diagnosis and improve the diagnostic accuracy, the main features of the fault characteristic signals are extracted by FFT. To further reduce the training time of the SVM, the feature vector is reduced based on RPCA, which yields a lower dimensional feature space. The fault classifier is constructed via SVM. An experimental prototype of the inverter is built to test the proposed method. Compared to other fault diagnosis methods, the experimental results demonstrate the high accuracy and efficiency of the proposed method. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
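
    The spectral feature extraction step of this abstract can be illustrated with the magnitudes of the discrete Fourier transform of a voltage signal. For brevity the sketch uses a direct O(n^2) DFT, which gives the same values as an FFT; the signal, its length and the function name are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0 .. n/2, scaled by 1/n (naive DFT, same values as an FFT)."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        magnitudes.append(abs(coeff) / n)
    return magnitudes


# Hypothetical signal: a unit sine completing three cycles in 32 samples.
signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
features = dft_magnitudes(signal)
dominant_bin = features.index(max(features))
```

    The magnitude vector, or a few of its dominant bins, would then serve as the feature vector that is reduced and passed to the classifier.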

  3. Epileptic Seizure Classification of EEGs using Time-Frequency Analysis based Multiscale Radial Basis Functions.

    Science.gov (United States)

    Li, Yang; Wang, Xu; Luo, Lin; Li, Ke; Yang, Xiao; Guo, Qi

    2017-03-10

    The automatic detection of epileptic seizures from electroencephalography (EEG) signals is crucial for the localization and classification of epileptic seizure activity. However, seizure processes are typically dynamic and nonstationary, and thus distinguishing rhythmic discharges from nonstationary processes is one of the challenging problems. In this paper, an adaptive and localized time-frequency representation of EEG signals is proposed by means of multiscale radial basis functions (MRBF) and a modified particle swarm optimization (MPSO) to improve both time and frequency resolution simultaneously, which is a novel MRBF-MPSO framework for time-frequency feature extraction from epileptic EEG signals. The dimensionality of the extracted features can be greatly reduced by the principal component analysis (PCA) algorithm before the most discriminative features are fed into an SVM classifier with the radial basis function (RBF) kernel in order to separate epileptic seizure from seizure-free EEG signals. The classification performance of the proposed method has been compared with several state-of-the-art feature extraction algorithms and five other classifiers, such as linear discriminant analysis (LDA) and logistic regression (LR). The experimental results indicate that the proposed MRBF-MPSO-SVM classification method outperforms competing techniques in terms of classification accuracy, and show the effectiveness of the proposed method for classification of seizure epochs and seizure-free epochs.
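
    For readers unfamiliar with the RBF kernel used by the SVM stage, a minimal sketch follows. It uses the standard Gaussian form K(x, y) = exp(-gamma * ||x - y||^2); the function names, the gamma value and the feature vectors are hypothetical and are not taken from the paper.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(points, gamma):
    """Pairwise kernel values, the quantity an SVM solver works with."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

# Hypothetical PCA-reduced feature vectors for three EEG epochs.
epochs = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
gram = gram_matrix(epochs, gamma=0.5)
```

    Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart, at a rate set by gamma.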

  4. Evaluation of feature selection algorithms for classification in temporal lobe epilepsy based on MR images

    Science.gov (United States)

    Lai, Chunren; Guo, Shengwen; Cheng, Lina; Wang, Wensheng; Wu, Kai

    2017-02-01

    It is important to differentiate temporal lobe epilepsy (TLE) patients from healthy people and to localize the abnormal brain regions of TLE patients. The cortical features and changes can reveal the unique anatomical patterns of brain regions from structural MR images. In this study, structural MR images from 28 normal controls (NC), 18 left TLE (LTLE), and 21 right TLE (RTLE) were acquired, and four types of cortical feature, namely cortical thickness (CTh), cortical surface area (CSA), gray matter volume (GMV), and mean curvature (MCu), were explored for discriminative analysis. Three feature selection methods, the independent sample t-test filtering, the sparse-constrained dimensionality reduction model (SCDRM), and the support vector machine-recursive feature elimination (SVM-RFE), were investigated to extract dominant regions with significant differences among the compared groups for classification using the SVM classifier. The results showed that the SVM-RFE achieved the highest performance (most classifications with more than 92% accuracy), followed by the SCDRM and the t-test. In particular, the surface area and gray matter volume exhibited prominent discriminative ability, and the performance of the SVM was improved significantly when the four cortical features were combined. Additionally, the dominant regions with higher classification weights were mainly located in the temporal and frontal lobes, including the inferior temporal, entorhinal cortex, fusiform, parahippocampal cortex, middle frontal and frontal pole. It was demonstrated that the cortical features provided effective information to determine the abnormal anatomical pattern and that the proposed method has the potential to improve the clinical diagnosis of TLE.

  5. Binary classification of multichannel-EEG records based on the ϵ-complexity of continuous vector functions.

    Science.gov (United States)

    Piryatinska, Alexandra; Darkhovsky, Boris; Kaplan, Alexander

    2017-12-01

    A crucial step in a classification of electroencephalogram (EEG) records is the feature selection. The feature selection problem is difficult because of the complex structure of EEG signals. To classify the EEG signals with good accuracy, most of the recently published studies have used high-dimensional feature spaces. Our objective is to create a low-dimensional feature space that enables binary classification of EEG records. The proposed approach is based on our theory of the ϵ-complexity of continuous functions, which is extended here (see Appendix) to the case of vector functions. This extension permits us to handle multichannel-EEG records. The method consists of two steps. Firstly, we estimate the ϵ-complexity coefficients of the original signal and its finite differences. Secondly, we utilize the random forest (RF) or support vector machine (SVM) classifier. We demonstrated the performance of our method on simulated data. We also applied it to the problem of classification of multichannel-EEG records related to a group of healthy adolescents (39 subjects) and a group of adolescents with schizophrenia (45 subjects). We found that the random forest classifier provides a superior result. In particular, out-of-bag accuracy in the case of RF was 85.3%. Using 10-fold cross-validation (CV), RF gave an average accuracy of 84.5% on a test set, whereas SVM gave an accuracy of 81.07%. We note that the highest accuracy on CV was 89.3%. To compare our method with the classical approach, we performed classification using the spectral features. In this case, the best performance was achieved using seven-dimensional feature space, with an average accuracy of 83.6%. We developed a model-free method for binary classification of EEG records. The feature space was reduced to four dimensions. The results obtained indicate the effectiveness of the proposed method. Copyright © 2017 Elsevier B.V. All rights reserved.
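
    The 10-fold cross-validation mentioned above can be sketched as an index split. The helper name is an assumption, and the folds here are contiguous and unshuffled for simplicity; the abstract does not say how the authors drew their folds, so in practice the subject order would normally be shuffled or stratified first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# 84 subjects in total: 39 healthy adolescents plus 45 with schizophrenia.
folds = list(k_fold_indices(84, 10))
test_sizes = [len(test) for _, test in folds]
```

    Each subject appears in exactly one test fold, and the reported accuracy is the average over the ten held-out folds.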

  6. SVM-based automatic diagnosis method for keratoconus

    Science.gov (United States)

    Gao, Yuhong; Wu, Qiang; Li, Jing; Sun, Jiande; Wan, Wenbo

    2017-06-01

    Keratoconus is a progressive cornea disease that can lead to serious myopia and astigmatism, or even to corneal transplantation, if it becomes worse. The early detection of keratoconus is extremely important to know and control its condition. In this paper, we propose an automatic diagnosis algorithm for keratoconus to discriminate the normal eyes and keratoconus ones. We select the parameters obtained by Oculyzer as the feature of cornea, which characterize the cornea both directly and indirectly. In our experiment, 289 normal cases and 128 keratoconus cases are divided into training and test sets respectively. Far better than other kernels, the linear kernel of SVM has sensitivity of 94.94% and specificity of 97.87% with all the parameters training in the model. In single parameter experiment of linear kernel, elevation with 92.03% sensitivity and 98.61% specificity and thickness with 97.28% sensitivity and 97.82% specificity showed their good classification abilities. Combining elevation and thickness of the cornea, the proposed method can reach 97.43% sensitivity and 99.19% specificity. The experiments demonstrate that the proposed automatic diagnosis method is feasible and reliable.
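
    Sensitivity and specificity, the two figures reported throughout this abstract, are computed from confusion-matrix counts as sketched below. The counts are hypothetical; the abstract does not give the size of the test set or the per-class counts.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased cases that the classifier flags (true positive rate)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of normal cases that the classifier clears (true negative rate)."""
    return true_neg / (true_neg + false_pos)

# Hypothetical test-set counts, chosen only to make the arithmetic easy to follow.
tp, fn = 90, 10    # keratoconus eyes: correctly flagged / missed
tn, fp = 190, 10   # normal eyes: correctly cleared / wrongly flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```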

  7. A Realistic Seizure Prediction Study Based on Multiclass SVM.

    Science.gov (United States)

    Direito, Bruno; Teixeira, César A; Sales, Francisco; Castelo-Branco, Miguel; Dourado, António

    2017-05-01

    A patient-specific algorithm, for epileptic seizure prediction, based on multiclass support-vector machines (SVM) and using multi-channel high-dimensional feature sets, is presented. The feature sets, combined with multiclass classification and post-processing schemes aim at the generation of alarms and reduced influence of false positives. This study considers 216 patients from the European Epilepsy Database, and includes 185 patients with scalp EEG recordings and 31 with intracranial data. The strategy was tested over a total of 16,729.80 h of inter-ictal data, including 1206 seizures. We found an overall sensitivity of 38.47% and a false positive rate per hour of 0.20. The performance of the method achieved statistical significance in 24 patients (11% of the patients). Despite the encouraging results previously reported in specific datasets, the prospective demonstration on long-term EEG recording has been limited. Our study presents a prospective analysis of a large heterogeneous, multicentric dataset. The statistical framework based on conservative assumptions, reflects a realistic approach compared to constrained datasets, and/or in-sample evaluations. The improvement of these results, with the definition of an appropriate set of features able to improve the distinction between the pre-ictal and nonpre-ictal states, hence minimizing the effect of confounding variables, remains a key aspect.

  8. A novel stepwise support vector machine (SVM) method based on ...

    African Journals Online (AJOL)

    ajl yemi

    2011-11-23

    Nov 23, 2011 ... began to use computational approaches, particularly machine learning methods to identify pre-miRNAs (Xue et al., 2005; Huang et al., 2007; Jiang et al., 2007). Xue et al. (2005) presented a support vector machine (SVM)- based classifier called triplet-SVM, which classifies human pre-miRNAs from pseudo ...

  9. Estimating grassland biomass using SVM band shaving of hyperspectral data

    NARCIS (Netherlands)

    Clevers, J.G.P.W.; Heijden, van der G.W.A.M.; Verzakov, S.; Schaepman, M.E.

    2007-01-01

    In this paper, the potential of a band shaving algorithm based on support vector machines (SVM) applied to hyperspectral data for estimating biomass within grasslands is studied. Field spectrometer data and biomass measurements were collected from a homogeneously managed grassland field. The SVM

  10. Power line identification of millimeter wave radar based on PCA-GS-SVM

    Science.gov (United States)

    Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

    2017-12-01

    To address the problem that existing detection methods cannot effectively ensure the safety of UAV flight at ultra-low altitude in the presence of power lines, a power line recognition method based on grid search (GS), principal component analysis and support vector machine (PCA-SVM) is proposed. First, the dimensionality of the candidate lines from the Hough transform is reduced by PCA, and the main features of the candidate lines are extracted. Then, the support vector machine (SVM) is optimized by the grid search (GS) method. Finally, the support vector machine classifier with the optimized parameters is used to classify the candidate lines. MATLAB simulation results show that this method can effectively distinguish power lines from noise, and has high recognition accuracy and algorithm efficiency.
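
    The grid search step can be sketched as an exhaustive loop over candidate parameter values. The scoring function below is a hypothetical stand-in for cross-validated SVM accuracy, and the parameter names C and gamma and their candidate values are assumptions; the abstract does not list the parameters or ranges the authors searched, and their implementation was in MATLAB.

```python
import itertools
import math

def grid_search(score_fn, grid):
    """Return (best_params, best_score) over the Cartesian product of the grid."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(C, gamma):
    # Hypothetical accuracy surface that peaks at C = 10, gamma = 0.1.
    return 1.0 - 0.1 * abs(math.log10(C) - 1) - 0.1 * abs(math.log10(gamma) + 1)

grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
best_params, best_score = grid_search(toy_score, grid)
```

    In a real pipeline, the scoring function would train the SVM on the PCA-reduced features and return a validation accuracy for each parameter pair.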

  11. Parameter optimization using GA in SVM to predict damage level of non-reshaped berm breakwater.

    Digital Repository Service at National Institute of Oceanography (India)

    Harish, N.; Lokesha.; Mandal, S.; Rao, S.; Patil, S.G.

    In the present study, Support Vector Machines (SVM) and hybrid of Genetic Algorithm (GA) with SVM models are developed to predict the damage level of non-reshaped berm breakwaters. Optimal kernel parameters of SVM are determined by using GA...

  12. Accuracy of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) as a research tool for identification of patients with uveitis and scleritis.

    Science.gov (United States)

    Uchiyama, Eduardo; Faez, Sepideh; Nasir, Humzah; Unizony, Sebastian H; Plenge, Robert; Papaliodis, George N; Sobrin, Lucia

    2015-04-01

    To report on the accuracy of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for identifying patients with polymyalgia rheumatica (PMR) and concurrent noninfectious inflammatory ocular conditions in a large healthcare organization database. Queries for patients with PMR and uveitis or scleritis were executed in two general teaching hospitals' databases. Patients with ocular infections or other rheumatologic conditions were excluded. Patients with PMR and ocular inflammation were identified, and medical records were reviewed to confirm accuracy. The query identified 10,697 patients with the ICD-9-CM code for PMR and 4154 patients with the codes for noninfectious inflammatory ocular conditions. The number of patients with both PMR and noninfectious uveitis or scleritis by ICD-9-CM codes was 66. On detailed review of the charts of these 66 patients, 31 (47%) had a clinical diagnosis of PMR, 43 (65%) had noninfectious uveitis or scleritis, and only 20 (30%) had PMR with concurrent noninfectious uveitis or scleritis confirmed based on clinical notes. While the use of ICD-9-CM codes has been validated for medical research of common diseases, our results suggest that ICD-9-CM codes may be of limited value for epidemiological investigations of diseases which can be more difficult to diagnose. The ICD-9-CM codes for rarer diseases (PMR, uveitis and scleritis) did not reflect the true clinical problem in a large proportion of our patients. This is particularly true when coding is performed by physicians outside the area of specialty of the diagnosis.
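
    The percentages in this abstract are the share of code-identified patients whose diagnosis was confirmed on chart review, that is, the positive predictive value of the codes. The arithmetic can be checked with the counts given above; the function and variable names are illustrative only.

```python
def positive_predictive_value(confirmed, flagged):
    """Share of code-flagged patients whose diagnosis was confirmed on review."""
    return confirmed / flagged

flagged = 66  # patients with codes for both PMR and noninfectious uveitis or scleritis

rates = {
    "PMR": positive_predictive_value(31, flagged),
    "uveitis_or_scleritis": positive_predictive_value(43, flagged),
    "both": positive_predictive_value(20, flagged),
}

# Rounded to whole percentages, as reported in the abstract.
percentages = {name: round(100 * value) for name, value in rates.items()}
```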

  13. Automatic detection and classification of leukocytes using convolutional neural networks.

    Science.gov (United States)

    Zhao, Jianwei; Zhang, Minshu; Zhou, Zhenghua; Chu, Jianjun; Cao, Feilong

    2017-08-01

    The detection and classification of white blood cells (WBCs, also known as leukocytes) is a hot issue because of its important applications in disease diagnosis. Nowadays the morphological analysis of blood cells is performed manually by skilled operators, which results in some drawbacks such as slowness of the analysis, a non-standard accuracy, and dependence on the operator's skills. Although there have been many papers studying the detection of WBCs or the classification of WBCs independently, few papers consider them together. This paper proposes an automatic detection and classification system for WBCs from peripheral blood images. It first proposes an algorithm to detect WBCs from microscope images based on a simple relation between the R and B color channels and a morphological operation. Then a granularity feature (pairwise rotation invariant co-occurrence local binary pattern, PRICoLBP feature) and SVM are applied to separate eosinophils and basophils from the other WBCs. Lastly, convolutional neural networks are used to extract high-level features from WBCs automatically, and a random forest is applied to these features to recognize the other three kinds of WBCs: neutrophil, monocyte and lymphocyte. Detection experiments on the Cellavision database and the ALL-IDB database show that the proposed detection method generally performs better than the iterative threshold method while taking less time, and classification experiments show that the proposed classification method generally has better accuracy than some other methods.

  14. A Selective Ensemble Classification Method Combining Mammography Images with Ultrasound Images for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Jinyu Cong

    2017-01-01

    Breast cancer is one of the main diseases that threaten women’s lives. Early detection and diagnosis of breast cancer play an important role in reducing breast cancer mortality. In this paper, we propose a selective ensemble method integrating KNN, SVM, and Naive Bayes to diagnose breast cancer by combining ultrasound images with mammography images. Our experimental results show that the selective classification method, with an accuracy of 88.73% and a sensitivity of 97.06%, is efficient for breast cancer diagnosis. In addition, indicator R presents a new way to choose the base classifier for ensemble learning.
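
    As a simplified illustration of combining the three base classifiers, the sketch below uses plain majority voting. This is a stand-in and not the paper's selective ensemble: the abstract does not define indicator R or the selection rule, and the prediction lists are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample by majority."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Hypothetical predictions for four cases (1 = malignant, 0 = benign).
knn_pred = [1, 0, 1, 1]
svm_pred = [1, 1, 0, 1]
nb_pred = [0, 0, 1, 1]

ensemble_pred = majority_vote([knn_pred, svm_pred, nb_pred])
```

    With three binary classifiers a tie cannot occur, which is one reason an odd number of base classifiers is convenient for this kind of vote.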

  15. Comparative Study of Data Classification Methods Between EEG and ECoG Used to BCI

    Directory of Open Access Journals (Sweden)

    Yu Ge

    2014-09-01

    Effective decoding of the source signal is key to improving brain-computer interface (BCI) performance. Two groups of motor imagery (MI) data based on electroencephalograms (EEG) and electrocorticograms (ECoG), provided by the International Brain-Computer Interface Competition organization, are analyzed, and it is concluded that ECoG signal processing is more suitable for model-driven approaches. Temporal-frequency features were extracted by the model-driven method instead of the data-driven method, compared, and classified by a support vector machine (SVM). The results show a 6% improvement in motor imagery classification accuracy on ECoG data compared with the data-driven method.

  16. Diagnostic Accuracy of the FIGO and the 5-Tier Fetal Heart Rate Classification Systems in the Detection of Neonatal Acidemia.

    Science.gov (United States)

    Martí Gamboa, Sabina; Giménez, Olga Redrado; Mancho, Jara Pascual; Moros, María Lapresta; Sada, Julia Ruiz; Mateo, Sergio Castan

    2017-04-01

    Objective: The objective of this study was to determine ability to detect neonatal acidemia and interobserver agreement with the FIGO 3-tier and 5-tier fetal heart rate (FHR) classification systems. Design: This was a case-control study. Setting: This study was set at the University Medical Center. Population: A total of 202 FHR tracings of 102 women who delivered an acidemic fetus (umbilical arterial cord gas pH ≤ 7.10 and BE  7.10) were assessed. A subanalysis was performed for those fetuses who suffered severe metabolic acidemia (pH ≤ 7.0 and BE < - 12). Methods: Two reviewers blind to clinical and outcome data classified tracings according to the new 3-tier system proposed by the FIGO and the 5-tier system proposed by Parer and Ikeda. Main Outcome Measures: Sensitivity and specificity for detecting neonatal acidemia and interobserver agreement in classifying FHR tracings into categories of both systems were studied. Results: The 3-tier system showed a greater sensitivity and lower specificity to detect neonatal acidemia (43.6% sensitivity, 82.5% specificity) and severe metabolic acidemia (71.4% sensitivity, 74.0% specificity) compared with the 5-tier system (36.3% sensitivity, 88% specificity and 61.9% sensitivity, 80.1% specificity, respectively). Both systems were compared by area under the receiver-operating characteristic curve, with comparable predictive ability for detecting neonatal acidemia (FIGO-area under the curve [AUC]: 0.63 [95% confidence interval [CI]: 0.57-0.68] and Parer-AUC: 0.62 [95% CI: 0.56-0.67]). Interobserver agreement was moderate for both systems, but performance at each specific category showed a better agreement for the 5-tier system identifying a pathological tracing (orange or red, κ: 0.625 vs. pathological category, κ: 0.538). Conclusion: Both systems presented a comparable ability to predict neonatal acidemia, although the 5-tier system showed a better interobserver agreement identifying pathological

  17. A classification framework for lung tissue categorization

    Science.gov (United States)

    Depeursinge, Adrien; Iavindrasana, Jimison; Hidki, Asmâa; Cohen, Gilles; Geissbuhler, Antoine; Platon, Alexandra; Poletti, Pierre-Alexandre; Müller, Henning

    2008-03-01

    We compare five common classifier families in their ability to categorize six lung tissue patterns in high-resolution computed tomography (HRCT) images of patients affected with interstitial lung diseases (ILD) but also normal tissue. The evaluated classifiers are Naive Bayes, k-Nearest Neighbor (k-NN), J48 decision trees, Multi-Layer Perceptron (MLP) and Support Vector Machines (SVM). The dataset used contains 843 regions of interest (ROI) of healthy and five pathologic lung tissue patterns identified by two radiologists at the University Hospitals of Geneva. Correlation of the feature space composed of 39 texture attributes is studied. A grid search for optimal parameters is carried out for each classifier family. Two complementary metrics are used to characterize the performances of classification. Those are based on McNemar's statistical tests and global accuracy. SVM reached best values for each metric and allowed a mean correct prediction rate of 87.9% with high class-specific precision on testing sets of 423 ROIs.
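
    McNemar's test, one of the two metrics named above, compares two classifiers on the same test cases using only the cases where they disagree. The sketch below computes the chi-square statistic with the usual continuity correction; the disagreement counts are hypothetical and are not taken from the paper.

```python
def mcnemar_statistic(b, c, continuity=True):
    """Chi-square statistic (1 degree of freedom) for paired classifier errors.

    b: cases classifier A got right and classifier B got wrong
    c: cases classifier A got wrong and classifier B got right
    """
    if b + c == 0:
        return 0.0
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)

# Hypothetical disagreement counts between two classifiers on one test set.
statistic = mcnemar_statistic(b=25, c=10)

# 3.841 is the chi-square critical value for 1 degree of freedom at the 5% level.
significant = statistic > 3.841
```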

  18. Forest type classification with combination of advanced polarimetric decompositions and textures of L-band synthetic aperture radar data

    Science.gov (United States)

    Middinti, Suresh; Jha, Chandra Shekhar; Reddy, Thatiparthi Byragi

    2017-01-01

    Information on the distribution of forest types and land cover classes is essential for decision making and significant in climate regulation, biodiversity conservation, and societal issues. An approach for the combination of advanced polarimetric decompositions and textures of Advanced Land Observing Satellite Phased Array L-band Synthetic Aperture Radar full polarimetric data for the purpose of forest type classification is proposed. Using a support vector machine (SVM) classifier, we classified forest types over a selected Indian region. Further, we tested the classification performance of the Wishart method for the same forest types. The classified results were assessed with confusion matrix-based statistics. The results suggest that incorporation of various polarimetric decomposition features into gray-level co-occurrence matrix textures improves the SVM classification overall accuracy (OA) from 73.82% (kappa=0.69) to 76.34% (kappa=0.72). The Wishart supervised classification algorithm has an OA of 73.38% (kappa=0.68). We observed that integration of polarimetric information with textures can give complementary information in forest type discrimination and produce high accuracy maps. Further, this approach overcomes the limitations of optical remote sensing data in areas of continuous cloud coverage.
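
    Overall accuracy and the kappa coefficient reported above are both derived from the confusion matrix. A minimal sketch follows, using the standard Cohen's kappa definition and a hypothetical two-class matrix; the paper's actual confusion matrices are not given in the abstract.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def cohens_kappa(matrix):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement from the products of matching row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(len(matrix))
    ) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical confusion matrix (rows: reference class, columns: predicted class).
confusion = [[40, 10],
             [5, 45]]

oa = overall_accuracy(confusion)
kappa = cohens_kappa(confusion)
```

    Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance alone.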

  19. Improving Classification of Airborne Laser Scanning Echoes in the Forest-Tundra Ecotone Using Geostatistical and Statistical Measures

    Directory of Open Access Journals (Sweden)

    Nadja Stumberg

    2014-05-01

    The vegetation in the forest-tundra ecotone zone is expected to be highly affected by climate change and requires effective monitoring techniques. Airborne laser scanning (ALS) has been proposed as a tool for the detection of small pioneer trees for such vast areas using laser height and intensity data. The main objective of the present study was to assess a possible improvement in the performance of classifying tree and nontree laser echoes from high-density ALS data. The data were collected along a 1000 km long transect stretching from southern to northern Norway. Different geostatistical and statistical measures derived from laser height and intensity values were used to extend and potentially improve simpler models that ignore the spatial context. Generalised linear models (GLM) and support vector machines (SVM) were employed as classification methods. Total accuracies and Cohen’s kappa coefficients were calculated and compared to those of simpler models from a previous study. For both classification methods, all models revealed total accuracies similar to the results of the simpler models. Concerning classification performance, however, the comparison of the kappa coefficients indicated a significant improvement for some models both using GLM and SVM, with classification accuracies >94%.

  20. Two-Class Weather Classification.

    Science.gov (United States)

    Lu, Cewu; Lin, Di; Jia, Jiaya; Tang, Chi-Keung

    2017-12-01

    Given a single outdoor image, we propose a collaborative learning approach using novel weather features to label the image as either sunny or cloudy. Though limited, this two-class classification problem is by no means trivial given the great variety of outdoor images captured by different cameras where the images may have been edited after capture. Our overall weather feature combines the data-driven convolutional neural network (CNN) feature and well-chosen weather-specific features. They work collaboratively within a unified optimization framework that is aware of the presence (or absence) of a given weather cue during learning and classification. In this paper we propose a new data augmentation scheme to substantially enrich the training data, which is used to train a latent SVM framework to make our solution insensitive to global intensity transfer. Extensive experiments are performed to verify our method. Compared with our previous work and the sole use of a CNN classifier, this paper improves the accuracy up to 7-8 percent. Our weather image dataset is available together with the executable of our classifier.

  1. SUPPORT VECTOR MACHINE CLASSIFICATION OF OBJECT-BASED DATA FOR CROP MAPPING, USING MULTI-TEMPORAL LANDSAT IMAGERY

    Directory of Open Access Journals (Sweden)

    R. Devadas

    2012-07-01

    Crop mapping and time series analysis of agronomic cycles are critical for monitoring land use and land management practices, and analysing the issues of agro-environmental impacts and climate change. Multi-temporal Landsat data can be used to analyse decadal changes in cropping patterns at field level, owing to its medium spatial resolution and historical availability. This study attempts to develop robust remote sensing techniques, applicable across a large geographic extent, for state-wide mapping of cropping history in Queensland, Australia. In this context, traditional pixel-based classification was analysed in comparison with image object-based classification using advanced supervised machine-learning algorithms such as Support Vector Machine (SVM). For the Darling Downs region of southern Queensland we gathered a set of Landsat TM images from the 2010–2011 cropping season. Landsat data, along with the vegetation index images, were subjected to multiresolution segmentation to obtain polygon objects. Object-based methods enabled the analysis of aggregated sets of pixels, and exploited shape-related and textural variation, as well as spectral characteristics. SVM models were chosen after examining three shape-based parameters, twenty-three textural parameters and ten spectral parameters of the objects. We found that the object-based methods were superior to the pixel-based methods for classifying 4 major landuse/land cover classes, considering the complexities of within-field spectral heterogeneity and spectral mixing. Comparative analysis clearly revealed that higher overall classification accuracy (95%) was observed in the object-based SVM compared with that of traditional pixel-based classification (89%) using the maximum likelihood classifier (MLC). Object-based classification also resulted in speckle-free images. Further, object-based SVM models were used to classify different broadacre crop types for summer and winter seasons. The influence of

  2. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Directory of Open Access Journals (Sweden)

    QingJun Song

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, and the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and a caving dataset with 10 feature variables and three classes is obtained. The best combination of feature variables can be decided automatically by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problems, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct the MEB-SVM classifier in coal-rock recognition, where the data exhibit an inherently complex distribution. The proposed method is examined on UCI data sets and the caving dataset, and compared with several recent SVM classifiers. We conduct experiments with accuracy and the Friedman test to compare multiple classifiers over multiple UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show better performance, indicating that the feature selection and multi-class recognition are promising for coal-rock recognition.

  3. A hybrid feature selection method using multiclass SVM for diagnosis of erythemato-squamous disease

    Science.gov (United States)

    Maryam, Setiawan, Noor Akhmad; Wahyunggoro, Oyas

    2017-08-01

    The diagnosis of erythemato-squamous disease is a complex problem in dermatology, and the disease is difficult to detect. Besides that, it is a major cause of skin cancer. Data mining in the medical field helps experts to diagnose precisely, accurately, and inexpensively. In this research, we use a data mining technique to develop a diagnosis model based on multiclass SVM with a novel hybrid feature selection method to diagnose erythemato-squamous disease. Our hybrid feature selection method, named ChiGA (Chi Square and Genetic Algorithm), uses the advantages of filter and wrapper methods to select the optimal feature subset from the original features. Chi square is used as the filter method to remove redundant features and GA as the wrapper method to select the ideal feature subset, with SVM used as the classifier. The experiment was performed with 10-fold cross-validation on the erythemato-squamous diseases dataset taken from the University of California Irvine (UCI) machine learning database. The experimental results show that the proposed model based on multiclass SVM with Chi Square and GA can give an optimum feature subset. There are 18 optimum features with 99.18% accuracy.
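
    The chi-square filter stage can be illustrated by scoring each feature with Pearson's chi-square statistic on its feature-by-class contingency table and ranking the features by that score. The tables and feature names below are hypothetical, the sketch assumes every row and column total is non-zero, and the GA wrapper stage of ChiGA is not shown.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical tables: rows are feature values, columns are disease classes.
feature_tables = {
    "feature_a": [[30, 10], [10, 30]],  # strongly associated with the class
    "feature_b": [[20, 20], [20, 20]],  # independent of the class
}

scores = {name: chi_square(table) for name, table in feature_tables.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

    A filter of this kind would keep the top-ranked features and pass them on to the wrapper stage.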

  4. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Science.gov (United States)

    Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

    2017-01-01

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, and the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and a caving dataset with 10 feature variables and three classes is obtained. The best combination of feature variables can be decided automatically by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problems, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct the MEB-SVM classifier in coal-rock recognition, where the data exhibit an inherently complex distribution. The proposed method is examined on UCI data sets and the caving dataset, and compared with several recent SVM classifiers. We conduct experiments with accuracy and the Friedman test to compare multiple classifiers over multiple UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show better performance, indicating that the feature selection and multi-class recognition are promising for coal-rock recognition.
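
    The Friedman test used to compare classifiers across data sets ranks the classifiers within each data set and tests whether the mean ranks differ. The sketch below computes the classic chi-square form of the statistic with average ranks for ties and no further tie correction; the accuracy values are hypothetical and are not results from the paper.

```python
def average_ranks(row):
    """Rank scores within one data set (1 = best); ties share the mean rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks

def friedman_statistic(scores):
    """scores[i][j] is the accuracy of classifier j on data set i."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    mean_ranks = [total / n for total in rank_sums]
    statistic = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return statistic, mean_ranks

# Hypothetical accuracies of three classifiers on four data sets.
accuracies = [
    [0.95, 0.90, 0.85],
    [0.92, 0.91, 0.80],
    [0.88, 0.86, 0.84],
    [0.97, 0.93, 0.90],
]
statistic, mean_ranks = friedman_statistic(accuracies)
```

    The statistic is compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of classifiers.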

  5. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests.

    Science.gov (United States)

    Maroco, João; Silva, Dina; Rodrigues, Ana; Guerreiro, Manuela; Santana, Isabel; de Mendonça, Alexandre

    2011-08-17

    Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, like Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p [...] classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most

  6. Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information

    Science.gov (United States)

    Jamshidpour, N.; Homayouni, S.; Safari, A.

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are the two most important issues that dramatically degrade the performance of supervised classification. The main idea of semi-supervised learning is to overcome these issues by exploiting unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.
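
    The merging of the two graph Laplacians can be sketched as follows. The abstract does not state how the merge is weighted, so this sketch assumes unnormalized Laplacians (L = D - W) and a convex combination controlled by a hypothetical parameter `mu`; the three-pixel adjacency matrices are toy data.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix."""
    n = len(weights)
    return [
        [(float(sum(weights[i])) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def joint_laplacian(l_spectral, l_spatial, mu):
    """Convex combination mu * L_spectral + (1 - mu) * L_spatial (assumed merge rule)."""
    return [
        [mu * a + (1.0 - mu) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(l_spectral, l_spatial)
    ]


# Toy graphs over three pixels: pixels 0-1 are spectrally similar, pixels 1-2 are spatial neighbors.
w_spectral = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
w_spatial = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
joint = joint_laplacian(laplacian(w_spectral), laplacian(w_spatial), mu=0.5)
```

    With this rule the joint matrix is still a valid Laplacian (each row sums to zero), and pixel 1 is connected to both of its neighbors even though neither single graph links all three pixels.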

  7. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are the two most important issues that dramatically degrade the performance of supervised classification. The main idea of semi-supervised learning is to overcome these issues by exploiting unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.

  8. Object-based land cover classification based on fusion of multifrequency SAR data and THAICHOTE optical imagery

    Science.gov (United States)

    Sukawattanavijit, Chanika; Srestasathiern, Panu

    2017-10-01

    Land Use and Land Cover (LULC) information is important for observing and evaluating environmental change. LULC classification using remotely sensed data is a technique widely employed at global and local scales, particularly in urban areas, which have diverse land cover types. These are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using high-resolution imagery. COSMO-SkyMed SAR data were fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for object-based land cover classification. This paper presents a comparison between object-based and pixel-based approaches in image fusion. For the per-pixel method, support vector machines (SVM) were applied to the fused image based on Principal Component Analysis (PCA). For the object-based method, classification was applied to the fused images to separate land cover classes using a nearest neighbor (NN) classifier. Finally, the accuracy assessment was carried out by comparing the land cover classifications generated from the fused image dataset and from the THAICHOTE image. The object-based fusion of COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As a result, object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.

  9. The accuracy of echocardiography versus surgical and pathological classification of patients with ruptured mitral chordae tendineae: a large study in a Chinese cardiovascular center

    Directory of Open Access Journals (Sweden)

    Bai Zhigang

    2011-07-01

    Background: The accuracy of echocardiography versus surgical and pathological classification of patients with ruptured mitral chordae tendineae (RMCT) has not yet been investigated in a large study. Methods: Clinical, hemodynamic, surgical, and pathological findings were reviewed for 242 patients with a preoperative diagnosis of RMCT that required mitral valvular surgery. Subjects were consecutive in-patients at Fuwai Hospital in 2002-2008. Patients were evaluated by transthoracic echocardiography (TTE) and transesophageal echocardiography (TEE). RMCT cases were classified by location as anterior or posterior, and classified by degree as partial or complete RMCT, according to surgical findings. RMCT cases were also classified by pathology into four groups: myxomatous degeneration, chronic rheumatic valvulitis (CRV), infective endocarditis and others. Results: Echocardiography showed that most patients had a flail mitral valve, moderate to severe mitral regurgitation, a dilated heart chamber, mild to moderate pulmonary artery hypertension and good heart function. The diagnostic accuracy for RMCT was 96.7% for TTE and 100% for TEE compared with surgical findings. Preliminary experiments demonstrated that the sensitivity and specificity of diagnosing anterior, posterior and partial RMCT were high, but the sensitivity of diagnosing complete RMCT was low. Surgical procedures for RMCT depended on the location of ruptured chordae tendineae, with no relationship between surgical procedure and complete or partial RMCT. The echocardiographic characteristics of RMCT included valvular thickening, extended subvalvular chordae, echo enhancement, abnormal echo or vegetation, combined with aortic valve damage in the four groups classified by pathology. The incidence of extended subvalvular chordae in the myxomatous group was higher than that in the other groups, and valve thickening in combination with AV damage in the CRV group was higher than that in the other

  10. A comprehensive simulation study on classification of RNA-Seq data.

    Science.gov (United States)

    Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet

    2017-01-01

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at the molecular level, as well as for providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study was conducted and the results were compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. Similar to differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power

  11. Classification of micro-calcification in mammograms using scalable linear Fisher discriminant analysis.

    Science.gov (United States)

    Suhail, Zobia; Denton, Erika R E; Zwiggelaar, Reyer

    2018-01-25

    Breast cancer is one of the major causes of death in women. Computer Aided Diagnosis (CAD) systems are being developed to assist radiologists in early diagnosis. Micro-calcifications can be an early symptom of breast cancer. Besides detection, classification of micro-calcification as benign or malignant is essential in a complete CAD system. We have developed a novel method for the classification of benign and malignant micro-calcification using an improved Fisher Linear Discriminant Analysis (LDA) approach for the linear transformation of segmented micro-calcification data in combination with a Support Vector Machine (SVM) variant to classify between the two classes. The results indicate an average accuracy of 96%, which is comparable to state-of-the-art methods in the literature.

  12. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    Science.gov (United States)

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify mobile users. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm; we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also derive some rules about mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  13. Prediction of protein-protein interactions between viruses and human by an SVM model

    Directory of Open Access Journals (Sweden)

    Cui Guangyu

    2012-05-01

    Background: Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results: We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM) model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV) and hepatitis C virus (HCV), our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO) annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions: Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1) it enables a prediction model to achieve a better performance than other representations, (2) it generates feature vectors of fixed length regardless of the sequence length, and (3) the same representation is applicable to different types of proteins.
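
    The fixed-length triplet representation can be illustrated with a short sketch. It assumes the 20 standard amino acid letters (giving 20^3 = 8000 dimensions), overlapping windows of three residues, and that windows containing non-standard letters are skipped; these details and the name `triplet_frequency_vector` are assumptions, not the authors' stated implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)
TRIPLETS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {triplet: i for i, triplet in enumerate(TRIPLETS)}


def triplet_frequency_vector(sequence):
    """Map a protein sequence of any length to an 8000-dimensional relative-frequency vector."""
    counts = [0] * len(TRIPLETS)
    total = 0
    for i in range(len(sequence) - 2):        # overlapping windows of length 3
        idx = INDEX.get(sequence[i:i + 3])
        if idx is not None:                   # skip windows with non-standard letters
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [c / total for c in counts]


# Two sequences of different lengths yield vectors of the same length.
short_vec = triplet_frequency_vector("MKVLA")
long_vec = triplet_frequency_vector("MKVLAMKVLAGG")
```

    Because the vector length depends only on the alphabet, sequences of any length can be fed to the same classifier, which is the second advantage listed in the abstract.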

  14. Novel SVM-based technique to improve rainfall estimation over the Mediterranean region (north of Algeria) using the multispectral MSG SEVIRI imagery

    Science.gov (United States)

    Sehad, Mounir; Lazri, Mourad; Ameur, Soltane

    2017-03-01

    In this work, a new rainfall estimation technique based on the high spatial and temporal resolution of the Spinning Enhanced Visible and Infra Red Imager (SEVIRI) aboard the Meteosat Second Generation (MSG) is presented. This work proposes an efficient rainfall estimation scheme based on two multiclass support vector machine (SVM) algorithms: SVM_D for daytime and SVM_N for nighttime rainfall estimation. Both SVM models are trained using relevant rainfall parameters based on optical, microphysical and textural cloud properties. The cloud parameters are derived from the spectral channels of the SEVIRI MSG radiometer. The 3-hourly and daily accumulated rainfall are derived from the 15-min rainfall estimates given by the SVM classifiers for each MSG observation image pixel. The SVMs were trained with ground meteorological radar precipitation scenes recorded from November 2006 to March 2007 over the north of Algeria, located in the Mediterranean region. Further, the SVM_D and SVM_N models were used to estimate 3-hourly and daily rainfall using a data set gathered from November 2010 to March 2011 over north Algeria. The results were validated against collocated rainfall observed by a rain gauge network. The statistical scores given by the correlation coefficient, bias, root mean square error and mean absolute error showed good accuracy of the rainfall estimates produced by the present technique. Moreover, the rainfall estimates of our technique were compared with two high-accuracy rainfall estimation methods based on MSG SEVIRI imagery, namely a random forests (RF) based approach and an artificial neural network (ANN) based technique. The present technique yields a higher correlation coefficient (3-hourly: 0.78; daily: 0.94) and lower mean absolute error and root mean square error values. The results show that the new technique estimates 3-hourly and daily rainfall with good accuracy, better than that of the ANN technique and the RF model.
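
    The four validation scores named above can be computed as in the following sketch. It assumes the usual definitions (Pearson correlation, bias as the mean of estimate minus observation, MAE, RMSE); the abstract does not give its exact formulas, and the function name and the gauge values below are made up for illustration.

```python
import math


def rainfall_scores(estimated, observed):
    """Correlation, bias, MAE and RMSE between estimated and gauge-observed rainfall."""
    n = len(observed)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    errors = [e - o for e, o in zip(estimated, observed)]
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return {
        "correlation": cov / math.sqrt(var_e * var_o),
        "bias": sum(errors) / n,                       # assumed: mean(estimate - observation)
        "mae": sum(abs(d) for d in errors) / n,
        "rmse": math.sqrt(sum(d * d for d in errors) / n),
    }


# Illustrative values in mm (not from the study).
observed = [0.0, 2.0, 4.0, 6.0]
estimated = [1.0, 2.0, 5.0, 6.0]
scores = rainfall_scores(estimated, observed)
```

    RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.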

  15. Classification Accuracy of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2)-Restructured Form Validity Scales in Detecting Malingered Pain-Related Disability.

    Science.gov (United States)

    Bianchini, Kevin J; Aguerrevere, Luis E; Curtis, Kelly L; Roebuck-Spencer, Tresa M; Frey, F Charles; Greve, Kevin W; Calamia, Matthew

    2017-10-26

    The symptom reports of individuals with chronic pain are multidimensional (e.g., emotional, cognitive, and somatic) and significantly contribute to increased morbidity and lost work productivity. When pain occurs in the context of a legally compensable event, reliable assessment of a patient's multifactorial symptom experience during psychological or neuropsychological evaluations is a necessity. The Validity Scales of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) have been shown useful in identifying symptom overreporting and feigning within chronic pain samples and a number of studies have emerged supporting the use of the MMPI-2-Restructured Form (MMPI-2-RF) in the detection of simulated or feigned impairment in a variety of populations. To date, only 1 other study exists examining the ability of the MMPI-2-RF to detect exaggerated complaints using a strict operationalization of malingering exclusive to chronic pain samples. The purpose of this study was to examine the classification accuracy of MMPI-2-RF Validity Scales in a group of patients with chronic pain using a criterion-groups design. The final sample consisted of 501 clinical chronic pain patients assigned to groups based on the Bianchini, Greve, and Glynn (2005) criteria for Malingered Pain-Related Disability (MPRD). Results showed that all MMPI-2-RF Validity Scales differentiated malingerers from nonmalingerers with a high degree of accuracy. At cut-offs associated with ≥95% Specificity, Sensitivities ranged from 15% (Fs) to 60% (Response Bias Scale; RBS). This study demonstrates that the MMPI-2-RF Validity Scales are capable of differentiating intentional symptom exaggeration from genuine complaints in a sample of incentivized chronic pain patients. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  16. Administrative database concerns: accuracy of International Classification of Diseases, Ninth Revision coding is poor for preoperative anemia in patients undergoing spinal fusion.

    Science.gov (United States)

    Golinvaux, Nicholas S; Bohl, Daniel D; Basques, Bryce A; Grauer, Jonathan N

    2014-11-15

    Cross-sectional study. To objectively evaluate the ability of International Classification of Diseases, Ninth Revision (ICD-9) codes, which are used as the foundation for administratively coded national databases, to identify preoperative anemia in patients undergoing spinal fusion. National database research in spine surgery continues to rise. However, the validity of studies based on administratively coded data, such as the Nationwide Inpatient Sample, are dependent on the accuracy of ICD-9 coding. Such coding has previously been found to have poor sensitivity to conditions such as obesity and infection. A cross-sectional study was performed at an academic medical center. Hospital-reported anemia ICD-9 codes (those used for administratively coded databases) were directly compared with the chart-documented preoperative hematocrits (true laboratory values). A patient was deemed to have preoperative anemia if the preoperative hematocrit was less than the lower end of the normal range (36.0% for females and 41.0% for males). The study included 260 patients. Of these, 37 patients (14.2%) were anemic; however, only 10 patients (3.8%) received an "anemia" ICD-9 code. Of the 10 patients coded as anemic, 7 were anemic by definition, whereas 3 were not, and thus were miscoded. This equates to an ICD-9 code sensitivity of 0.19, with a specificity of 0.99, and positive and negative predictive values of 0.70 and 0.88, respectively. This study uses preoperative anemia to demonstrate the potential inaccuracies of ICD-9 coding. These results have implications for publications using databases that are compiled from ICD-9 coding data. Furthermore, the findings of the current investigation raise concerns regarding the accuracy of additional comorbidities. Although administrative databases are powerful resources that provide large sample sizes, it is crucial that we further consider the quality of the data source relative to its intended purpose.
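
    The reported sensitivity, specificity and predictive values follow directly from the counts given in the abstract. The sketch below reconstructs the 2x2 table from those counts (260 patients, 37 anemic by hematocrit, 10 coded as anemic, 7 of those truly anemic); the function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts taken from the abstract.
total_patients = 260
anemic_by_lab = 37
coded_anemic = 10
coded_and_anemic = 7

tp = coded_and_anemic                  # coded anemic and truly anemic
fp = coded_anemic - tp                 # coded anemic but not anemic (miscoded)
fn = anemic_by_lab - tp                # anemic but not coded
tn = total_patients - tp - fp - fn     # neither anemic nor coded

metrics = diagnostic_metrics(tp, fp, fn, tn)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

    The low sensitivity arises because 30 of the 37 anemic patients received no anemia code, while the high specificity reflects only 3 miscoded patients among the 223 who were not anemic.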

  17. Relative significance of heat transfer processes to quantify tradeoffs between complexity and accuracy of energy simulations with a building energy use patterns classification

    Science.gov (United States)

    Heidarinejad, Mohammad

    This dissertation develops rapid and accurate building energy simulations based on a building classification that identifies and focuses modeling efforts on the most significant heat transfer processes. The building classification identifies energy use patterns and their contributing parameters for a portfolio of buildings. The dissertation hypothesis is "Building classification can provide minimal required inputs for rapid and accurate energy simulations for a large number of buildings". The critical literature review indicated a lack of studies that (1) consider a synoptic point of view rather than the case study approach, (2) analyze the influence of different granularities of energy use, (3) identify key variables based on the heat transfer processes, and (4) automate the procedure to quantify model complexity with accuracy. Therefore, three dissertation objectives are designed to test the dissertation hypothesis: (1) Develop different classes of buildings based on their energy use patterns, (2) Develop different building energy simulation approaches for the identified classes of buildings to quantify tradeoffs between model accuracy and complexity, (3) Demonstrate building simulation approaches for case studies. Penn State's and Harvard's campus buildings as well as high performance LEED NC office buildings are test beds for this study to develop different classes of buildings. The campus buildings include detailed chilled water, electricity, and steam data, enabling classification of buildings as externally-load, internally-load, or mixed-load dominated. The energy use of the internally-load dominated buildings is primarily a function of the internal loads and their schedules. Externally-load dominated buildings tend to have an energy use pattern that is a function of building construction materials and outdoor weather conditions. However, most of the commercial medium-sized office buildings have a mixed-load pattern, meaning the HVAC system and operation schedule dictate

  18. Unraveling the linguistic nature of specific autobiographical memories using a computerized classification algorithm.

    Science.gov (United States)

    Takano, Keisuke; Ueno, Mayumi; Moriya, Jun; Mori, Masaki; Nishiguchi, Yuki; Raes, Filip

    2017-06-01

    In the present study, we explored the linguistic nature of specific memories generated with the Autobiographical Memory Test (AMT) by developing a computerized classifier that distinguishes between specific and nonspecific memories. The AMT is regarded as one of the most important assessment tools to study memory dysfunctions (e.g., difficulty recalling the specific details of memories) in psychopathology. In Study 1, we utilized the Japanese corpus data of 12,400 cue-recalled memories tagged with observer-rated specificity. We extracted linguistic features of particular relevance to memory specificity, such as past tense, negation, and adverbial words and phrases pertaining to time and location. On the basis of these features, a support vector machine (SVM) was trained to classify the memories into specific and nonspecific categories, which achieved an area under the curve (AUC) of .92 in a performance test. In Study 2, the trained SVM was tested in terms of its robustness in classifying novel memories (n = 8,478) that were retrieved in response to cue words that were different from those used in Study 1. The SVM showed an AUC of .89 in classifying the new memories. In Study 3, we extended the binary SVM to a five-class classification of the AMT, which achieved 64%-65% classification accuracy, against the chance level (20%) in the performance tests. Our data suggest that memory specificity can be identified with a relatively small number of words, capturing the universal linguistic features of memory specificity across memories in diverse contents.

  19. Polarimetric SAR Terrain Classification Using Polarimetric Features Derived from Rotation Domain

    Directory of Open Access Journals (Sweden)

    Tao Chensong

    2017-10-01

    Terrain classification is an important application for understanding and interpreting Polarimetric Synthetic Aperture Radar (PolSAR) images. One common PolSAR terrain classification approach uses roll-invariant feature parameters such as H/A/α/SPAN. However, the backscattering response of a target is closely related to its orientation and attitude. This frequently introduces ambiguity in the interpretation of scattering mechanisms and limits the accuracy of PolSAR terrain classification that uses only roll-invariant feature parameters. To address this problem, the uniform polarimetric matrix rotation theory was proposed, which interprets a target's scattering properties when its polarimetric matrix is rotated along the radar line of sight and derives a series of polarimetric features to describe hidden information of the target in the rotation domain. Based on this theory, in this study, we apply the polarimetric features in the rotation domain to PolSAR terrain discrimination and classification, and develop a PolSAR terrain classification method using both the polarimetric features in the rotation domain and the roll-invariant features of H/A/α/SPAN. This method uses both the selected polarimetric feature parameters in the rotation domain and H/A/α/SPAN as input for a Support Vector Machine (SVM) classifier and achieves better classification performance by complementing the terrain discrimination abilities of both. Results from comparison experiments based on AIRSAR and UAVSAR data demonstrate that, compared with the conventional method, which uses only H/A/α/SPAN as SVM classifier input, the proposed method can achieve higher classification accuracy and better robustness. For fifteen terrain classes of AIRSAR data, the total classification accuracy of the proposed method was 92.3%, which is higher than the 91.1% of the conventional method. Moreover, for seven terrain classes of multi-temporal UAVSAR data, the averaged

  20. Estimating grassland biomass using SVM band shaving of hyperspectral data

    OpenAIRE

    Clevers, J G P W; van Der Heijden, G.W.A.M.; Verzakov, S; Schaepman, M. E.

    2007-01-01

    In this paper, the potential of a band shaving algorithm based on support vector machines (SVM) applied to hyperspectral data for estimating biomass within grasslands is studied. Field spectrometer data and biomass measurements were collected from a homogeneously managed grassland field. The SVM band shaving technique was compared with a partial least squares (PLS) and a stepwise forward selection analysis. Using their results, a range of vegetation indices was used as predictors for grasslan...

  1. Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

    Science.gov (United States)

    Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim

    2015-07-30

    Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. SVM2Motif--Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor.

    Directory of Open Access Journals (Sweden)

    Marina M-C Vidovic

    Full Text Available Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but--due to its black-box character--motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs--regardless of their length and complexity--underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set.

  3. SVM2Motif--Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor.

    Science.gov (United States)

    Vidovic, Marina M-C; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

    2015-01-01

    Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but--due to its black-box character--motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs--regardless of their length and complexity--underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set.

  4. Undercounting of large trucks in federal and state crash databases: Extent of problem and how to improve accuracy of truck classifications.

    Science.gov (United States)

    Cheung, Ivan; Braver, Elisa R

    2016-01-01

    Prior research suggested that single-unit trucks are undercounted when using vehicle body codes in the Fatality Analysis Reporting System (FARS). This study explored the extent of the misclassification and undercounting problem for crashes in FARS and state crash databases. Truck misclassifications for fatal crashes were explored by comparing the Trucks Involved in Fatal Accidents (TIFA) database with FARS. TIFA used vehicle identification numbers (VINs) and survey information to classify large trucks. This study used VINs to improve the accuracy of large truck classifications in state crash databases from 5 states (Delaware, Maryland, Minnesota, Nebraska, and Utah). The vehicle body type codes resulted in a 19% undercount of single-unit trucks in FARS and a 23% undercount of single-unit trucks in state databases. Tractor-trailers were misclassified less often. Misclassifications occurred most frequently among single-unit trucks in the weight classes of 10,001-14,000 pounds. The amount of misclassification of large trucks is large enough to potentially affect federal and state decisions on traffic safety. Using information from VINs results in more complete and accurate counts of large trucks involved in crashes. The National Transportation Safety Board recommended actions to improve federal and state crash data.

  5. Elucidation of Metallic Plume and Spatter Characteristics Based on SVM During High-Power Disk Laser Welding

    Science.gov (United States)

    Gao, Xiangdong; Liu, Guiqian

    2015-01-01

    During deep penetration laser welding, there exist plume (weak plasma) and spatters, which are the results of weld material ejection due to strong laser heating. The characteristics of plume and spatters are related to welding stability and quality. Characteristics of metallic plume and spatters were investigated during high-power disk laser bead-on-plate welding of Type 304 austenitic stainless steel plates at a continuous wave laser power of 10 kW. An ultraviolet and visible sensitive high-speed camera was used to capture the metallic plume and spatter images. Plume area, laser beam path through the plume, swing angle, distance between laser beam focus and plume image centroid, abscissa of plume centroid and spatter numbers are defined as eigenvalues, and the weld bead width was used as a characteristic parameter that reflected welding stability. Welding status was distinguished by SVM (support vector machine) after data normalization and characteristic analysis. Also, PCA (principal components analysis) feature extraction was used to reduce the dimensions of feature space, and PSO (particle swarm optimization) was used to optimize the parameters of SVM. Finally a classification model based on SVM was established to estimate the weld bead width and welding stability. Experimental results show that the established algorithm based on SVM could effectively distinguish the variation of weld bead width, thus providing an experimental example of monitoring high-power disk laser welding quality.

  6. Use of Sub-Aperture Decomposition for Supervised PolSAR Classification in Urban Area

    Directory of Open Access Journals (Sweden)

    Lei Deng

    2015-01-01

    Full Text Available A novel approach is proposed for classifying polarimetric SAR (PolSAR) data by integrating polarimetric decomposition, sub-aperture decomposition and a decision tree algorithm. It is composed of three key steps: sub-aperture decomposition, feature extraction and combination, and decision tree classification. Feature extraction and combination is the main contribution to the innovation of the proposed method. Firstly, the full-resolution PolSAR image and its two sub-aperture images are decomposed to obtain the scattering entropy, average scattering angle and anisotropy, respectively. Then, the difference information between the two sub-aperture images is extracted and combined with the target decomposition features from the full-resolution images to form the classification feature set. Finally, the C5.0 decision tree algorithm is used to classify the PolSAR image. A comparison between the proposed method and the commonly used Wishart supervised classification was made to verify the improvement of the proposed method on the classification. The overall accuracy using the proposed method was 88.39%, much higher than that using the Wishart supervised classification, which exhibited an overall accuracy of 69.82%. The Kappa Coefficient was 0.83, whereas that using the Wishart supervised classification was 0.56. The results indicate that the proposed method performed better than Wishart supervised classification for landscape classification in urban areas using PolSAR data. Further investigation was carried out on the contribution of difference information to PolSAR classification. It was found that the sub-aperture decomposition effectively improved the classification accuracy of forest, buildings and grassland in high-density urban areas. Compared with the support vector machine (SVM) and QUEST classifiers, the C5.0 decision tree classifier is more efficient in time consumption, feature selection and construction of decision rules.

  7. Optimizing Multiple Kernel Learning for the Classification of UAV Data

    Directory of Open Access Journals (Sweden)

    Caroline M. Gevaert

    2016-12-01

    Full Text Available Unmanned Aerial Vehicles (UAVs) are capable of providing high-quality orthoimagery and 3D information in the form of point clouds at a relatively low cost. Their increasing popularity stresses the necessity of understanding which algorithms are especially suited for processing the data obtained from UAVs. The features that are extracted from the point cloud and imagery have different statistical characteristics and can be considered as heterogeneous, which motivates the use of Multiple Kernel Learning (MKL) for classification problems. In this paper, we illustrate the utility of applying MKL for the classification of heterogeneous features obtained from UAV data through a case study of an informal settlement in Kigali, Rwanda. Results indicate that MKL can achieve a classification accuracy of 90.6%, a 5.2% increase over a standard single-kernel Support Vector Machine (SVM). A comparison of seven MKL methods indicates that linearly-weighted kernel combinations based on simple heuristics are competitive with respect to computationally-complex, non-linear kernel combination methods. We further underline the importance of utilizing appropriate feature grouping strategies for MKL, which has not been directly addressed in the literature, and we propose a novel, automated feature grouping method that achieves a high classification accuracy for various MKL methods.

  8. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. Experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimen (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification usually is based on morphometric features of nuclei. Therefore, training and validation patches were selected using Support Vector Machine (SVM) so that suitable amount of cell material was depicted. Neural classifiers were tuned using GPU accelerated implementation of gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by GoogLeNet model. We observed that more misclassified patches belong to malignant cases.

  9. A New Spectral-Spatial Framework for Classification of Hyperspectral Data

    Science.gov (United States)

    Akbari, D.

    2017-11-01

    In this paper, an innovative framework, based on both spectral and spatial information, is proposed. The objective is to improve the classification of hyperspectral images for high resolution land cover mapping. The spatial information is obtained by a marker-based Minimum Spanning Forest (MSF) algorithm. A pixel-based SVM algorithm is first used to classify the image. Then, the marker-based MSF spectral-spatial algorithm is applied to improve the accuracy for classes with low accuracy. The marker-based MSF algorithm is used as a binary classifier. These two classes are the low accuracy class and the remaining classes. Finally, the SVM algorithm is trained for classes with acceptable accuracy. To evaluate the proposed approach, the Berlin hyperspectral dataset is tested. Experimental results demonstrate the superiority of the proposed method compared to the original MSF-based approach. It achieves approximately 5 % higher rates in kappa coefficients of agreement, in comparison to the original MSF-based method.

  10. Land Cover Classification Using a KOMPSAT-3A Multi-Spectral Satellite Image

    Directory of Open Access Journals (Sweden)

    Tri Dev Acharya

    2016-11-01

    Full Text Available New sets of satellite sensors are frequently being added to the constellation of remote sensing satellites. These new sets offer improved specifications to collect imagery on demand over specific locations and for specific purposes. The Korea Multi-Purpose Satellite (KOMPSAT) series of satellites is a multi-purpose satellite system developed by the Korea Aerospace Research Institute (KARI). The most recent satellite of the KOMPSAT series, KOMPSAT-3A, provides high-resolution multi-spectral imagery with infrared and high-resolution electro-optical bands for geographical information systems applications in environmental, agricultural and oceanographic sciences as well as natural disasters. In this study, land cover classification of multispectral data was performed using four supervised classification methods: Mahalanobis Distance (MahD), Minimum Distance (MinD), Maximum Likelihood (ML) and Support Vector Machine (SVM), using KOMPSAT-3A multi-spectral imagery with 2.2 m spatial resolution. The study area was selected from the southwestern region of South Korea, around Buan city. The training data for supervised classification were carefully selected by visual interpretation of KOMPSAT-3A imagery and field investigation. After classification, the results were analyzed to validate the classification accuracy by comparison with those of the field investigation. For the validation, we calculated the User’s Accuracy (UA), Producer’s Accuracy (PA), Overall Accuracy (OA) and Kappa statistics from the error matrix to check the classification accuracy for each class obtained individually from the different methods. Finally, a comparative analysis of the land cover classification results obtained from KOMPSAT-3A multi-spectral imagery was carried out for the study area.
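    The validation measures named in this abstract (user's accuracy, producer's accuracy, overall accuracy and the Kappa statistic) all follow from the error matrix. The sketch below shows the standard definitions under one assumed layout: rows are reference (field) classes and columns are mapped classes. The function name `accuracy_metrics` and the 2 x 2 example counts are hypothetical and are not taken from the study.

```python
def accuracy_metrics(matrix):
    """Accuracy measures from a square error (confusion) matrix.

    Assumed layout: matrix[i][j] is the number of samples whose reference
    class is i and whose mapped class is j. Every row and column is assumed
    to contain at least one sample.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]  # reference totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # mapped totals
    diag = [matrix[i][i] for i in range(n)]

    overall = sum(diag) / total
    producers = [diag[i] / row_totals[i] for i in range(n)]  # 1 - omission error
    users = [diag[i] / col_totals[i] for i in range(n)]      # 1 - commission error

    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return {"OA": overall, "PA": producers, "UA": users, "kappa": kappa}


# Hypothetical two-class error matrix (rows: reference, columns: mapped).
print(accuracy_metrics([[45, 5], [10, 40]]))
```

    Some texts transpose the matrix (rows as mapped classes); in that case the producer's and user's accuracies swap, while overall accuracy and Kappa are unchanged.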

  11. Pattern classification of brain activation during emotional processing in subclinical depression: psychosis proneness as potential confounding factor

    Directory of Open Access Journals (Sweden)

    Gemma Modinos

    2013-02-01

    Full Text Available We used Support Vector Machine (SVM) to perform multivariate pattern classification based on brain activation during emotional processing in healthy participants with subclinical depressive symptoms. Six-hundred undergraduate students completed the Beck Depression Inventory II (BDI-II). Two groups were subsequently formed: (i) subclinical (mild) mood disturbance (n = 17) and (ii) no mood disturbance (n = 17). Participants also completed a self-report questionnaire on subclinical psychotic symptoms, the Community Assessment of Psychic Experiences Questionnaire (CAPE) positive subscale. The functional magnetic resonance imaging (fMRI) paradigm entailed passive viewing of negative emotional and neutral scenes. The pattern of brain activity during emotional processing allowed correct group classification with an overall accuracy of 77% (p = 0.002), within a network of regions including the amygdala, insula, anterior cingulate cortex and medial prefrontal cortex. However, further analysis suggested that the classification accuracy could also be explained by subclinical psychotic symptom scores (correlation with SVM weights r = 0.459, p = 0.006). Psychosis proneness may thus be a confounding factor for neuroimaging studies in subclinical depression.

  12. Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images

    Science.gov (United States)

    Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.

    2017-10-01

    Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown as a technique that enhances the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1D discrete wavelet transform (DWT) applied in the spectral dimension followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D DWT filters are applied to reconstruct the noise-free original component. The computational cost of the classifiers as well as the cost of the whole classification chain is high but it is reduced achieving real-time behavior for some applications through their computation on NVIDIA multi-GPU platforms.
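    The denoising step described in this abstract follows the usual transform, threshold, inverse-transform pattern. The sketch below illustrates that pattern in a deliberately reduced form: a single-level 1D Haar transform with hard thresholding of the detail coefficients. The paper applies a recursive separable 2D DWT on GPUs; the choice of Haar wavelet, the hard threshold rule, and the function names here are assumptions made only to keep the example short.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


def denoise(signal, threshold):
    """Transform, threshold the detail coefficients, and reconstruct."""
    if len(signal) % 2:
        raise ValueError("signal length must be even for one Haar level")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, hard_threshold(detail, threshold))


# Small fluctuation (4.0 vs 4.2) is smoothed; the large step (8.0 vs 2.0) survives.
print(denoise([4.0, 4.2, 8.0, 2.0], threshold=1.0))
```

    A separable 2D version would apply the same forward step along rows and then columns, and a multi-level version would recurse on the approximation coefficients.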

  13. Regularised extreme learning machine with misclassification cost and rejection cost for gene expression data classification.

    Science.gov (United States)

    Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi

    2015-01-01

    The main purpose of traditional classification algorithms in bioinformatics applications is to achieve better classification accuracy. However, these algorithms cannot meet the requirement of minimising the average misclassification cost. In this paper, a new cost-sensitive regularised extreme learning machine (CS-RELM) algorithm was proposed that uses probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy for the small-sample group that has the higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into the CS-RELM algorithm to further reduce the average misclassification cost. Using the Colon Tumour dataset and the SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other algorithms: extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, and cost-sensitive support vector machine (SVM). The experimental results show that CS-RELM with embedded rejection cost could reduce the average misclassification cost and make more credible classification decisions than the others.

  14. Stacked Denoising Autoencoders Applied to Star/Galaxy Classification

    Science.gov (United States)

    Hao-ran, Qin; Ji-ming, Lin; Jun-yi, Wang

    2017-04-01

    In recent years, the deep learning algorithm, with the characteristics of strong adaptability, high accuracy, and structural complexity, has become more and more popular, but it has not yet been used in astronomy. In order to solve the problem that the star/galaxy classification accuracy is high for the bright source set, but low for the faint source set of the Sloan Digital Sky Survey (SDSS) data, we introduced the new deep learning algorithm, namely the SDA (stacked denoising autoencoder) neural network and the dropout fine-tuning technique, which can greatly improve the robustness and antinoise performance. We randomly selected respectively the bright source sets and faint source sets from the SDSS DR12 and DR7 data with spectroscopic measurements, and made preprocessing on them. Then, we randomly selected respectively the training sets and testing sets without replacement from the bright source sets and faint source sets. At last, using these training sets we made the training to obtain the SDA models of the bright sources and faint sources in the SDSS DR7 and DR12, respectively. We compared the test result of the SDA model on the DR12 testing set with the test results of the Library for Support Vector Machines (LibSVM), J48 decision tree, Logistic Model Tree (LMT), Support Vector Machine (SVM), Logistic Regression, and Decision Stump algorithm, and compared the test result of the SDA model on the DR7 testing set with the test results of six kinds of decision trees. The experiments show that the SDA has a better classification accuracy than other machine learning algorithms for the faint source sets of DR7 and DR12. Especially, when the completeness function is used as the evaluation index, compared with the decision tree algorithms, the correctness rate of SDA has improved about 15% for the faint source set of SDSS-DR7.

  15. Structural SCOP superfamily level classification using unsupervised machine learning.

    Science.gov (United States)

    Angadi, Ulavappa B; Venkatesulu, M

    2012-01-01

    One of the major research directions in bioinformatics is that of assigning superfamily classification to a given set of proteins. The classification reflects the structural, evolutionary, and functional relatedness. These relationships are embodied in a hierarchical classification, such as the Structural Classification of Protein (SCOP), which is mostly manually curated. Such a classification is essential for the structural and functional analyses of proteins. Yet a large number of proteins remain unclassified. In this study, we have proposed an unsupervised machine learning approach to classify and assign a given set of proteins to SCOP superfamilies. In the method, we have constructed a database and similarity matrix using P-values obtained from an all-against-all BLAST run and trained the network with the ART2 unsupervised learning algorithm using the rows of the similarity matrix as input vectors, enabling the trained network to classify the proteins from 0.82 to 0.97 f-measure accuracy. The performance of ART2 has been compared with that of spectral clustering, Random forest, SVM, and HHpred. ART2 performs better than the others except HHpred. HHpred performs better than ART2 and the sum of errors is smaller than that of the other methods evaluated.

  16. Multicategory classification of 11 neuromuscular diseases based on microarray data using support vector machine.

    Science.gov (United States)

    Choi, Soo Beom; Park, Jee Soo; Chung, Jai Won; Yoo, Tae Keun; Kim, Deok Won

    2014-01-01

    We applied multicategory machine learning methods to classify 11 neuromuscular disease groups and one control group based on microarray data. To develop multicategory classification models with optimal parameters and features, we performed a systematic evaluation of three machine learning algorithms and four feature selection methods using three-fold cross validation and a grid search. This study included 114 subjects of 11 neuromuscular diseases and 31 subjects of a control group using microarray data with 22,283 probe sets from the National Center for Biotechnology Information (NCBI). We obtained an accuracy of 100%, relative classifier information (RCI) of 1.0, and a kappa index of 1.0 by applying the models of support vector machines one-versus-one (SVM-OVO), SVM one-versus-rest (OVR), and directed acyclic graph SVM (DAGSVM), using the ratio of genes between categories to within-category sums of squares (BW) feature selection method. Each of these three models selected only four features to categorize the 12 groups, resulting in a time-saving and cost-effective strategy for diagnosing neuromuscular diseases. In addition, a gene symbol, SPP1 was selected as the top-ranked gene by the BW method. We confirmed relationships between the gene (SPP1) and Duchenne muscular dystrophy (DMD) from a previous study. With our models as clinically helpful tools, neuromuscular diseases could be classified quickly using a computer, thereby giving a time-saving, cost-effective, and accurate diagnosis.
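    The model selection procedure mentioned in this abstract, a grid search scored by three-fold cross-validation, can be sketched as follows. This is a generic illustration rather than the authors' pipeline: the fold splitter uses contiguous, unshuffled folds, and `score_fn` is a hypothetical callback standing in for "train a classifier with these parameters on the training indices and return its accuracy on the test indices". The parameter names `C` and `gamma` are likewise only illustrative.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds (no shuffling)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def grid_search(param_grid, n_samples, k, score_fn):
    """Return (best_params, best_mean_score) over all parameter combinations.

    score_fn(params, train_idx, test_idx) is a hypothetical user-supplied
    function that fits a model and returns a score where higher is better.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [score_fn(params, train, test)
                  for train, test in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Toy scoring function that peaks at C=10, gamma=0.1; it ignores the folds.
def toy_score(params, train_idx, test_idx):
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)


print(grid_search({"C": [1, 10, 100], "gamma": [0.01, 0.1]}, 9, 3, toy_score))
```

    With small, imbalanced groups such as those in a multi-disease microarray set, shuffled and stratified folds would normally be preferred over the contiguous folds used here.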

  17. Classification of intended motor movement using surface EEG ensemble empirical mode decomposition.

    Science.gov (United States)

    Kuo, Ching-Chang; Lin, William S; Dressel, Chelsea A; Chiu, Alan W L

    2011-01-01

    Noninvasive electroencephalography (EEG) brain computer interface (BCI) systems are used to investigate intended arm reaching tasks. The main goal of the work is to create a device with a control scheme that allows those with limited motor control to have more command over potential prosthetic devices. Four healthy subjects were recruited to perform various reaching tasks directed by visual cues. Independent component analysis (ICA) was used to identify artifacts. Active post parietal cortex (PPC) activation before arm movement was validated using EEGLAB. Single-trial binary classification strategies using support vector machine (SVM) with radial basis functions (RBF) kernels and Fisher linear discrimination (FLD) were evaluated using signal features from surface electrodes near the PPC regions. No significant improvement can be found by using a nonlinear SVM over a linear FLD classifier (63.65% to 63.41% accuracy). A significant improvement in classification accuracy was found when a normalization factor based on visual cue "signature" was introduced to the raw signal (90.43%) and the intrinsic mode functions (IMF) of the data (93.55%) using Ensemble Empirical Mode Decomposition (EEMD).

  18. Support vector machine for classification of walking conditions of persons after stroke with dropped foot.

    Science.gov (United States)

    Lau, Hong-yin; Tong, Kai-yu; Zhu, Hailong

    2009-08-01

    Walking with dropped foot represents a major gait disorder, which is observed in hemiparetic persons after stroke. This study explores the use of support vector machines (SVMs) to classify different walking conditions for hemiparetic subjects. Seven participants with dropped foot (category 4 of functional ambulatory category) walked in five different conditions: level ground, stair ascent, stair descent, upslope, and downslope. The kinematic data were measured by two portable sensor units, each comprising an accelerometer and gyroscope attached to the lower limb on the shank and foot segments. The overall classification accuracy of stair ascent, stair descent, and other walking conditions was 92.9% using input features from the sensor attached to the shank. It was further improved to 97.5% by adding two more inputs from the sensor attached to the foot. Stair ascent was also classified by the inputs from the foot sensor unit with 96% accuracy. The performance of an SVM was shown to be superior to that of other machine learning methods using artificial neural networks (ANN) and radial basis function neural networks (RBF). The results suggested that the SVM classification method could be applied as a tool for pathological gait analysis, pattern recognition, control signals in functional electrical stimulation (FES) and rehabilitation robots, as well as activity monitoring during rehabilitation of daily activities.

  19. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis shows that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machine (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  20. A Hybrid ICA-SVM Approach for Determining the Quality Variables at Fault in a Multivariate Process

    Directory of Open Access Journals (Sweden)

    Yuehjen E. Shao

    2012-01-01

    Full Text Available The monitoring of a multivariate process with the use of multivariate statistical process control (MSPC) charts has received considerable attention. However, in practice, the use of an MSPC chart typically encounters a difficulty. This difficulty involves determining which quality variable, or which set of quality variables, is responsible for the generation of the signal. This study proposes a hybrid scheme composed of independent component analysis (ICA) and support vector machine (SVM) to determine the fault quality variables when a step-change disturbance exists in a multivariate process. The proposed hybrid ICA-SVM scheme initially applies ICA to the Hotelling T² MSPC chart to generate independent components (ICs). The hidden information of the fault quality variables can be identified in these ICs. The ICs then serve as the input variables of the SVM classifier for performing the classification process. The performance of various process designs is investigated and compared with the typical classification method. Using the proposed approach, the fault quality variables for a multivariate process can be accurately and reliably determined.

  1. Classification of Partial Discharge Measured under Different Levels of Noise Contamination.

    Directory of Open Access Journals (Sweden)

    Wong Jee Keen Raymond

    Full Text Available Cable joint insulation breakdown may cause a huge loss to power companies. Therefore, it is vital to diagnose the insulation quality to detect early signs of insulation failure. It is well known that there is a correlation between partial discharge (PD) and the insulation quality. Although much work has been done on PD pattern recognition, it is usually performed in a noise-free environment. Also, work on PD pattern recognition in actual cable joints is rarely found in the literature. Therefore, in this work, classification of actual cable joint defect types from partial discharge data contaminated by noise was performed. Five cross-linked polyethylene (XLPE) cable joints with artificially created defects were prepared based on the defects commonly encountered on site. Three different types of input feature were extracted from the PD pattern under an artificially created noisy environment. These include statistical features, fractal features and principal component analysis (PCA) features. These input features were used to train the classifiers to classify each PD defect type. Classifications were performed using three different artificial intelligence classifiers: Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Support Vector Machine (SVM). It was found that the classification accuracy decreases with higher noise level, but PCA features used in SVM and ANN showed the strongest tolerance against noise contamination.

  2. Support-vector-machines-based multidimensional signal classification for fetal activity characterization

    Science.gov (United States)

    Ribes, S.; Voicu, I.; Girault, J. M.; Fournier, M.; Perrotin, F.; Tranquart, F.; Kouamé, D.

    2011-03-01

    Electronic fetal monitoring may be required during the whole pregnancy to closely monitor specific fetal and maternal disorders. Currently used methods suffer from many limitations and are not sufficient to evaluate fetal asphyxia. Fetal activity parameters such as movements, heart rate and associated parameters are essential indicators of the fetus's well-being, and no current device gives a simultaneous and sufficient estimation of all these parameters to evaluate the fetus's well-being. For this purpose, we built a multi-transducer, multi-gate Doppler system and developed dedicated signal processing techniques for fetal activity parameter extraction in order to investigate fetal asphyxia or well-being through fetal activity parameters. To reach this goal, this paper shows the preliminary feasibility of separating normal and compromised fetuses using our system. To do so, a data set consisting of two groups of fetal signals (normal and compromised) was established and provided by physicians. From the estimated parameters an instantaneous Manning-like score, referred to as the ultrasonic score, was introduced and used together with movements, heart rate and associated parameters in a classification process using the Support Vector Machines (SVM) method. The influence of the fetal activity parameters and the performance of the SVM were evaluated by computing sensitivity, specificity, percentage of support vectors and total classification accuracy. We showed our ability to separate the data into two sets, normal fetuses and compromised fetuses, and obtained an excellent match with the clinical classification performed by physicians.
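
    The evaluation measures named in this abstract (sensitivity, specificity, total accuracy) can be computed from a two-class confusion matrix. The sketch below is illustrative only: the function name `binary_metrics` and the example labels are hypothetical and are not taken from the paper's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Sensitivity: fraction of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were detected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Hypothetical labels: 1 = compromised fetus, 0 = normal fetus.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec, acc = binary_metrics(y_true, y_pred)
print(sens, spec, acc)
```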

  3. Evaluating oral epithelial dysplasia classification system by near-infrared Raman spectroscopy.

    Science.gov (United States)

    Li, Bo; Gu, Zhi-Yu; Yan, Kai-Xiao; Wen, Zhi-Ning; Zhao, Zhi-He; Li, Long-Jiang; Li, Yi

    2017-09-29

    Until now, the classification system of oral epithelial dysplasia has been based on architectural and cytological changes, which relies on the observation of pathologists and is relatively subjective. The purpose of the present research was to discriminate oral dysplasia by near-infrared Raman spectroscopy, in order to evaluate the classification system. We collected Raman spectra of normal mucosa, oral squamous cell carcinoma (OSCC) and dysplasia with a near-infrared Raman spectroscope. The biochemical variations between different stages were analyzed by the characteristic peaks in the subtracted mean spectra. Gaussian radial basis function support vector machines (SVM) were used to establish the diagnostic models. At the same time, principal component analysis (PCA) and linear discriminant analysis (LDA) were used to verify the results of SVM. Raman spectral differences were observed in the range 730-1913 cm⁻¹. Compared with normal mucosa, high contents of protein and DNA in oral dysplasia and OSCC were observed. There was no significant or gradual variation of Raman peaks among different dysplastic grades. The accuracies in discriminating mild, moderate and severe dysplasia from OSCC were 100%, 44.44% and 71.15%, respectively, which revealed the low modeling ability of support vector machines, especially for moderate dysplasia. The analysis by PCA-LDA could not discriminate the stages, either. Combined with support vector machines, near-infrared Raman spectroscopy could detect the biochemical variations in normal oral, OSCC and dysplastic tissues, but could not establish an accurate diagnostic model. The classification system needs further improvements.

  4. Classification of bifurcations regions in IVOCT images using support vector machine and artificial neural network models

    Science.gov (United States)

    Porto, C. D. N.; Costa Filho, C. F. F.; Macedo, M. M. G.; Gutierrez, M. A.; Costa, M. G. F.

    2017-03-01

    Studies in intravascular optical coherence tomography (IV-OCT) have demonstrated the importance of coronary bifurcation regions in intravascular medical imaging analysis, as plaques are more likely to accumulate in this region leading to coronary disease. A typical IV-OCT pullback acquires hundreds of frames, thus developing an automated tool to classify the OCT frames as bifurcation or non-bifurcation can be an important step to speed up OCT pullbacks analysis and assist automated methods for atherosclerotic plaque quantification. In this work, we evaluate the performance of two state-of-the-art classifiers, SVM and Neural Networks in the bifurcation classification task. The study included IV-OCT frames from 9 patients. In order to improve classification performance, we trained and tested the SVM with different parameters by means of a grid search and different stop criteria were applied to the Neural Network classifier: mean square error, early stop and regularization. Different sets of features were tested, using feature selection techniques: PCA, LDA and scalar feature selection with correlation. Training and test were performed in sets with a maximum of 1460 OCT frames. We quantified our results in terms of false positive rate, true positive rate, accuracy, specificity, precision, false alarm, f-measure and area under ROC curve. Neural networks obtained the best classification accuracy, 98.83%, overcoming the results found in literature. Our methods appear to offer a robust and reliable automated classification of OCT frames that might assist physicians indicating potential frames to analyze. Methods for improving neural networks generalization have increased the classification performance.

  5. An Object-Based Classification of Mangroves Using a Hybrid Decision Tree—Support Vector Machine Approach

    Directory of Open Access Journals (Sweden)

    Benjamin W. Heumann

    2011-11-01

    Full Text Available Mangroves provide valuable ecosystem goods and services such as carbon sequestration, habitat for terrestrial and marine fauna, and coastal hazard mitigation. The use of satellite remote sensing to map mangroves has become widespread as it can provide accurate, efficient, and repeatable assessments. Traditional remote sensing approaches have failed to accurately map fringe mangroves and true mangrove species due to relatively coarse spatial resolution and/or spectral confusion with landward vegetation. This study demonstrates the use of the new Worldview-2 sensor, object-based image analysis (OBIA), and support vector machine (SVM) classification to overcome both of these limitations. An exploratory spectral separability analysis showed that individual mangrove species could not be spectrally separated, but a distinction between true and associate mangrove species could be made. An OBIA classification was used that combined a decision-tree classification with the machine-learning SVM classification. Results showed an overall accuracy greater than 94% (kappa = 0.863) for classifying true mangrove species and other dense coastal vegetation at the object level. There remain serious challenges to accurately mapping fringe mangroves using remote sensing data due to spectral similarity of mangrove and associate species, lack of clear zonation between species, and mixed pixel effects, especially when vegetation is sparse or degraded.
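
    The kappa statistic reported alongside overall accuracy corrects the observed agreement for the agreement expected by chance. The following is a minimal sketch of Cohen's kappa; the function name `cohen_kappa` and the object labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement: share of objects where both labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical reference and predicted labels for ten image objects.
reference = ["mangrove"] * 4 + ["other"] * 6
predicted = ["mangrove"] * 3 + ["other"] * 6 + ["mangrove"]
kappa = cohen_kappa(reference, predicted)
print(round(kappa, 3))
```

    Here the overall accuracy is 0.8, but kappa is lower because half of that agreement would be expected by chance given the class proportions.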

  6. Support vector machine for classification of meiotic recombination hotspots and coldspots in Saccharomyces cerevisiae based on codon composition

    Directory of Open Access Journals (Sweden)

    Sun Xiao

    2006-04-01

    Full Text Available Background: Meiotic double-strand breaks occur at relatively high frequencies in some genomic regions (hotspots) and relatively low frequencies in others (coldspots). Hotspots and coldspots are receiving increasing attention in research into the mechanism of meiotic recombination. However, predicting hotspots and coldspots from DNA sequence information is still a challenging task. Results: We present a novel method for classification of hot and cold ORFs located in hotspots and coldspots, respectively, in Saccharomyces cerevisiae, using a support vector machine (SVM) that relies on codon composition differences. This method achieved a high classification accuracy of 85.0%. Since codon composition is a fusion of codon usage bias and amino acid composition signals, the ability of these two kinds of sequence attributes to discriminate hot ORFs from cold ORFs was also investigated separately. Our results indicate that neither codon usage bias nor amino acid composition taken separately performed as well as codon composition. Moreover, our SVM-based method was applied to the full genome: we predicted the hot/cold ORFs from the yeast genome by using cutoffs of recombination rate. We found that the performance of our method for predicting cold ORFs is not as good as that for predicting hot ORFs. In addition, we observed a considerable correlation between meiotic recombination rate and amino acid composition of certain residues, which probably reflects the structural and functional dissimilarity between the hot and cold groups. Conclusion: We have introduced a novel SVM-based method to discriminate hot ORFs from cold ones. Using codon composition as sequence attributes, we achieved a high classification accuracy, which suggests that codon composition has strong potential as a sequence attribute in the prediction of hot and cold ORFs.

  7. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications.

    Directory of Open Access Journals (Sweden)

    Fei Ye

    Full Text Available This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fruit fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter tuning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. In addition, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm to search for the optimal solution in both the whole solution space and the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in an SVM to perform both parameter tuning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.

  8. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications

    Science.gov (United States)

    Lou, Xin Yuan; Sun, Lin Fu

    2017-01-01

    This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fruit fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter tuning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. In addition, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm to search for the optimal solution in both the whole solution space and the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in an SVM to perform both parameter tuning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem. PMID:28369096

  9. Automated Feature Identification and Classification Using Automated Feature Weighted Self Organizing Map (FWSOM)

    Science.gov (United States)

    Starkey, Andrew; Usman Ahmad, Aliyu; Hamdoun, Hassan

    2017-10-01

    This paper investigates the application of a novel method for classification called Feature Weighted Self Organizing Map (FWSOM) that analyses the topology information of a converged standard Self Organizing Map (SOM) to automatically guide the selection of important inputs during training for improved classification of data with redundant inputs, examined against two traditional approaches namely neural networks and Support Vector Machines (SVM) for the classification of EEG data as presented in previous work. In particular, the novel method looks to identify the features that are important for classification automatically, and in this way the important features can be used to improve the diagnostic ability of any of the above methods. The paper presents the results and shows how the automated identification of the important features successfully identified the important features in the dataset and how this results in an improvement of the classification results for all methods apart from linear discriminatory methods which cannot separate the underlying nonlinear relationship in the data. The FWSOM in addition to achieving higher classification accuracy has given insights into what features are important in the classification of each class (left and right-hand movements), and these are corroborated by already published work in this area.

  10. A wrapper-based approach for feature selection and classification of major depressive disorder-bipolar disorders.

    Science.gov (United States)

    Tekin Erguzel, Turker; Tas, Cumhur; Cebi, Merve

    2015-09-01

    Feature selection (FS) and classification are consecutive artificial intelligence (AI) methods used in data analysis, pattern classification, data mining and medical informatics. Besides promising studies in the application of AI methods to health informatics, working with more informative features is crucial in order to contribute to early diagnosis. Bipolar disorder (BD) is one of the prevalent psychiatric disorders, and its depressive episodes are often misdiagnosed as major depressive disorder (MDD), leading to suboptimal therapy and poor outcomes. Therefore, discriminating MDD and BD at earlier stages of illness could help to facilitate efficient and specific treatment. In this study, a novel, nature-inspired FS algorithm based on standard Ant Colony Optimization (ACO), called improved ACO (IACO), was used to reduce the number of features by removing irrelevant and redundant data. The selected features were then fed into a support vector machine (SVM), a powerful mathematical tool for data classification, regression, function estimation and modeling processes, in order to classify MDD and BD subjects. The proposed method used values of coherence, a promising quantitative electroencephalography (EEG) biomarker, calculated from alpha, theta and delta frequency bands. The noteworthy performance of the novel IACO-SVM approach showed that it is possible to discriminate 46 BD and 55 MDD subjects using 22 of 48 features with 80.19% overall classification accuracy. The performance of the IACO algorithm was also compared to the performance of standard ACO, genetic algorithm (GA) and particle swarm optimization (PSO) algorithms in terms of their classification accuracy and number of selected features. In order to provide an almost unbiased estimate of classification error, the validation process was performed using a nested cross-validation (CV) procedure. Copyright © 2015 Elsevier Ltd. All rights reserved.
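
    Nested cross-validation, as mentioned at the end of this abstract, keeps model selection inside an inner loop so that the outer loop scores only data the tuning never saw. The sketch below uses scikit-learn and makes several assumptions: the data are synthetic (101 samples and 48 features only mirror the sizes quoted above), the hyperparameter grid is arbitrary, and the IACO feature selection step is not reproduced. In a faithful setup, feature selection would also have to run inside the inner loop.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 101 subjects, 48 features, 2 classes.
X, y = make_classification(
    n_samples=101, n_features=48, n_informative=10,
    class_sep=2.0, random_state=0,
)

# Scaling is fitted inside each training fold because it is part of the pipeline.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1.0, 10.0], "svc__gamma": ["scale", 0.01]}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: choose C and gamma. Outer loop: estimate generalization accuracy.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv)
scores = cross_val_score(search, X, y, cv=outer_cv)
print(scores.mean())
```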

  11. A Fast Classification Method of Faults in Power Electronic Circuits Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Cui Jiang

    2017-12-01

    Full Text Available Fault detection and location are important front-end tasks in assuring the reliability of power electronic circuits. In essence, both tasks can be considered classification problems. This paper presents a fast fault classification method for power electronic circuits that uses the support vector machine (SVM) as a classifier and the wavelet transform as a feature extraction technique. One-against-rest SVM and one-against-one SVM are two general approaches to fault classification in power electronic circuits. However, these methods have a high computational complexity; therefore, in this design we employ a directed acyclic graph (DAG) SVM to implement the fault classification. The DAG SVM is close to the one-against-one SVM in classification performance, but it is much faster. Moreover, in the presented approach, the DAG SVM is improved by introducing the K-nearest neighbours method to reduce some computations, so that the classification time can be further reduced. A rectifier and an inverter are used to demonstrate the effectiveness of the presented design.
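
    The speed advantage of a DAG SVM at prediction time comes from evaluating only k-1 pairwise classifiers for k classes, instead of all k(k-1)/2 used by one-against-one voting. The sketch below shows the usual elimination procedure under stated assumptions: the pairwise classifiers are simple nearest-centre stand-ins rather than trained SVMs, the fault names and centre values are hypothetical, and the K-nearest-neighbours refinement from the paper is not shown.

```python
from itertools import combinations


def dag_predict(x, classes, pairwise):
    """Walk the decision DAG: each test eliminates one candidate class.

    pairwise[(a, b)] is a callable returning either a or b for input x,
    where a precedes b in `classes`.
    """
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise[(a, b)](x)
        evaluations += 1
        if winner == a:
            remaining.pop()    # b is ruled out
        else:
            remaining.pop(0)   # a is ruled out
    return remaining[0], evaluations


# Hypothetical one-dimensional feature centres for four circuit states.
centres = {"normal": 0.0, "open_switch": 5.0, "short_circuit": 10.0, "diode_fault": 15.0}
classes = list(centres)


def make_pair_classifier(a, b):
    # Stand-in for a trained binary SVM: pick the class with the nearer centre.
    return lambda x: a if abs(x - centres[a]) <= abs(x - centres[b]) else b


pairwise = {(a, b): make_pair_classifier(a, b) for a, b in combinations(classes, 2)}

label, evaluations = dag_predict(9.2, classes, pairwise)
print(label, evaluations, len(pairwise))
```

    With four classes the DAG makes three pairwise evaluations, whereas one-against-one voting would evaluate all six classifiers.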

  12. Global discriminative learning for higher-accuracy computational gene prediction.

    Directory of Open Access Journals (Sweden)

    Axel Bernal

    2007-03-01

    Full Text Available Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVMs) in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.

  13. [Application of optimized parameters SVM based on photoacoustic spectroscopy method in fault diagnosis of power transformer].

    Science.gov (United States)

    Zhang, Yu-xin; Cheng, Zhi-feng; Xu, Zheng-ping; Bai, Jing

    2015-01-01

    In order to solve problems such as complex operation, carrier gas consumption and the long test period of the traditional power transformer fault diagnosis approach based on dissolved gas analysis (DGA), this paper proposes a new method that detects the content of five characteristic gases in transformer oil (CH4, C2H2, C2H4, C2H6 and H2) using photoacoustic spectroscopy and calculates the three ratios C2H2/C2H4, CH4/H2 and C2H4/C2H6. The support vector machine model was constructed using cross-validation under five support vector machine functions and four kernel functions, and heuristic algorithms were used to optimize the penalty factor c and the parameter g, in order to establish the SVM model with the highest fault diagnosis accuracy and fast computing speed. Two types of heuristic algorithm, particle swarm optimization and genetic algorithm, were compared for accuracy and speed of optimization. The simulation results show that the SVM model composed of C-SVC, the RBF kernel function and the genetic algorithm obtains 97.5% accuracy on the test sample set and 98.3333% accuracy on the training sample set, and that the genetic algorithm was about two times faster than particle swarm optimization. The method described in this paper has many advantages, such as simple operation, non-contact measurement, no carrier gas consumption, a short test period, and high stability and sensitivity; the results show that it can replace traditional transformer fault diagnosis by gas chromatography and meets practical engineering needs in transformer fault diagnosis.

  14. Active relearning for robust supervised classification of pulmonary emphysema

    Science.gov (United States)

    Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

    2012-03-01

    Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing Emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded 15% increase in classification accuracy and 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.

  15. Classification of atherosclerotic and non-atherosclerotic individuals using multiclass state vector machine.

    Science.gov (United States)

    Kumar, Paulraj Ranjith; Priya, Mohan

    2014-01-01

    Coronary artery disease due to atherosclerosis is an epidemic in India. An estimated 1.3 million Indians died from it in 2000. The projected number of deaths from coronary artery disease by 2016 is 2.98 million. The aim is to build an effective model that classifies individuals in real time into the normal, risk or pathologic group regarding atherosclerosis, applying the necessary preprocessing techniques, and to compare its performance with other state-of-the-art machine learning techniques. In this work we employed the STULONG dataset. We made a deep case study in selecting the attributes that contribute to higher accuracy in predicting the target. The selected attributes include missing values. The first step of our work is the imputation of missing values using Iterative Principal Component Analysis (IPCA). The second step is the selection of the best features using the Fast Correlation Based Filter (FCBF). Finally, a Multiclass Support Vector Machine (SVM) classifier with a Radial Basis Function (RBF) kernel is used for classification of the atherosclerotic community. For the subjects belonging to the normal, risk and pathologic classes, our methodology achieved accuracies of 99.85%, 99.80% and 99.46%, respectively. The combined methods, namely Iterative Principal Component Analysis (IPCA) for missing value imputation and Multiclass SVM for classifying the normal, risk and pathologic community in real time, performed with an overall accuracy of about 98.97%. The essential pre-processing technique, the Fast Correlation Based Filter (FCBF), was employed to further improve prediction of the target.

  16. Pipeline for the identification and classification of ion channels in parasitic flatworms.

    Science.gov (United States)

    Nor, Bahiyah; Young, Neil D; Korhonen, Pasi K; Hall, Ross S; Tan, Patrick; Lonie, Andrew; Gasser, Robin B

    2016-03-16

    Ion channels are well characterised in model organisms, principally because of the availability of functional genomic tools and datasets for these species. This contrasts the situation, for example, for parasites of humans and animals, whose genomic and biological uniqueness means that many genes and their products cannot be annotated. As ion channels are recognised as important drug targets in mammals, the accurate identification and classification of parasite channels could provide major prospects for defining unique targets for designing novel and specific anti-parasite therapies. Here, we established a reliable bioinformatic pipeline for the identification and classification of ion channels encoded in the genome of the cancer-causing liver fluke Opisthorchis viverrini, and extended its application to related flatworms affecting humans. We built an ion channel identification + classification pipeline (called MuSICC), employing an optimised support vector machine (SVM) model and using the Kyoto Encyclopaedia of Genes and Genomes (KEGG) classification system. Ion channel proteins were first identified and grouped according to amino acid sequence similarity to classified ion channels and the presence and number of ion channel-like conserved and transmembrane domains. Predicted ion channels were then classified to sub-family using a SVM model, trained using ion channel features. Following an evaluation of this pipeline (MuSICC), which demonstrated a classification sensitivity of 95.2 % and accuracy of 70.5 % for known ion channels, we applied it to effectively identify and classify ion channels in selected parasitic flatworms. MuSICC provides a practical and effective tool for the identification and classification of ion channels of parasitic flatworms, and should be applicable to a broad range of organisms that are evolutionarily distant from taxa whose ion channels are functionally characterised.

  17. Efficient and Privacy-Preserving Online Medical Prediagnosis Framework Using Nonlinear SVM.

    Science.gov (United States)

    Zhu, Hui; Liu, Xiaoxia; Lu, Rongxing; Li, Hui

    2017-05-01

    With the advances of machine learning algorithms and the pervasiveness of network terminals, the online medical prediagnosis system, which can provide a healthcare provider's diagnosis anywhere and anytime, has attracted considerable interest recently. However, the flourishing of online medical prediagnosis systems still faces many challenges, including information security and privacy preservation. In this paper, we propose an efficient and privacy-preserving online medical prediagnosis framework, called eDiag, that uses a nonlinear kernel support vector machine (SVM). With eDiag, sensitive personal health information can be processed without privacy disclosure during the online prediagnosis service. Specifically, based on an improved expression for the nonlinear SVM, an efficient and privacy-preserving classification scheme is introduced with lightweight multiparty random masking and polynomial aggregation techniques. The encrypted user query is operated on directly at the service provider without decryption, and the diagnosis result can only be decrypted by the user. Through extensive analysis, we show that eDiag can ensure that users' health information and the healthcare provider's prediction model are kept confidential, and that it has significantly less computation and communication overhead than existing schemes. In addition, performance evaluations via implementing eDiag on a smartphone and a computer demonstrate eDiag's effectiveness in a real online environment.

  18. Multi-Sectional Views Textural Based SVM for MS Lesion Segmentation in Multi-Channels MRIs.

    Science.gov (United States)

    Abdullah, Bassem A; Younis, Akmal A; John, Nigel M

    2012-01-01

    In this paper, a new technique is proposed for automatic segmentation of multiple sclerosis (MS) lesions from brain magnetic resonance imaging (MRI) data. The technique uses a trained support vector machine (SVM) to discriminate between the blocks in regions of MS lesions and the blocks in non-MS lesion regions mainly based on the textural features with aid of the other features. The classification is done on each of the axial, sagittal and coronal sectional brain view independently and the resultant segmentations are aggregated to provide more accurate output segmentation. The main contribution of the proposed technique described in this paper is the use of textural features to detect MS lesions in a fully automated approach that does not rely on manually delineating the MS lesions. In addition, the technique introduces the concept of the multi-sectional view segmentation to produce verified segmentation. The proposed textural-based SVM technique was evaluated using three simulated datasets and more than fifty real MRI datasets. The results were compared with state of the art methods. The obtained results indicate that the proposed method would be viable for use in clinical practice for the detection of MS lesions in MRI.

  19. CLASSIFICATION OF CROPLANDS THROUGH FUSION OF OPTICAL AND SAR TIME SERIES DATA

    Directory of Open Access Journals (Sweden)

    S. Park

    2016-06-01

    Full Text Available Many satellite sensors, including the Landsat series, have been extensively used for land cover classification. Studies have been conducted to mitigate classification problems associated with the use of single-sensor data (e.g., cloud contamination) through multi-sensor data fusion and the use of time series data. This study investigated two areas with different environmental and climatic conditions: one in South Korea and the other in the US. Cropland classification was conducted using multi-temporal Landsat 5, Radarsat-1 and digital elevation model (DEM) data based on two machine learning approaches (i.e., random forest and support vector machines). Seven classification scenarios were examined and evaluated through accuracy assessment. Results show that SVM produced the best performance (overall accuracy of 93.87%) when using all temporal and spectral data as input variables. Normalized Difference Water Index (NDWI), SAR backscattering, and Normalized Difference Vegetation Index (NDVI) were identified as contributing more than the other variables to cropland classification.
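
    Both spectral indices named in this abstract are normalized band differences. The sketch below is a minimal illustration with hypothetical reflectance values. The abstract does not state which NDWI formulation was used; the green/near-infrared form shown here is an assumption, and a near-infrared/shortwave-infrared form is also in common use.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b); returns 0.0 when both bands are zero."""
    total = band_a + band_b
    return (band_a - band_b) / total if total != 0 else 0.0


def ndvi(nir, red):
    # Vegetation index: high when near-infrared reflectance exceeds red.
    return normalized_difference(nir, red)


def ndwi(green, nir):
    # Water index, green/NIR formulation (assumed here, not confirmed by the abstract).
    return normalized_difference(green, nir)


# Hypothetical surface reflectances for a dense crop canopy pixel.
crop_ndvi = ndvi(nir=0.45, red=0.05)
crop_ndwi = ndwi(green=0.08, nir=0.45)
print(round(crop_ndvi, 3), round(crop_ndwi, 3))
```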

  20. Preliminary research on organics recognition by x-ray absorption spectroscopy detection and classification

    Science.gov (United States)

    Wang, Qian; Wu, Xiaomei; Zhang, Wei; He, Shuting; Feng, Haifeng; Fang, Zheng

    2016-01-01

    X-ray Absorption Spectroscopy (XAS) was applied to material recognition in this paper. Twelve kinds of plastics were selected as specimens. Each specimen was tested 100 times by different operators for data processing. Seventy sets of spectral data for each specimen were randomly selected as the training set and the other 30 sets were used as the testing set. Principal Component Analysis (PCA) was applied to the training set to obtain the first four principal components, which together explain 99% of the original spectrum. Plastic classification models were then built from the first four principal components using the K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) methods, respectively. The classification accuracy reached 89.22%-98.17%. Experimental results demonstrate that organics can be recognized by XAS and suggest that X-ray absorption spectroscopy has potential for recognizing other organics or even organisms.
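
    The workflow described here (a 70/30 split per specimen, PCA reduced to four components, then KNN and SVM classifiers) can be sketched with scikit-learn. Everything numerical below is an assumption for illustration: the spectra are random synthetic stand-ins rather than XAS measurements, the channel count and noise level are arbitrary, and the classifier settings are library defaults rather than the paper's.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, n_channels = 12, 100, 64

# Synthetic stand-in spectra: one random base curve per class plus measurement noise.
base = rng.uniform(0.2, 1.0, size=(n_classes, n_channels))
X = np.vstack([
    base[k] + rng.normal(0.0, 0.02, size=(n_per_class, n_channels))
    for k in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Stratified 70/30 split: 70 training and 30 test spectra per class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# PCA is fitted on the training set only, then applied to the test set.
models = {
    "knn": make_pipeline(PCA(n_components=4), KNeighborsClassifier(n_neighbors=5)),
    "svm": make_pipeline(PCA(n_components=4), SVC(kernel="rbf")),
}
accuracy = {name: m.fit(X_train, y_train).score(X_test, y_test)
            for name, m in models.items()}
print(accuracy)
```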

  1. Advances in metaheuristics for gene selection and classification of microarray data.

    Science.gov (United States)

    Duval, Béatrice; Hao, Jin-Kao

    2010-01-01

    Gene selection aims at identifying a (small) subset of informative genes from the initial data in order to obtain high predictive accuracy for classification. Gene selection can be considered as a combinatorial search problem and thus be conveniently handled with optimization methods. In this article, we summarize some recent developments of using metaheuristic-based methods within an embedded approach for gene selection. In particular, we put forward the importance and usefulness of integrating problem-specific knowledge into the search operators of such a method. To illustrate the point, we explain how ranking coefficients of a linear classifier such as support vector machine (SVM) can be profitably used to reinforce the search efficiency of Local Search and Evolutionary Search metaheuristic algorithms for gene selection and classification.
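
    As a rough illustration of how the coefficients of a linear classifier can rank genes, the sketch below trains a small hinge-loss linear model by subgradient descent and orders features by absolute weight. It is not the authors' metaheuristic method; the training routine, hyperparameters, and toy data are all assumptions made for the example.

```python
def train_linear_classifier(samples, epochs=50, lr=0.1, lam=0.01):
    """Fit weights w by subgradient descent on the L2-regularised hinge loss."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in samples:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Shrink the weights (regularisation step).
            w = [(1.0 - lr * lam) * wi for wi in w]
            # Push the weights towards any sample that lies inside the margin.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w


def rank_features(w):
    """Return feature indices ordered by decreasing absolute weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Hypothetical toy data: only "gene" 1 separates the two classes.
toy_samples = [
    ([0.1, 2.0, 0.3], +1),
    ([0.2, 1.5, 0.1], +1),
    ([0.1, -1.8, 0.2], -1),
    ([0.2, -2.2, 0.3], -1),
]
weights = train_linear_classifier(toy_samples)
ranking = rank_features(weights)
```

    A search operator could, for instance, prefer to drop the genes at the end of such a ranking; how the ranking is actually used inside Local Search or Evolutionary Search is described in the article itself.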

  2. Classification of agricultural fields using time series of dual polarimetry TerraSAR-X images

    Directory of Open Access Journals (Sweden)

    S. Mirzaee

    2014-10-01

    Due to its special imaging characteristics, Synthetic Aperture Radar (SAR) has become an important source of information for a variety of remote sensing applications dealing with environmental changes. SAR images contain information about both phase and intensity in different polarization modes, making them sensitive to the geometrical structure and physical properties of the targets, such as dielectric and plant water content. In this study we investigate multi-temporal changes occurring in different crop types due to phenological changes using high-resolution TerraSAR-X images. The dataset includes 17 dual-polarimetry TSX acquisitions from June 2012 to August 2013 in Lorestan province, Iran. Several features are extracted from the polarized data and classified using a support vector machine (SVM) classifier. Training samples and the different features employed in classification are also assessed in the study. Results show a satisfactory classification accuracy, with a kappa coefficient of about 0.91.
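
    The kappa coefficient reported above measures agreement beyond chance. The following sketch computes Cohen's kappa from a confusion matrix; the function name and the example matrix are hypothetical and are not taken from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    # Undefined when expected == 1 (all samples in a single class).
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-crop confusion matrix.
example_kappa = cohens_kappa([[45, 5], [5, 45]])
```

    In this example the observed agreement is 0.9 and the chance agreement is 0.5, so kappa is 0.8.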

  3. Automatic optical detection and classification of marine animals around MHK converters using machine vision

    Energy Technology Data Exchange (ETDEWEB)

    Brunton, Steven [Univ. of Washington, Seattle, WA (United States)

    2018-01-15

    Optical systems provide valuable information for evaluating interactions and associations between organisms and MHK energy converters and for capturing potentially rare encounters between marine organisms and MHK devices. The deluge of optical data from cabled monitoring packages makes expert review time-consuming and expensive. We propose algorithms and a processing framework to automatically extract events of interest from underwater video. The open-source software framework consists of background subtraction, filtering, feature extraction and hierarchical classification algorithms. This principle classification pipeline was validated on real-world data collected with an experimental underwater monitoring package. An event detection rate of 100% was achieved using robust principal components analysis (RPCA), Fourier feature extraction and a support vector machine (SVM) binary classifier. The detected events were then further classified into more complex classes – algae | invertebrate | vertebrate, one species | multiple species of fish, and interest rank. Greater than 80% accuracy was achieved using a combination of machine learning techniques.

  4. A study of the effectiveness of machine learning methods for classification of clinical interview fragments into a large number of categories.

    Science.gov (United States)

    Hasan, Mehedi; Kotov, Alexander; Carcone, April; Dong, Ming; Naar, Sylvie; Hartlieb, Kathryn Brogan

    2016-08-01

    This study examines the effectiveness of state-of-the-art supervised machine learning methods in conjunction with different feature types for the task of automatic annotation of fragments of clinical text based on codebooks with a large number of categories. We used a collection of motivational interview transcripts consisting of 11,353 utterances, which were manually annotated by two human coders as the gold standard, and experimented with state-of-the-art classifiers, including Naïve Bayes, J48 Decision Tree, Support Vector Machine (SVM), Random Forest (RF), AdaBoost, DiscLDA, Conditional Random Fields (CRF) and Convolutional Neural Network (CNN) in conjunction with lexical, contextual (label of the previous utterance) and semantic (distribution of words in the utterance across the Linguistic Inquiry and Word Count dictionaries) features. We found that, when the number of classes is large, the performance of CNN and CRF is inferior to SVM. When only lexical features were used, interview transcripts were automatically annotated by SVM with the highest classification accuracy among all classifiers of 70.8%, 61% and 53.7% based on the codebooks consisting of 17, 20 and 41 codes, respectively. Using contextual and semantic features, as well as their combination, in addition to lexical ones, improved the accuracy of SVM for annotation of utterances in motivational interview transcripts with a codebook consisting of 17 classes to 71.5%, 74.2%, and 75.1%, respectively. Our results demonstrate the potential of using machine learning methods in conjunction with lexical, semantic and contextual features for automatic annotation of clinical interview transcripts with near-human accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Scale Issues Related to the Accuracy Assessment of Land Use/Land Cover Maps Produced Using Multi-Resolution Data: Comments on “The Improvement of Land Cover Classification by Thermal Remote Sensing”. Remote Sens. 2015, 7(7), 8368–8390

    Directory of Open Access Journals (Sweden)

    Brian A. Johnson

    2015-10-01

    Much remote sensing (RS) research focuses on fusing, i.e., combining, multi-resolution/multi-sensor imagery for land use/land cover (LULC) classification. In relation to this topic, Sun and Schulz [1] recently found that a combination of visible-to-near infrared (VNIR; 30 m spatial resolution) and thermal infrared (TIR; 100–120 m spatial resolution) Landsat data led to more accurate LULC classification. They also found that using multi-temporal TIR data alone for classification resulted in comparable (and in some cases higher) classification accuracies to the use of multi-temporal VNIR data, which contrasts with the findings of other recent research [2]. This discrepancy, and the generally very high LULC accuracies achieved by Sun and Schulz (up to 99.2% overall accuracy for a combined VNIR/TIR classification result), can likely be explained by their use of an accuracy assessment procedure which does not take into account the multi-resolution nature of the data. Sun and Schulz used 10-fold cross-validation for accuracy assessment, which is not necessarily inappropriate for RS accuracy assessment in general. However, here it is shown that the typical pixel-based cross-validation approach results in non-independent training and validation data sets when the lower spatial resolution TIR images are used for classification, which causes classification accuracy to be overestimated.
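
    The non-independence described above can be illustrated with a toy grid. In the sketch below, fine pixels that fall inside the same coarse cell share one coarse-resolution value, so splitting them across folds leaks information between training and validation data. The grid size, the 4:1 resolution ratio, the fold count, and all function names are assumptions for illustration; this is not the procedure used in the commented paper.

```python
from itertools import product


def coarse_block(row, col, factor=4):
    """Identify the coarse cell that contains a fine-resolution pixel."""
    return (row // factor, col // factor)


def pixel_folds(pixels, k):
    """Pixel-based k-fold assignment: consecutive pixels go to different folds."""
    return {p: i % k for i, p in enumerate(pixels)}


def block_folds(pixels, k, factor=4):
    """Block-based k-fold assignment: every pixel of a coarse cell shares one fold."""
    blocks = sorted({coarse_block(r, c, factor) for r, c in pixels})
    block_to_fold = {b: i % k for i, b in enumerate(blocks)}
    return {(r, c): block_to_fold[coarse_block(r, c, factor)] for r, c in pixels}


def leaking_blocks(folds, factor=4):
    """Count coarse cells whose fine pixels are spread over more than one fold."""
    seen = {}
    for (r, c), fold in folds.items():
        seen.setdefault(coarse_block(r, c, factor), set()).add(fold)
    return sum(1 for fold_set in seen.values() if len(fold_set) > 1)


# Hypothetical 8 x 8 grid of fine pixels covering four coarse cells.
pixels = list(product(range(8), range(8)))
pixel_leaks = leaking_blocks(pixel_folds(pixels, k=4))
block_leaks = leaking_blocks(block_folds(pixels, k=4))
```

    With pixel-based folds every coarse cell contributes pixels to several folds, whereas block-based folds keep each coarse cell entirely in one fold.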

  6. Vehicle classification in WAMI imagery using deep network

    Science.gov (United States)

    Yi, Meng; Yang, Fan; Blasch, Erik; Sheaff, Carolyn; Liu, Kui; Chen, Genshe; Ling, Haibin

    2016-05-01

    Humans have always had a keen interest in understanding activities and the surrounding environment for mobility, communication, and survival. Thanks to recent progress in photography and breakthroughs in aviation, we are now able to capture tens of megapixels of ground imagery, namely Wide Area Motion Imagery (WAMI), at multiple frames per second from unmanned aerial vehicles (UAVs). WAMI serves as a great source for many applications, including security, urban planning and route planning. These applications require fast and accurate image understanding which is time consuming for humans, due to the large data volume and city-scale area coverage. Therefore, automatic processing and understanding of WAMI imagery has been gaining attention in both industry and the research community. This paper focuses on an essential step in WAMI imagery analysis, namely vehicle classification. That is, deciding whether a certain image patch contains a vehicle or not. We collect a set of positive and negative sample image patches, for training and testing the detector. Positive samples are 64 × 64 image patches centered on annotated vehicles. We generate two sets of negative images. The first set is generated from positive images with some location shift. The second set of negative patches is generated from randomly sampled patches. We also discard those patches if a vehicle accidentally locates at the center. Both positive and negative samples are randomly divided into 9000 training images and 3000 testing images. We propose to train a deep convolution network for classifying these patches. The classifier is based on a pre-trained AlexNet Model in the Caffe library, with an adapted loss function for vehicle classification. The performance of our classifier is compared to several traditional image classifier methods using Support Vector Machine (SVM) and Histogram of Oriented Gradient (HOG) features. While the SVM+HOG method achieves an accuracy of 91.2%, the accuracy of our deep

  7. Comparison of Support Vector Machine (SVM) and Neural Network Models to Determine the Highest Accuracy in Stock Price Prediction

    Directory of Open Access Journals (Sweden)

    R. Hadapiningradja Kusumodestoni

    2017-09-01

    There are many types of investment for making money, one of which is shares. Shares are securities traded by companies on the global capital markets. A stock exchange, also called a stock market, is essentially the activity of private companies buying and selling investments. To avoid losses when investing, a predictive analysis model is needed that has high accuracy and is supported by large amounts of accurate data. The right analysis techniques can reduce the risk for investors. Many models are used to predict stock price movements; in this study the researchers used a neural network (NN) model and a support vector machine (SVM) model. Based on this background, the problem can be formulated as follows: an algorithm is needed that can predict stock prices with a high accuracy rate when a data set is added to the prediction. Two algorithms are investigated so that the researchers can determine which has the highest prediction accuracy. The purpose of this study was therefore to compare the Neural Network and Support Vector Machine algorithms and identify which gives the highest stock price prediction accuracy, judged by its RMSE error value. The research used the neural network and SVM models to predict stock prices from Hong Kong stock index values recorded from 20 July 2016 at 16:26 to 15 September 2016 at 17:40, a total of 729 data points at 5-minute intervals, through training, learning, and then testing. The result is that the neural network model had a prediction accuracy of 0.503 +/- 0.009 (micro 503 while using the model of support vector machine

  8. Accurate Multisteps Traffic Flow Prediction Based on SVM

    Directory of Open Access Journals (Sweden)

    Zhang Mingheng

    2013-01-01

    Accurate traffic flow prediction is an important prerequisite for realizing intelligent traffic control and guidance, and it is also an objective requirement for intelligent traffic management. Due to the strongly nonlinear, stochastic, time-varying characteristics of urban transport systems, artificial intelligence methods such as support vector machine (SVM) are now receiving more and more attention in this research field. Compared with the traditional single-step prediction method, multistep prediction can predict traffic state trends over a certain period in the future. From the perspective of dynamic decision making, this is far more important than knowing the current traffic condition. Thus, in this paper, an accurate multistep traffic flow prediction model based on SVM is proposed. The input vectors were composed of actual traffic volumes, and four different types of input vectors were compared to verify their prediction performance. Finally, the model was verified with actual data in the empirical analysis phase; the test results showed that the proposed SVM model had a good ability for traffic flow prediction and that the SVM-HPT model outperformed the other three models.
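
    One common way to set up multistep prediction is to pair a window of past volumes with the next several volumes as targets. The sketch below shows only this data preparation step; the window length, horizon, function name, and traffic values are hypothetical, and the abstract does not specify how its four input vector types were built.

```python
def make_multistep_samples(series, n_lags, horizon):
    """Pair each window of n_lags past values with the next horizon values."""
    samples = []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs = series[t - n_lags:t]    # past observations used as the input vector
        targets = series[t:t + horizon]  # future values to be predicted
        samples.append((inputs, targets))
    return samples


# Hypothetical traffic volumes (vehicles per interval).
volumes = [10, 12, 15, 14, 13, 16]
samples = make_multistep_samples(volumes, n_lags=3, horizon=2)
```

    Each resulting pair could then be fed to a regression model such as an SVM, for example one model per prediction step.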

  9. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.

  10. Fast rule-based bioactivity prediction using associative classification mining.

    Science.gov (United States)

    Yu, Pulan; Wild, David J

    2012-11-23

    Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.

  11. Band selection for hyperspectral image classification using extreme learning machine

    Science.gov (United States)

    Li, Jiaojiao; Kingsdorf, Benjamin; Du, Qian

    2017-05-01

    Extreme learning machine (ELM) is a feedforward neural network with one hidden layer, which is similar to a multilayer perceptron (MLP). To reduce the complexity in the training process of MLP using the traditional backpropagation algorithm, the weights in ELM between input and hidden layers are random variables. The output layer in the ELM is linear, as in a radial basis function neural network (RBFNN), so the output weights can be easily estimated with a least squares solution. It has been demonstrated in our previous work that the computational cost of ELM is much lower than that of the standard support vector machine (SVM), and a kernel version of ELM can offer performance comparable to that of SVM. In our previous work, we also investigated the impact of the number of hidden neurons on the performance of ELM. Basically, more hidden neurons are needed if the number of training samples and the data dimensionality are large, which results in a very large matrix inversion problem. To avoid handling such a large matrix, we propose to conduct band selection to reduce data dimensionality (i.e., the number of input neurons), thereby reducing network complexity. Experimental results show that ELM using selected bands can yield similar or even better classification accuracy than using all the original bands.

  12. N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit.

    Science.gov (United States)

    Marafino, Ben J; Davies, Jason M; Bardach, Naomi S; Dean, Mitzi L; Dudley, R Adams

    2014-01-01

    Existing risk adjustment models for intensive care unit (ICU) outcomes rely on manual abstraction of patient-level predictors from medical charts. Developing an automated method for abstracting these data from free text might reduce cost and data collection times. To develop a support vector machine (SVM) classifier capable of identifying a range of procedures and diagnoses in ICU clinical notes for use in risk adjustment. We selected notes from 2001-2008 for 4191 neonatal ICU (NICU) and 2198 adult ICU patients from the MIMIC-II database from the Beth Israel Deaconess Medical Center. Using these notes, we developed an implementation of the SVM classifier to identify procedures (mechanical ventilation and phototherapy in NICU notes) and diagnoses (jaundice in NICU and intracranial hemorrhage (ICH) in adult ICU). On the jaundice classification task, we also compared classifier performance using n-gram features to unigrams with application of a negation algorithm (NegEx). Our classifier accurately identified mechanical ventilation (accuracy=0.982, F1=0.954) and phototherapy use (accuracy=0.940, F1=0.912), as well as jaundice (accuracy=0.898, F1=0.884) and ICH diagnoses (accuracy=0.938, F1=0.943). Including bigram features improved performance on the jaundice (accuracy=0.898 vs 0.865) and ICH (0.938 vs 0.927) tasks, and outperformed NegEx-derived unigram features (accuracy=0.898 vs 0.863) on the jaundice task. Overall, a classifier using n-gram support vectors displayed excellent performance characteristics. The classifier generalizes to diverse patient populations, diagnoses, and procedures. SVM-based classifiers can accurately identify procedure status and diagnoses among ICU patients, and including n-gram features improves performance, compared to existing methods. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
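
    To make the n-gram idea concrete, the sketch below counts unigrams and bigrams in a note. A bigram such as "no jaundice" keeps a negation next to the term it modifies, which a unigram alone cannot do. The tokenizer, function names, and example sentence are assumptions for illustration and are not the authors' implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(text, max_n=2):
    """Count all n-grams of length 1..max_n, joined by single spaces."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical clinical note fragment.
features = ngram_counts("No jaundice noted; no phototherapy.")
```

    The resulting counts can serve as a sparse feature vector for a linear classifier.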

  13. Emotion classification using single-channel scalp-EEG recording.

    Science.gov (United States)

    Jalilifard, Amir; Brigante Pizzolato, Ednaldo; Kafiul Islam, Md

    2016-08-01

    Several studies have found evidence for corticolimbic Theta electroencephalographic (EEG) oscillation in the neural processing of visual stimuli perceived as fearful or threatening scenes. Recent studies showed that neural oscillation patterns in the Theta, Alpha, Beta and Gamma sub-bands play a main role in the brain's emotional processing. The main goal of this study is to classify two different emotional states by means of EEG data recorded through a single-electrode EEG headset. Nineteen young subjects participated in an EEG experiment while watching a video clip that evoked three emotional states: neutral, relaxation and scary. Following each video clip, participants were asked to report on their subjective affect by giving a score between 0 and 10. First, recorded EEG data were preprocessed by stationary wavelet transform (SWT) based denoising to remove artifacts. Afterward, the distribution of power in time-frequency space was obtained using short-time Fourier transform (STFT), and then the mean value of energy was calculated for each EEG sub-band. Finally, 46 features, as the mean energy of frequency bands between 4 and 50 Hz, with 689 instances for each subject, were collected in order to classify the emotional states. Our experimental results show that EEG dynamics induced by horror and relaxing movies can be classified with an average classification rate of 92% using a support vector machine (SVM) classifier. We also compared the performance of SVM to K-nearest neighbors (K-NN). The results show that K-NN achieves a better classification rate, with 94% accuracy. The findings of this work are expected to pave the way to a new horizon in neuroscience by proving the point that single-channel EEG data alone carry enough information for emotion classification.

  14. On mining incomplete medical datasets: Ordering imputation and classification.

    Science.gov (United States)

    Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong; Hu, Ya-Han

    2015-01-01

    When medical datasets are collected, it is usually the case that a number of data samples contain some missing values. Performing the data mining task over the incomplete datasets is a difficult problem. In general, the problem can be approached with missing value imputation, which aims at providing estimations for missing values by reasoning from the observed data. Consequently, the effectiveness of missing value imputation is heavily dependent on the observed data (or complete data) in the incomplete datasets. In this paper, the research objective is to perform instance selection to filter out some noisy data (or outliers) from a given (complete) dataset to see its effect on the final imputation result. Specifically, four different processes of combining instance selection and missing value imputation are proposed and compared in terms of data classification. Experiments are conducted based on 11 medical related datasets containing categorical, numerical, and mixed attribute types of data. In addition, missing values for each dataset are introduced into all attributes (the missing data rates are 10%, 20%, 30%, 40%, and 50%). For instance selection and missing value imputation, the DROP3 and k-nearest neighbor imputation methods are employed. On the other hand, the support vector machine (SVM) classifier is used to assess the final classification accuracy of the four different processes. The experimental results show that the second process by performing instance selection first and imputation second allows the SVM classifiers to outperform the other processes. For incomplete medical datasets containing some missing values, it is necessary to perform missing value imputation. In this paper, we demonstrate that instance selection can be used to filter out some noisy data or outliers before the imputation process. In other words, the observed data for missing value imputation may contain some noisy information, which can degrade the quality of the imputation result as well as the

  15. Classification of team sport activities using a single wearable tracking device.

    Science.gov (United States)

    Wundersitz, Daniel W T; Josman, Casey; Gupta, Ritu; Netto, Kevin J; Gastin, Paul B; Robertson, Sam

    2015-11-26

    Wearable tracking devices incorporating accelerometers and gyroscopes are increasingly being used for activity analysis in sports. However, minimal research exists relating to their ability to classify common activities. The purpose of this study was to determine whether data obtained from a single wearable tracking device can be used to classify team sport-related activities. Seventy-six non-elite sporting participants were tested during a simulated team sport circuit (involving stationary, walking, jogging, running, changing direction, counter-movement jumping, jumping for distance and tackling activities) in a laboratory setting. A MinimaxX S4 wearable tracking device was worn below the neck, in-line and dorsal to the first to fifth thoracic vertebrae of the spine, with tri-axial accelerometer and gyroscope data collected at 100 Hz. Multiple time domain, frequency domain and custom features were extracted from each sensor using 0.5, 1.0, and 1.5 s movement capture durations. Features were further screened using a combination of ANOVA and Lasso methods. Relevant features were used to classify the eight activities performed using the Random Forest (RF), Support Vector Machine (SVM) and Logistic Model Tree (LMT) algorithms. The LMT (79-92% classification accuracy) outperformed the RF (32-43%) and SVM (27-40%) algorithms, obtaining its strongest performance using the full model (accelerometer and gyroscope inputs). Processing time can be reduced through feature selection methods (range 1.5-30.2%); however, a trade-off exists between classification accuracy and processing time. Movement capture duration also had little impact on classification accuracy or processing time. In sporting scenarios where wearable tracking devices are employed, it is both possible and feasible to accurately classify team sport-related activities. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Individual classification of children with epilepsy using support vector machine with multiple indices of diffusion tensor imaging

    Directory of Open Access Journals (Sweden)

    Ishmael Amarreh

    2014-01-01

    Conclusion: DTI-based SVM classification appears promising for distinguishing children with active epilepsy from either those with remitted epilepsy or controls, and the question that arises is whether it will prove useful as a prognostic index of seizure remission. While SVM can correctly identify children with active epilepsy from other groups' diagnosis, further research is needed to determine the efficacy of SVM as a prognostic tool in longitudinal clinical studies.

  17. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. A fast image retrieval method based on SVM and imbalanced samples in filtering multimedia message spam

    Science.gov (United States)

    Chen, Zhang; Peng, Zhenming; Peng, Lingbing; Liao, Dongyi; He, Xin

    2011-11-01

    With the rapid development of the Multimedia Messaging Service (MMS), filtering Multimedia Message (MM) spam effectively in real time has become an urgent task. Because most MMs contain images or videos, this paper presents a method based on image retrieval for filtering MM spam. The detection method used in this paper is a combination of skin-color detection, texture detection, and face detection, and the classifier for this imbalanced problem is a very fast multi-class classifier combining a support vector machine (SVM) with a unilateral binary decision tree. The experiments on 3 test sets show that the proposed method is effective, with an interception rate of up to 60% and an average detection time of less than 1 second per image.

  19. R-Peak Detection using Daubechies Wavelet and ECG Signal Classification using Radial Basis Function Neural Network

    Science.gov (United States)

    Rai, H. M.; Trivedi, A.; Chatterjee, K.; Shukla, S.

    2014-01-01

    This paper employed the Daubechies wavelet transform (WT) for R-peak detection and a radial basis function neural network (RBFNN) to classify electrocardiogram (ECG) signals. Five types of ECG beats were classified: normal beat, paced beat, left bundle branch block (LBBB) beat, right bundle branch block (RBBB) beat and premature ventricular contraction (PVC). 500 QRS complexes were arbitrarily extracted from 26 records in the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database, which is available on the Physionet website. Every QRS complex was represented by 21 points from p1 to p21, and the QRS complexes of each record were categorized according to beat type. The system performance was computed using four evaluation metrics: sensitivity, positive predictivity, specificity and classification error rate. The experimental results show that the average values of sensitivity, positive predictivity, specificity and classification error rate are 99.8%, 99.60%, 99.90% and 0.12%, respectively, with the RBFNN classifier. The overall accuracies achieved for the back propagation neural network (BPNN), multilayered perceptron (MLP), support vector machine (SVM) and RBFNN classifiers are 97.2%, 98.8%, 99% and 99.6%, respectively. The accuracy levels and processing time of RBFNN are higher than or comparable with those of the BPNN, MLP and SVM classifiers.
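
    The four evaluation metrics named above can be computed from the confusion counts of a single beat class. The sketch below uses the usual textbook definitions; the paper's exact definition of classification error rate is not given in the abstract, so the formula used here is an assumption, and the counts are hypothetical.

```python
def beat_metrics(tp, fp, tn, fn):
    """Compute per-class metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),            # detected fraction of true beats of this class
        "positive_predictivity": tp / (tp + fp),  # fraction of detections that are correct
        "specificity": tn / (tn + fp),            # correctly rejected fraction of other beats
        "error_rate": (fp + fn) / total,          # assumed definition: misclassified / all beats
    }


# Hypothetical counts for one beat class out of 500 beats.
metrics = beat_metrics(tp=90, fp=5, tn=395, fn=10)
```

    For a five-class problem such as this one, the counts would be taken one class at a time against all others and the resulting metrics averaged.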

  20. Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools.

    Science.gov (United States)

    Canizo, Brenda V; Escudero, Leticia B; Pérez, María B; Pellerano, Roberto G; Wuilloud, Rodolfo G

    2018-03-01

    The feasibility of the application of chemometric techniques associated with multi-element analysis for the classification of grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (