WorldWideScience

Sample records for svm classification accuracy

  1. Efficient HIK SVM learning for image classification.

    Science.gov (United States)

    Wu, Jianxin

    2012-10-01

    Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.

  2. Dates fruits classification using SVM

    Science.gov (United States)

    Alzu'bi, Reem; Anushya, A.; Hamed, Ebtisam; Al Sha'ar, Eng. Abdelnour; Vincy, B. S. Angela

    2018-04-01

    In this paper, we used SVM in classifying various types of dates using their images. Dates have interesting different characteristics that can be valuable to distinguish and determine a particular date type. These characteristics include shape, texture, and color. A system that achieves 100% accuracy was built to classify the dates which can be eatable and cannot be eatable. The built system helps the food industry and customer in classifying dates depending on specific quality measures giving best performance with specific type of dates.

  3. Customer and performance rating in QFD using SVM classification

    Science.gov (United States)

    Dzulkifli, Syarizul Amri; Salleh, Mohd Najib Mohd; Leman, A. M.

    2017-09-01

    In a classification problem, where each input is associated to one output. Training data is used to create a model which predicts values to the true function. SVM is a popular method for binary classification due to their theoretical foundation and good generalization performance. However, when trained with noisy data, the decision hyperplane might deviate from optimal position because of the sum of misclassification errors in the objective function. In this paper, we introduce fuzzy in weighted learning approach for improving the accuracy of Support Vector Machine (SVM) classification. The main aim of this work is to determine appropriate weighted for SVM to adjust the parameters of learning method from a given set of noisy input to output data. The performance and customer rating in Quality Function Deployment (QFD) is used as our case study to determine implementing fuzzy SVM is highly scalable for very large data sets and generating high classification accuracy.

  4. Adaptive SVM for Data Stream Classification

    Directory of Open Access Journals (Sweden)

    Isah A. Lawal

    2017-07-01

    Full Text Available In this paper, we address the problem of learning an adaptive classifier for the classification of continuous streams of data. We present a solution based on incremental extensions of the Support Vector Machine (SVM learning paradigm that updates an existing SVM whenever new training data are acquired. To ensure that the SVM effectiveness is guaranteed while exploiting the newly gathered data, we introduce an on-line model selection approach in the incremental learning process. We evaluated the proposed method on real world applications including on-line spam email filtering and human action classification from videos. Experimental results show the effectiveness and the potential of the proposed approach.

  5. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available The existing material classification is proposed to improve the inventory management. However, different materials have the different quality-related attributes, especially in the aircraft industry. In order to reduce the cost without sacrificing the quality, we propose a quality-oriented material classification system considering the material quality character, Quality cost, and Quality influence. Analytic Hierarchy Process helps to make feature selection and classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. Aiming to improve the classification accuracy of various materials, the algorithm of Support Vector Machine is introduced. Finally, we compare the SVM and BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and the quality-oriented material classification is valuable.

  6. Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.

    Science.gov (United States)

    Subasi, Abdulhamit

    2013-06-01

    Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.

  7. New KF-PP-SVM classification method for EEG in brain-computer interfaces.

    Science.gov (United States)

    Yang, Banghua; Han, Zhijun; Zan, Peng; Wang, Qian

    2014-01-01

    Classification methods are a crucial direction in the current study of brain-computer interfaces (BCIs). To improve the classification accuracy for electroencephalogram (EEG) signals, a novel KF-PP-SVM (kernel fisher, posterior probability, and support vector machine) classification method is developed. Its detailed process entails the use of common spatial patterns to obtain features, based on which the within-class scatter is calculated. Then the scatter is added into the kernel function of a radial basis function to construct a new kernel function. This new kernel is integrated into the SVM to obtain a new classification model. Finally, the output of SVM is calculated based on posterior probability and the final recognition result is obtained. To evaluate the effectiveness of the proposed KF-PP-SVM method, EEG data collected from laboratory are processed with four different classification schemes (KF-PP-SVM, KF-SVM, PP-SVM, and SVM). The results showed that the overall average improvements arising from the use of the KF-PP-SVM scheme as opposed to KF-SVM, PP-SVM and SVM schemes are 2.49%, 5.83 % and 6.49 % respectively.

  8. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification.

    Science.gov (United States)

    She, Qingshan; Ma, Yuliang; Meng, Ming; Luo, Zhizeng

    2015-01-01

    Motor imagery electroencephalography is widely used in the brain-computer interface systems. Due to inherent characteristics of electroencephalography signals, accurate and real-time multiclass classification is always challenging. In order to solve this problem, a multiclass posterior probability solution for twin SVM is proposed by the ranking continuous output and pairwise coupling in this paper. First, two-class posterior probability model is constructed to approximate the posterior probability by the ranking continuous output techniques and Platt's estimating method. Secondly, a solution of multiclass probabilistic outputs for twin SVM is provided by combining every pair of class probabilities according to the method of pairwise coupling. Finally, the proposed method is compared with multiclass SVM and twin SVM via voting, and multiclass posterior probability SVM using different coupling approaches. The efficacy on the classification accuracy and time complexity of the proposed method has been demonstrated by both the UCI benchmark datasets and real world EEG data from BCI Competition IV Dataset 2a, respectively.

  9. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  10. SVM Pixel Classification on Colour Image Segmentation

    Science.gov (United States)

    Barui, Subhrajit; Latha, S.; Samiappan, Dhanalakshmi; Muthu, P.

    2018-04-01

    The aim of image segmentation is to simplify the representation of an image with the help of cluster pixels into something meaningful to analyze. Segmentation is typically used to locate boundaries and curves in an image, precisely to label every pixel in an image to give each pixel an independent identity. SVM pixel classification on colour image segmentation is the topic highlighted in this paper. It holds useful application in the field of concept based image retrieval, machine vision, medical imaging and object detection. The process is accomplished step by step. At first we need to recognize the type of colour and the texture used as an input to the SVM classifier. These inputs are extracted via local spatial similarity measure model and Steerable filter also known as Gabon Filter. It is then trained by using FCM (Fuzzy C-Means). Both the pixel level information of the image and the ability of the SVM Classifier undergoes some sophisticated algorithm to form the final image. The method has a well developed segmented image and efficiency with respect to increased quality and faster processing of the segmented image compared with the other segmentation methods proposed earlier. One of the latest application result is the Light L16 camera.

  11. Comparative study of SVM methods combined with voxel selection for object category classification on fMRI data.

    Science.gov (United States)

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-02-16

    Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.

  12. Optimization of Support Vector Machine (SVM) for Object Classification

    Science.gov (United States)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  13. SVM Classifier – a comprehensive java interface for support vector machine classification of microarray data

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-01-01

    Motivation Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. Results The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1–BRCA2 samples with RBF kernel of SVM. Conclusion We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at . PMID:17217518

  14. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.

  15. [Hyperspectral remote sensing image classification based on SVM optimized by clonal selection].

    Science.gov (United States)

    Liu, Qing-Jie; Jing, Lin-Hai; Wang, Meng-Fei; Lin, Qi-Zhong

    2013-03-01

    Model selection for support vector machine (SVM) involving kernel and the margin parameter values selection is usually time-consuming, impacts training efficiency of SVM model and final classification accuracies of SVM hyperspectral remote sensing image classifier greatly. Firstly, based on combinatorial optimization theory and cross-validation method, artificial immune clonal selection algorithm is introduced to the optimal selection of SVM (CSSVM) kernel parameter a and margin parameter C to improve the training efficiency of SVM model. Then an experiment of classifying AVIRIS in India Pine site of USA was performed for testing the novel CSSVM, as well as a traditional SVM classifier with general Grid Searching cross-validation method (GSSVM) for comparison. And then, evaluation indexes including SVM model training time, classification overall accuracy (OA) and Kappa index of both CSSVM and GSSVM were all analyzed quantitatively. It is demonstrated that OA of CSSVM on test samples and whole image are 85.1% and 81.58, the differences from that of GSSVM are both within 0.08% respectively; And Kappa indexes reach 0.8213 and 0.7728, the differences from that of GSSVM are both within 0.001; While the ratio of model training time of CSSVM and GSSVM is between 1/6 and 1/10. Therefore, CSSVM is fast and accurate algorithm for hyperspectral image classification and is superior to GSSVM.

  16. A Method of Particle Swarm Optimized SVM Hyper-spectral Remote Sensing Image Classification

    International Nuclear Information System (INIS)

    Liu, Q J; Jing, L H; Wang, L M; Lin, Q Z

    2014-01-01

    Support Vector Machine (SVM) has been proved to be suitable for classification of remote sensing image and proposed to overcome the Hughes phenomenon. Hyper-spectral sensors are intrinsically designed to discriminate among a broad range of land cover classes which may lead to high computational time in SVM mutil-class algorithms. Model selection for SVM involving kernel and the margin parameter values selection which is usually time-consuming, impacts training efficiency of SVM model and final classification accuracies of SVM hyper-spectral remote sensing image classifier greatly. Firstly, based on combinatorial optimization theory and cross-validation method, particle swarm algorithm is introduced to the optimal selection of SVM (PSSVM) kernel parameter σ and margin parameter C to improve the modelling efficiency of SVM model. Then an experiment of classifying AVIRIS in India Pine site of USA was performed for evaluating the novel PSSVM, as well as traditional SVM classifier with general Grid-Search cross-validation method (GSSVM). And then, evaluation indexes including SVM model training time, classification Overall Accuracy (OA) and Kappa index of both PSSVM and GSSVM are all analyzed quantitatively. It is demonstrated that OA of PSSVM on test samples and whole image are 85% and 82%, the differences with that of GSSVM are both within 0.08% respectively. And Kappa indexes reach 0.82 and 0.77, the differences with that of GSSVM are both within 0.001. While the modelling time of PSSVM can be only 1/10 of that of GSSVM, and the modelling. Therefore, PSSVM is an fast and accurate algorithm for hyper-spectral image classification and is superior to GSSVM

  17. Cardiac arrhythmia beat classification using DOST and PSO tuned SVM.

    Science.gov (United States)

    Raj, Sandeep; Ray, Kailash Chandra; Shankar, Om

    2016-11-01

    The increase in the number of deaths due to cardiovascular diseases (CVDs) has gained significant attention from the study of electrocardiogram (ECG) signals. These ECG signals are studied by the experienced cardiologist for accurate and proper diagnosis, but it becomes difficult and time-consuming for long-term recordings. Various signal processing techniques are studied to analyze the ECG signal, but they bear limitations due to the non-stationary behavior of ECG signals. Hence, this study aims to improve the classification accuracy rate and provide an automated diagnostic solution for the detection of cardiac arrhythmias. The proposed methodology consists of four stages, i.e. filtering, R-peak detection, feature extraction and classification stages. In this study, Wavelet based approach is used to filter the raw ECG signal, whereas Pan-Tompkins algorithm is used for detecting the R-peak inside the ECG signal. In the feature extraction stage, discrete orthogonal Stockwell transform (DOST) approach is presented for an efficient time-frequency representation (i.e. morphological descriptors) of a time domain signal and retains the absolute phase information to distinguish the various non-stationary behavior ECG signals. Moreover, these morphological descriptors are further reduced in lower dimensional space by using principal component analysis and combined with the dynamic features (i.e based on RR-interval of the ECG signals) of the input signal. This combination of two different kinds of descriptors represents each feature set of an input signal that is utilized for classification into subsequent categories by employing PSO tuned support vector machines (SVM). The proposed methodology is validated on the baseline MIT-BIH arrhythmia database and evaluated under two assessment schemes, yielding an improved overall accuracy of 99.18% for sixteen classes in the category-based and 89.10% for five classes (mapped according to AAMI standard) in the patient

  18. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

    Science.gov (United States)

    Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

    2013-03-01

    Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in overall accuracy of ∼80 % with 10 fold cross validation using 40 top ranked molecular descriptors selected out of total 314 descriptors. Moreover, SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptors space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding over fitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    Full Text Available At present, the clinical diagnosis of depression is mainly through structured interviews by psychiatrists, which is lack of objective diagnostic methods, so it causes the higher rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and particle swarm optimization algorithm mutation is proposed. To address on the problem that particle swarm optimization (PSO algorithm easily trap in local optima, we propose a feedback mutation PSO algorithm (FBPSO to balance the local search and global exploration ability, so that the parameters of the classification model is optimal. We compared different PSO mutation algorithms about classification accuracy for depression, and found the classification accuracy of support vector machine (SVM classifier based on feedback mutation PSO algorithm is the highest. Our study promotes important reference value for establishing auxiliary diagnostic used in depression recognition of clinical diagnosis.

  20. Effective Sequential Classifier Training for SVM-Based Multitemporal Remote Sensing Image Classification

    Science.gov (United States)

    Guo, Yiqing; Jia, Xiuping; Paull, David

    2018-06-01

    The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, a SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is firstly predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that the leverage of a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.

  1. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems.

    Science.gov (United States)

    Cho, Ming-Yuan; Hoang, Thi Thom

    2017-01-01

    Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  2. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-01-01

    Full Text Available Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO based support vector machine (SVM classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR method with a pseudorandom binary sequence (PRBS stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  3. SVM-based Partial Discharge Pattern Classification for GIS

    Science.gov (United States)

    Ling, Yin; Bai, Demeng; Wang, Menglin; Gong, Xiaojin; Gu, Chao

    2018-01-01

    Partial discharges (PD) occur when there are localized dielectric breakdowns in small regions of gas insulated substations (GIS). It is of high importance to recognize the PD patterns, through which we can diagnose the defects caused by different sources so that predictive maintenance can be conducted to prevent from unplanned power outage. In this paper, we propose an approach to perform partial discharge pattern classification. It first recovers the PRPD matrices from the PRPD2D images; then statistical features are extracted from the recovered PRPD matrix and fed into SVM for classification. Experiments conducted on a dataset containing thousands of images demonstrates the high effectiveness of the method.

  4. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Science.gov (United States)

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition. Among the classifiers, support vector machine (SVM) might be the best classifier. However, SVM is a classifier for two classes. When it is used for multi-classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computation consumption of SVM. Experiments of NC-SVM classification for OPCCR have been done. The results of the experiments have shown that the NC-SVM we proposed can effectively reduce the computation time in OPCCR.

  5. APPLICATION OF FUSION WITH SAR AND OPTICAL IMAGES IN LAND USE CLASSIFICATION BASED ON SVM

    Directory of Open Access Journals (Sweden)

    C. Bao

    2012-07-01

    Full Text Available As the increment of remote sensing data with multi-space resolution, multi-spectral resolution and multi-source, data fusion technologies have been widely used in geological fields. Synthetic Aperture Radar (SAR and optical camera are two most common sensors presently. The multi-spectral optical images express spectral features of ground objects, while SAR images express backscatter information. Accuracy of the image classification could be effectively improved fusing the two kinds of images. In this paper, Terra SAR-X images and ALOS multi-spectral images were fused for land use classification. After preprocess such as geometric rectification, radiometric rectification noise suppression and so on, the two kind images were fused, and then SVM model identification method was used for land use classification. Two different fusion methods were used, one is joining SAR image into multi-spectral images as one band, and the other is direct fusing the two kind images. The former one can raise the resolution and reserve the texture information, and the latter can reserve spectral feature information and improve capability of identifying different features. The experiment results showed that accuracy of classification using fused images is better than only using multi-spectral images. Accuracy of classification about roads, habitation and water bodies was significantly improved. Compared to traditional classification method, the method of this paper for fused images with SVM classifier could achieve better results in identifying complicated land use classes, especially for small pieces ground features.

  6. Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2017-12-01

    Full Text Available Accurate solar photovoltaic (PV power forecasting is an essential tool for mitigating the negative effects caused by the uncertainty of PV output power in systems with high penetration levels of solar PV generation. Weather classification based modeling is an effective way to increase the accuracy of day-ahead short-term (DAST solar PV power forecasting because PV output power is strongly dependent on the specific weather conditions in a given time period. However, the accuracy of daily weather classification relies on both the applied classifiers and the training data. This paper aims to reveal how these two factors impact the classification performance and to delineate the relation between classification accuracy and sample dataset scale. Two commonly used classification methods, K-nearest neighbors (KNN and support vector machines (SVM are applied to classify the daily local weather types for DAST solar PV power forecasting using the operation data from a grid-connected PV plant in Hohhot, Inner Mongolia, China. We assessed the performance of SVM and KNN approaches, and then investigated the influences of sample scale, the number of categories, and the data distribution in different categories on the daily weather classification results. The simulation results illustrate that SVM performs well with small sample scale, while KNN is more sensitive to the length of the training dataset and can achieve higher accuracy than SVM with sufficient samples.

  7. A hybrid particle swarm optimization-SVM classification for automatic cardiac auscultation

    Directory of Open Access Journals (Sweden)

    Prasertsak Charoen

    2017-04-01

    Full Text Available Cardiac auscultation is a method for a doctor to listen to heart sounds, using a stethoscope, for examining the condition of the heart. Automatic cardiac auscultation with machine learning is a promising technique to classify heart conditions without need of doctors or expertise. In this paper, we develop a classification model based on support vector machine (SVM and particle swarm optimization (PSO for an automatic cardiac auscultation system. The model consists of two parts: heart sound signal processing part and a proposed PSO for weighted SVM (WSVM classifier part. In this method, the PSO takes into account the degree of importance for each feature extracted from wavelet packet (WP decomposition. Then, by using principle component analysis (PCA, the features can be selected. The PSO technique is used to assign diverse weights to different features for the WSVM classifier. Experimental results show that both continuous and binary PSO-WSVM models achieve better classification accuracy on the heart sound samples, by reducing system false negatives (FNs, compared to traditional SVM and genetic algorithm (GA based SVM.

  8. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  9. Comparison of ANN and SVM for classification of eye movements in EOG signals

    Science.gov (United States)

    Qi, Lim Jia; Alias, Norma

    2018-03-01

    Nowadays, electrooculogram is regarded as one of the most important biomedical signal in measuring and analyzing eye movement patterns. Thus, it is helpful in designing EOG-based Human Computer Interface (HCI). In this research, electrooculography (EOG) data was obtained from five volunteers. The (EOG) data was then preprocessed before feature extraction methods were employed to further reduce the dimensionality of data. Three feature extraction approaches were put forward, namely statistical parameters, autoregressive (AR) coefficients using Burg method, and power spectral density (PSD) using Yule-Walker method. These features would then become input to both artificial neural network (ANN) and support vector machine (SVM). The performance of the combination of different feature extraction methods and classifiers was presented and analyzed. It was found that statistical parameters + SVM achieved the highest classification accuracy of 69.75%.

  10. Improving Accuracy of Intrusion Detection Model Using PCA and optimized SVM

    Directory of Open Access Journals (Sweden)

    Sumaiya Thaseen Ikram

    2016-06-01

    Full Text Available Intrusion detection is very essential for providing security to different network domains and is mostly used for locating and tracing the intruders. There are many problems with traditional intrusion detection models (IDS such as low detection capability against unknown network attack, high false alarm rate and insufficient analysis capability. Hence the major scope of the research in this domain is to develop an intrusion detection model with improved accuracy and reduced training time. This paper proposes a hybrid intrusiondetection model by integrating the principal component analysis (PCA and support vector machine (SVM. The novelty of the paper is the optimization of kernel parameters of the SVM classifier using automatic parameter selection technique. This technique optimizes the punishment factor (C and kernel parameter gamma (γ, thereby improving the accuracy of the classifier and reducing the training and testing time. The experimental results obtained on the NSL KDD and gurekddcup dataset show that the proposed technique performs better with higher accuracy, faster convergence speed and better generalization. Minimum resources are consumed as the classifier input requires reduced feature set for optimum classification. A comparative analysis of hybrid models with the proposed model is also performed.

  11. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    Science.gov (United States)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.

  12. Multiclass Classification of Cardiac Arrhythmia Using Improved Feature Selection and SVM Invariants.

    Science.gov (United States)

    Mustaqeem, Anam; Anwar, Syed Muhammad; Majid, Muahammad

    2018-01-01

    Arrhythmia is considered a life-threatening disease causing serious health issues in patients, when left untreated. An early diagnosis of arrhythmias would be helpful in saving lives. This study is conducted to classify patients into one of the sixteen subclasses, among which one class represents absence of disease and the other fifteen classes represent electrocardiogram records of various subtypes of arrhythmias. The research is carried out on the dataset taken from the University of California at Irvine Machine Learning Data Repository. The dataset contains a large volume of feature dimensions which are reduced using wrapper based feature selection technique. For multiclass classification, support vector machine (SVM) based approaches including one-against-one (OAO), one-against-all (OAA), and error-correction code (ECC) are employed to detect the presence and absence of arrhythmias. The SVM method results are compared with other standard machine learning classifiers using varying parameters and the performance of the classifiers is evaluated using accuracy, kappa statistics, and root mean square error. The results show that OAO method of SVM outperforms all other classifiers by achieving an accuracy rate of 81.11% when used with 80/20 data split and 92.07% using 90/10 data split option.

  13. Classification of Focal and Non Focal Epileptic Seizures Using Multi-Features and SVM Classifier.

    Science.gov (United States)

    Sriraam, N; Raghu, S

    2017-09-02

    Identifying epileptogenic zones prior to surgery is an essential and crucial step in treating patients having pharmacoresistant focal epilepsy. Electroencephalogram (EEG) is a significant measurement benchmark to assess patients suffering from epilepsy. This paper investigates the application of multi-features derived from different domains to recognize the focal and non focal epileptic seizures obtained from pharmacoresistant focal epilepsy patients from Bern Barcelona database. From the dataset, five different classification tasks were formed. Total 26 features were extracted from focal and non focal EEG. Significant features were selected using Wilcoxon rank sum test by setting p-value (p z > 1.96) at 95% significance interval. Hypothesis was made that the effect of removing outliers improves the classification accuracy. Turkey's range test was adopted for pruning outliers from feature set. Finally, 21 features were classified using optimized support vector machine (SVM) classifier with 10-fold cross validation. Bayesian optimization technique was adopted to minimize the cross-validation loss. From the simulation results, it was inferred that the highest sensitivity, specificity, and classification accuracy of 94.56%, 89.74%, and 92.15% achieved respectively and found to be better than the state-of-the-art approaches. Further, it was observed that the classification accuracy improved from 80.2% with outliers to 92.15% without outliers. The classifier performance metrics ensures the suitability of the proposed multi-features with optimized SVM classifier. It can be concluded that the proposed approach can be applied for recognition of focal EEG signals to localize epileptogenic zones.

  14. Polsar Land Cover Classification Based on Hidden Polarimetric Features in Rotation Domain and Svm Classifier

    Science.gov (United States)

    Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.

    2017-09-01

    Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Rollinvariant polarimetric features such as H / Ani / text-decoration: overline">α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification

  15. POLSAR LAND COVER CLASSIFICATION BASED ON HIDDEN POLARIMETRIC FEATURES IN ROTATION DOMAIN AND SVM CLASSIFIER

    Directory of Open Access Journals (Sweden)

    C.-S. Tao

    2017-09-01

    Full Text Available Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR data utilization. Rollinvariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets’ scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy

  16. Feature selection based on SVM significance maps for classification of dementia

    NARCIS (Netherlands)

    E.E. Bron (Esther); M. Smits (Marion); J.C. van Swieten (John); W.J. Niessen (Wiro); S. Klein (Stefan)

    2014-01-01

    textabstractSupport vector machine significance maps (SVM p-maps) previously showed clusters of significantly different voxels in dementiarelated brain regions. We propose a novel feature selection method for classification of dementia based on these p-maps. In our approach, the SVM p-maps are

  17. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    Science.gov (United States)

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  18. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier

    Directory of Open Access Journals (Sweden)

    Qiang Li

    2017-01-01

    Full Text Available Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS and support vector machine (SVM algorithms in a quartz crystal microbalance (QCM-based electronic nose (e-nose we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3% showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN classifier (93.3% and moving average-linear discriminant analysis (MA-LDA classifier (87.6%. The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  19. Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.

    Science.gov (United States)

    Becker, Natalia; Toedt, Grischa; Lichter, Peter; Benner, Axel

    2011-05-09

    Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net.We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone.Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error.Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters.The penalized SVM

  20. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of Neural Networks in problems of apple juice classification, this paper aim at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM. Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA, the most representative variables for a specific classification problem can be selected.

  1. Expected Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2005-08-01

    Full Text Available Every time we make a classification based on a test score, we should expect some number..of misclassifications. Some examinees whose true ability is within a score range will have..observed scores outside of that range. A procedure for providing a classification table of..true and expected scores is developed for polytomously scored items under item response..theory and applied to state assessment data. A simplified procedure for estimating the..table entries is also presented.

  2. Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data

    Directory of Open Access Journals (Sweden)

    Harris Lyndsay N

    2006-04-01

    Full Text Available Abstract Background Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results We developed a recursive support vector machine (R-SVM algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination or SVM-RFE, paying special attention to the ability of recovering the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5 %-~20 % improvement over SVM-RFE can be achieved regard to these properties. The SVM-based methods are also compared with a conventional univariate method and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data and it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in the classification performance, but univariate methods can reveal more of the differentially expressed features especially when there are correlations between the features.

  3. Lex-SVM: exploring the potential of exon expression profiling for disease classification.

    Science.gov (United States)

    Yuan, Xiongying; Zhao, Yi; Liu, Changning; Bu, Dongbo

    2011-04-01

    Exon expression profiling technologies, including exon arrays and RNA-Seq, measure the abundance of every exon in a gene. Compared with gene expression profiling technologies like 3' array, exon expression profiling technologies could detect alterations in both transcription and alternative splicing, therefore they are expected to be more sensitive in diagnosis. However, exon expression profiling also brings higher dimension, more redundancy, and significant correlation among features. Ignoring the correlation structure among exons of a gene, a popular classification method like L1-SVM selects exons individually from each gene and thus is vulnerable to noise. To overcome this limitation, we present in this paper a new variant of SVM named Lex-SVM to incorporate correlation structure among exons and known splicing patterns to promote classification performance. Specifically, we construct a new norm, ex-norm, including our prior knowledge on exon correlation structure to regularize the coefficients of a linear SVM. Lex-SVM can be solved efficiently using standard linear programming techniques. The advantage of Lex-SVM is that it can select features group-wisely, force features in a subgroup to take equal weihts and exclude the features that contradict the majority in the subgroup. Experimental results suggest that on exon expression profile, Lex-SVM is more accurate than existing methods. Lex-SVM also generates a more compact model and selects genes more consistently in cross-validation. Unlike L1-SVM selecting only one exon in a gene, Lex-SVM assigns equal weights to as many exons in a gene as possible, lending itself easier for further interpretation.

  4. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    Science.gov (United States)

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC model is used to improve classification accuracy, convergent speed to maximum accuracy, and maximum bitrates in brain computer interface (BCI system based on extracting EEG-P300 signals. First, EEG signal is filtered in order to eliminate high frequency noise. Then, the parameters of filtered EEG signal are extracted using LPC model. Finally, the samples are reconstructed by LPC coefficients and two classifiers, a Bayesian Linear discriminant analysis (BLDA, and b the υ-support vector machine (υ-SVM are applied in order to classify. The proposed algorithm performance is compared with fisher linear discriminant analysis (FLDA. Results show that the efficiency of our algorithm in improving classification accuracy and convergent speed to maximum accuracy are much better. As example at the proposed algorithms, respectively BLDA with LPC model and υ-SVM with LPC model with8 electrode configuration for subject S1 the total classification accuracy is improved as 9.4% and 1.7%. And also, subject 7 at BLDA and υ-SVM with LPC model algorithms (LPC+BLDA and LPC+ υ-SVM after block 11th converged to maximum accuracy but Fisher Linear Discriminant Analysis (FLDA algorithm did not converge to maximum accuracy (with the same configuration. So, it can be used as a promising tool in designing BCI systems.

  6. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method for screening polyps by highly trained physicians. Miss-detected polyps in colonoscopy are potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches named hand-craft feature method and convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand craft feature extraction and support vector machine (SVM) method is adopted for classification. For CNN approach, three convolution and pooling based deep learning framework is used for classification purpose. The proposed framework is evaluated using three public polyp databases. From the experimental results, we have shown that the CNN based deep learning framework shows better classification performance than the hand-craft feature based methods. It achieves over 90% of classification accuracy, sensitivity, specificity and precision.

  7. Classification Accuracy Is Not Enough

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    2013-01-01

    A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81\\%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically...

  8. SVM and SVM Ensembles in Breast Cancer Prediction.

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  9. SVM and SVM Ensembles in Breast Cancer Prediction.

    Directory of Open Access Journals (Sweden)

    Min-Wei Huang

    Full Text Available Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  10. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao

    2016-04-01

    Full Text Available With the enrichment of perception methods, modern transportation system has many physical objects whose states are influenced by many information factors so that it is a typical Cyber-Physical System (CPS. Thus, the traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research results show that the multisourced traffic information through accurate classification in the process of information fusion can achieve better parameters forecasting performance. For solving the problem of traffic information accurate classification, via analysing the characteristics of the multi-sourced traffic information and using redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM classification in information fusion, a multi-classification method using improved SVM in information fusion for traffic parameters forecasting is proposed. The experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method can get more accurate and practical outcomes.

  11. a Comparison Study of Different Kernel Functions for Svm-Based Classification of Multi-Temporal Polarimetry SAR Data

    Science.gov (United States)

    Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.

    2014-10-01

    In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imageries. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to higher Hilbert dimension space. These kernel functions include linear, polynomials and Radial Based Function (RBF). The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of H/A/α decomposition method are used in classification. The experimental tests show an SVM classifier with RBF kernel for three dates of data increases the Overall Accuracy (OA) to up to 3% in comparison to using linear kernel function, and up to 1% in comparison to a 3rd degree polynomial kernel function.

  12. Effects of atmospheric correction and pansharpening on LULC classification accuracy using WorldView-2 imagery

    Directory of Open Access Journals (Sweden)

    Chinsu Lin

    2015-05-01

    Full Text Available Changes of Land Use and Land Cover (LULC affect atmospheric, climatic, and biological spheres of the earth. Accurate LULC map offers detail information for resources management and intergovernmental cooperation to debate global warming and biodiversity reduction. This paper examined effects of pansharpening and atmospheric correction on LULC classification. Object-Based Support Vector Machine (OB-SVM and Pixel-Based Maximum Likelihood Classifier (PB-MLC were applied for LULC classification. Results showed that atmospheric correction is not necessary for LULC classification if it is conducted in the original multispectral image. Nevertheless, pansharpening plays much more important roles on the classification accuracy than the atmospheric correction. It can help to increase classification accuracy by 12% on average compared to the ones without pansharpening. PB-MLC and OB-SVM achieved similar classification rate. This study indicated that the LULC classification accuracy using PB-MLC and OB-SVM is 82% and 89% respectively. A combination of atmospheric correction, pansharpening, and OB-SVM could offer promising LULC maps from WorldView-2 multispectral and panchromatic images.

  13. Using Generalized Entropies and OC-SVM with Mahalanobis Kernel for Detection and Classification of Anomalies in Network Traffic

    Directory of Open Access Journals (Sweden)

    Jayro Santiago-Paz

    2015-09-01

    Full Text Available Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed, among them, the Anomaly-Network Intrusion Detection System (A-NIDS, which monitors network traffic and compares it against an established baseline of a “normal” traffic profile. Then, it is necessary to characterize the “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on Shannon, Rényi and Tsallis entropies of selected features, and the construction of regions from entropy data employing the Mahalanobis distance (MD, and One Class Support Vector Machine (OC-SVM with different kernels (Radial Basis Function (RBF and Mahalanobis Kernel (MK for “normal” and abnormal traffic. Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while the classification is performed under the assumption that regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. In order to evaluate our approach, two different data sets were used: one set of real traffic obtained from an Academic Local Area Network (LAN, and the other a subset of the 1998 MIT-DARPA set. For these data sets, a True positive rate up to 99.35%, a True negative rate up to 99.83% and a False negative rate at about 0.16% were yielded. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with RBF kernel improve the detection rate in the detection stage, while the novel inclusion of MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that using the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with

  14. Classification of Hyperspectral Images by SVM Using a Composite Kernel by Employing Spectral, Spatial and Hierarchical Structure Information

    Directory of Open Access Journals (Sweden)

    Yi Wang

    2018-03-01

    Full Text Available In this paper, we introduce a novel classification framework for hyperspectral images (HSIs by jointly employing spectral, spatial, and hierarchical structure information. In this framework, the three types of information are integrated into the SVM classifier in a way of multiple kernels. Specifically, the spectral kernel is constructed through each pixel’s vector value in the original HSI, and the spatial kernel is modeled by using the extended morphological profile method due to its simplicity and effectiveness. To accurately characterize hierarchical structure features, the techniques of Fish-Markov selector (FMS, marker-based hierarchical segmentation (MHSEG and algebraic multigrid (AMG are combined. First, the FMS algorithm is used on the original HSI for feature selection to produce its spectral subset. Then, the multigrid structure of this subset is constructed using the AMG method. Subsequently, the MHSEG algorithm is exploited to obtain a hierarchy consist of a series of segmentation maps. Finally, the hierarchical structure information is represented by using these segmentation maps. The main contributions of this work is to present an effective composite kernel for HSI classification by utilizing spatial structure information in multiple scales. Experiments were conducted on two hyperspectral remote sensing images to validate that the proposed framework can achieve better classification results than several popular kernel-based classification methods in terms of both qualitative and quantitative analysis. Specifically, the proposed classification framework can achieve 13.46–15.61% in average higher than the standard SVM classifier under different training sets in the terms of overall accuracy.

  15. Evaluation of Effectiveness of Wavelet Based Denoising Schemes Using ANN and SVM for Bearing Condition Classification

    Directory of Open Access Journals (Sweden)

    Vijay G. S.

    2012-01-01

    Full Text Available The wavelet based denoising has proven its ability to denoise the bearing vibration signals by improving the signal-to-noise ratio (SNR and reducing the root-mean-square error (RMSE. In this paper seven wavelet based denoising schemes have been evaluated based on the performance of the Artificial Neural Network (ANN and the Support Vector Machine (SVM, for the bearing condition classification. The work consists of two parts, the first part in which a synthetic signal simulating the defective bearing vibration signal with Gaussian noise was subjected to these denoising schemes. The best scheme based on the SNR and the RMSE was identified. In the second part, the vibration signals collected from a customized Rolling Element Bearing (REB test rig for four bearing conditions were subjected to these denoising schemes. Several time and frequency domain features were extracted from the denoised signals, out of which a few sensitive features were selected using the Fisher’s Criterion (FC. Extracted features were used to train and test the ANN and the SVM. The best denoising scheme identified, based on the classification performances of the ANN and the SVM, was found to be the same as the one obtained using the synthetic signal.

  16. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. The text pre-processing is a major role in text classifier. The efficiency of pre-processing techniques is increasing the performance of text classifier. In this paper, we are implementing ECAS stemmer, Efficient Instance Selection and Pre-computed Kernel Support Vector Machine for text classification using recent research articles. We are using better pre-processing techniques such as ECAS stemmer to find root word, Efficient Instance Selection for dimensionality reduction of text data and Pre-computed Kernel Support Vector Machine for classification of selected instances. In this experiments were performed on 750 research articles with three classes such as engineering article, medical articles and educational articles. The EIS-SVM classifier provides better performance in real-time research articles classification.

  17. Classification of cardiovascular tissues using LBP based descriptors and a cascade SVM.

    Science.gov (United States)

    Mazo, Claudia; Alegre, Enrique; Trujillo, Maria

    2017-08-01

    Histological images have characteristics, such as texture, shape, colour and spatial structure, that permit the differentiation of each fundamental tissue and organ. Texture is one of the most discriminative features. The automatic classification of tissues and organs based on histology images is an open problem, due to the lack of automatic solutions when treating tissues without pathologies. In this paper, we demonstrate that it is possible to automatically classify cardiovascular tissues using texture information and Support Vector Machines (SVM). Additionally, we realised that it is feasible to recognise several cardiovascular organs following the same process. The texture of histological images was described using Local Binary Patterns (LBP), LBP Rotation Invariant (LBPri), Haralick features and different concatenations between them, representing in this way its content. Using a SVM with linear kernel, we selected the more appropriate descriptor that, for this problem, was a concatenation of LBP and LBPri. Due to the small number of the images available, we could not follow an approach based on deep learning, but we selected the classifier who yielded the higher performance by comparing SVM with Random Forest and Linear Discriminant Analysis. Once SVM was selected as the classifier with a higher area under the curve that represents both higher recall and precision, we tuned it evaluating different kernels, finding that a linear SVM allowed us to accurately separate four classes of tissues: (i) cardiac muscle of the heart, (ii) smooth muscle of the muscular artery, (iii) loose connective tissue, and (iv) smooth muscle of the large vein and the elastic artery. The experimental validation was conducted using 3000 blocks of 100 × 100 sized pixels, with 600 blocks per class and the classification was assessed using a 10-fold cross-validation. using LBP as the descriptor, concatenated with LBPri and a SVM with linear kernel, the main four classes of tissues were

  18. Can Automatic Classification Help to Increase Accuracy in Data Collection?

    Directory of Open Access Journals (Sweden)

    Frederique Lang

    2016-09-01

    Full Text Available Purpose: The authors aim at testing the performance of a set of machine learning algorithms that could improve the process of data cleaning when building datasets. Design/methodology/approach: The paper is centered on cleaning datasets gathered from publishers and online resources by the use of specific keywords. In this case, we analyzed data from the Web of Science. The accuracy of various forms of automatic classification was tested here in comparison with manual coding in order to determine their usefulness for data collection and cleaning. We assessed the performance of seven supervised classification algorithms (Support Vector Machine (SVM, Scaled Linear Discriminant Analysis, Lasso and elastic-net regularized generalized linear models, Maximum Entropy, Regression Tree, Boosting, and Random Forest and analyzed two properties: accuracy and recall. We assessed not only each algorithm individually, but also their combinations through a voting scheme. We also tested the performance of these algorithms with different sizes of training data. When assessing the performance of different combinations, we used an indicator of coverage to account for the agreement and disagreement on classification between algorithms. Findings: We found that the performance of the algorithms used vary with the size of the sample for training. However, for the classification exercise in this paper the best performing algorithms were SVM and Boosting. The combination of these two algorithms achieved a high agreement on coverage and was highly accurate. This combination performs well with a small training dataset (10%, which may reduce the manual work needed for classification tasks. Research limitations: The dataset gathered has significantly more records related to the topic of interest compared to unrelated topics. This may affect the performance of some algorithms, especially in their identification of unrelated papers. Practical implications: Although the

  19. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    Science.gov (United States)

    Rokni Deilmai, B.; Ahmad, B. Bin; Zabihi, H.

    2014-06-01

    Mapping is essential for the analysis of the land use and land cover, which influence many environmental processes and properties. For the purpose of the creation of land cover maps, it is important to minimize error. These errors will propagate into later analyses based on these land cover maps. The reliability of land cover maps derived from remotely sensed data depends on an accurate classification. In this study, we have analyzed multispectral data using two different classifiers including Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To pursue this aim, Landsat Thematic Mapper data and identical field-based training sample datasets in Johor Malaysia used for each classification method, which results indicate in five land cover classes forest, oil palm, urban area, water, rubber. Classification results indicate that SVM was more accurate than MLC. With demonstrated capability to produce reliable cover results, the SVM methods should be especially useful for land cover classification.

  20. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    International Nuclear Information System (INIS)

    Deilmai, B Rokni; Ahmad, B Bin; Zabihi, H

    2014-01-01

    Mapping is essential for the analysis of the land use and land cover, which influence many environmental processes and properties. For the purpose of the creation of land cover maps, it is important to minimize error. These errors will propagate into later analyses based on these land cover maps. The reliability of land cover maps derived from remotely sensed data depends on an accurate classification. In this study, we have analyzed multispectral data using two different classifiers including Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To pursue this aim, Landsat Thematic Mapper data and identical field-based training sample datasets in Johor Malaysia used for each classification method, which results indicate in five land cover classes forest, oil palm, urban area, water, rubber. Classification results indicate that SVM was more accurate than MLC. With demonstrated capability to produce reliable cover results, the SVM methods should be especially useful for land cover classification

  1. SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.

  2. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    D. Blommesteijn (Dennis)

    2014-01-01

    htmlabstractText classification via supervised learning involves various steps from processing raw data, features extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the

  3. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. Supervised learning model: Support vector machine (SVM) and Sequential Minimal Optimization algorithm (SMO) for the training of SVM classifier were implemented. The SVM classifier was included in a client-server application which enables to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of the breast cancer. The sensitivity and specificity of SVM classifier were calculated based on the thermographic images from studies. Furthermore, the heuristic method for SVM's parameters tuning was proposed.

  4. Effective and efficient Grassfinch kernel for SVM classification and its application to recognition based on image set

    International Nuclear Information System (INIS)

    Du, Genyuan; Tian, Shengli; Qiu, Yingyu; Xu, Chunyan

    2016-01-01

    This paper presents an effective and efficient kernel approach to recognize image set which is represented as a point on extended Grassmannian manifold. Several recent studies focus on the applicability of discriminant analysis on Grassmannian manifold and suffer from not obtaining the inherent nonlinear structure of the data itself. Therefore, we propose an extension of Grassmannian manifold to address this issue. Instead of using a linear data embedding with PCA, we develop a non-linear data embedding of such manifold using kernel PCA. This paper mainly consider three folds: 1) introduce a non-linear data embedding of extended Grassmannian manifold, 2) derive a distance metric of Grassmannian manifold, 3) develop an effective and efficient Grassmannian kernel for SVM classification. The extended Grassmannian manifold naturally arises in the application to recognition based on image set, such as face and object recognition. Experiments on several standard databases show better classification accuracy. Furthermore, experimental results indicate that our proposed approach significantly reduces time complexity in comparison to graph embedding discriminant analysis.

  5. A structural SVM approach for reference parsing.

    Science.gov (United States)

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual reference to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure in references enables us to consider reference parsing a sequence learning problem and to study structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at token- and chunk-levels. When only basic observation features are used for each token, structural SVM achieves higher performance compared to SVM since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly, and is close to that of structural SVM after adding the second order contextual observation features. The comparison of these two methods with CRF using the same set of binary features show that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.

  6. A Fast SVM-Based Tongue’s Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Directory of Open Access Journals (Sweden)

    Nur Diyana Kamarudin

    2017-01-01

    Full Text Available In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye’s ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue’s multicolour classification based on a support vector machine (SVM whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black, deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  7. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Science.gov (United States)

    Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds. PMID:29065640

  8. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis.

    Science.gov (United States)

    Kamarudin, Nur Diyana; Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k -means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k -means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  9. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data

    Directory of Open Access Journals (Sweden)

    Yousef Malik

    2016-12-01

    Full Text Available The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN. In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.

  10. Classification Accuracy Increase Using Multisensor Data Fusion

    Science.gov (United States)

    Makarau, A.; Palubinskas, G.; Reinartz, P.

    2011-09-01

    The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.) but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (comparing to multispectral data) restrict their usage for many applications. Another improvement can be achieved by fusion approaches of multisensory data since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allow for different types of urban objects to be classified into predefined classes of interest with increased accuracy. The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided and the numerical evaluation of the method in comparison to

  11. THE APPLICATION OF SUPPORT VECTOR MACHINE (SVM USING CIELAB COLOR MODEL, COLOR INTENSITY AND COLOR CONSTANCY AS FEATURES FOR ORTHO IMAGE CLASSIFICATION OF BENTHIC HABITATS IN HINATUAN, SURIGAO DEL SUR, PHILIPPINES

    Directory of Open Access Journals (Sweden)

    J. E. Cubillas

    2016-06-01

    Full Text Available This study demonstrates the application of CIELAB, Color intensity, and One Dimensional Scalar Constancy as features for image recognition and classifying benthic habitats in an image with the coastal areas of Hinatuan, Surigao Del Sur, Philippines as the study area. The study area is composed of four datasets, namely: (a Blk66L005, (b Blk66L021, (c Blk66L024, and (d Blk66L0114. SVM optimization was performed in Matlab® software with the help of Parallel Computing Toolbox to hasten the SVM computing speed. The image used for collecting samples for SVM procedure was Blk66L0114 in which a total of 134,516 sample objects of mangrove, possible coral existence with rocks, sand, sea, fish pens and sea grasses were collected and processed. The collected samples were then used as training sets for the supervised learning algorithm and for the creation of class definitions. The learned hyper-planes separating one class from another in the multi-dimensional feature space can be thought of as a super feature which will then be used in developing the C (classifier rule set in eCognition® software. The classification results of the sampling site yielded an accuracy of 98.85% which confirms the reliability of remote sensing techniques and analysis employed to orthophotos like the CIELAB, Color Intensity and One dimensional scalar constancy and the use of SVM classification algorithm in classifying benthic habitats.

  12. SVM-based feature extraction and classification of aflatoxin contaminated corn using fluorescence hyperspectral data

    Science.gov (United States)

    Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...

  13. Automated Classification and Removal of EEG Artifacts With SVM and Wavelet-ICA.

    Science.gov (United States)

    Sai, Chong Yeh; Mokhtar, Norrima; Arof, Hamzah; Cumming, Paul; Iwahashi, Masahiro

    2018-05-01

    Brain electrical activity recordings by electroencephalography (EEG) are often contaminated with signal artifacts. Procedures for automated removal of EEG artifacts are frequently sought for clinical diagnostics and brain-computer interface applications. In recent years, a combination of independent component analysis (ICA) and discrete wavelet transform has been introduced as standard technique for EEG artifact removal. However, in performing the wavelet-ICA procedure, visual inspection or arbitrary thresholding may be required for identifying artifactual components in the EEG signal. We now propose a novel approach for identifying artifactual components separated by wavelet-ICA using a pretrained support vector machine (SVM). Our method presents a robust and extendable system that enables fully automated identification and removal of artifacts from EEG signals, without applying any arbitrary thresholding. Using test data contaminated by eye blink artifacts, we show that our method performed better in identifying artifactual components than did existing thresholding methods. Furthermore, wavelet-ICA in conjunction with SVM successfully removed target artifacts, while largely retaining the EEG source signals of interest. We propose a set of features including kurtosis, variance, Shannon's entropy, and range of amplitude as training and test data of SVM to identify eye blink artifacts in EEG signals. This combinatorial method is also extendable to accommodate multiple types of artifacts present in multichannel EEG. We envision future research to explore other descriptive features corresponding to other types of artifactual components.

  14. SVM classifier on chip for melanoma detection.

    Science.gov (United States)

    Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak

    2017-07-01

    Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.

  15. Area Determination of Diabetic Foot Ulcer Images Using a Cascaded Two-Stage SVM-Based Classification.

    Science.gov (United States)

    Wang, Lei; Pedersen, Peder C; Agu, Emmanuel; Strong, Diane M; Tulu, Bengisu

    2017-09-01

    The standard chronic wound assessment method based on visual examination is potentially inaccurate and also represents a significant clinical workload. Hence, computer-based systems providing quantitative wound assessment may be valuable for accurately monitoring wound healing status, with the wound area the best suited for automated analysis. Here, we present a novel approach, using support vector machines (SVM) to determine the wound boundaries on foot ulcer images captured with an image capture box, which provides controlled lighting and range. After superpixel segmentation, a cascaded two-stage classifier operates as follows: in the first stage, a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from superpixels that are used as input for each stage in the classifier training. Specifically, color and bag-of-word representations of local dense scale invariant feature transformation features are descriptors for ruling out irrelevant regions, and color and wavelet-based features are descriptors for distinguishing healthy tissue from wound regions. Finally, the detected wound boundary is refined by applying the conditional random field method. We have implemented the wound classification on a Nexus 5 smartphone platform, except for training which was done offline. Results are compared with other classifiers and show that our approach provides high global performance rates (average sensitivity = 73.3%, specificity = 94.6%) and is sufficiently efficient for a smartphone-based image analysis.

  16. Accuracy of automated classification of major depressive disorder as a function of symptom severity.

    Science.gov (United States)

    Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell

    2016-01-01

    Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.

  17. Prediction of novel pre-microRNAs with high accuracy through boosting and SVM.

    Science.gov (United States)

    Zhang, Yuanwei; Yang, Yifan; Zhang, Huan; Jiang, Xiaohua; Xu, Bo; Xue, Yu; Cao, Yunxia; Zhai, Qian; Zhai, Yong; Xu, Mingqing; Cooke, Howard J; Shi, Qinghua

    2011-05-15

    High-throughput deep-sequencing technology has generated an unprecedented number of expressed short sequence reads, presenting not only an opportunity but also a challenge for prediction of novel microRNAs. To verify the existence of candidate microRNAs, we have to show that these short sequences can be processed from candidate pre-microRNAs. However, it is laborious and time consuming to verify these using existing experimental techniques. Therefore, here, we describe a new method, miRD, which is constructed using two feature selection strategies based on support vector machines (SVMs) and boosting method. It is a high-efficiency tool for novel pre-microRNA prediction with accuracy up to 94.0% among different species. miRD is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/rpg/mird/mird.php.

  18. Accuracy assessment between different image classification ...

    African Journals Online (AJOL)

    What image classification does is to assign pixel to a particular land cover and land use type that has the most similar spectral signature. However, there are possibilities that different methods or algorithms of image classification of the same data set could produce appreciable variant results in the sizes, shapes and areas of ...

  19. Accuracy of automated classification of major depressive disorder as a function of symptom severity

    Directory of Open Access Journals (Sweden)

    Rajamannar Ramasubbu, MD, FRCPC, MSc

    2016-01-01

    Conclusions: Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.

  20. A linear-RBF multikernel SVM to classify big text corpora.

    Science.gov (United States)

    Romero, R; Iglesias, E L; Borrajo, L

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper shows a multikernel SVM to manage highly dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists in spreading the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.

  1. Filtering SVM frame-by-frame binary classification in a detection framework

    NARCIS (Netherlands)

    Betancourt Arango, A.; Morerio, P.; Marcenaro, L.; Rauterberg, G.W.M.; Regazzoni, C.S.

    2015-01-01

    Classifying frames, or parts of them, is a common way of carrying out detection tasks in computer vision. However, frame by frame classification suffers from sudden significant variations in image texture, colour and luminosity, resulting in noise in the extracted features and consequently in the

  2. Vehicle Detection with Occlusion Handling, Tracking, and OC-SVM Classification: A High Performance Vision-Based System

    Science.gov (United States)

    Velazquez-Pupo, Roxana; Sierra-Romero, Alberto; Torres-Roman, Deni; Shkvarko, Yuriy V.; Romero-Delgado, Misael

    2018-01-01

    This paper presents a high performance vision-based system with a single static camera for traffic surveillance, for moving vehicle detection with occlusion handling, tracking, counting, and One Class Support Vector Machine (OC-SVM) classification. In this approach, moving objects are first segmented from the background using the adaptive Gaussian Mixture Model (GMM). After that, several geometric features are extracted, such as vehicle area, height, width, centroid, and bounding box. As occlusion is present, an algorithm was implemented to reduce it. The tracking is performed with adaptive Kalman filter. Finally, the selected geometric features: estimated area, height, and width are used by different classifiers in order to sort vehicles into three classes: small, midsize, and large. Extensive experimental results in eight real traffic videos with more than 4000 ground truth vehicles have shown that the improved system can run in real time under an occlusion index of 0.312 and classify vehicles with a global detection rate or recall, precision, and F-measure of up to 98.190%, and an F-measure of up to 99.051% for midsize vehicles. PMID:29382078

  3. SVM-based glioma grading. Optimization by feature reduction analysis

    International Nuclear Information System (INIS)

    Zoellner, Frank G.; Schad, Lothar R.; Emblem, Kyrre E.; Harvard Medical School, Boston, MA; Oslo Univ. Hospital

    2012-01-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features; (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (∝87%) while reducing the number of features by up to 98%. (orig.)

  4. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    Science.gov (United States)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovery knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation service. The aim of this research is the prediction of healthiness of blood donors in Blood Transfusion Organization (BTO). For this goal, three famous algorithms such as Decision Tree C4.5, Naïve Bayesian classifier, and Support Vector Machine have been chosen and applied to a real database made of 11006 donors. Seven fields such as sex, age, job, education, marital status, type of donor, results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm with low error cost and high accuracy is SVM. This research helps BTO to realize a model from blood donors in each area in order to predict the healthy blood or unhealthy blood of donors. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  5. Tuning to optimize SVM approach for assisting ovarian cancer diagnosis with photoacoustic imaging.

    Science.gov (United States)

    Wang, Rui; Li, Rui; Lei, Yanyan; Zhu, Quing

    2015-01-01

    Support vector machine (SVM) is one of the most effective classification methods for cancer detection. The efficiency and quality of a SVM classifier depends strongly on several important features and a set of proper parameters. Here, a series of classification analyses, with one set of photoacoustic data from ovarian tissues ex vivo and a widely used breast cancer dataset- the Wisconsin Diagnostic Breast Cancer (WDBC), revealed the different accuracy of a SVM classification in terms of the number of features used and the parameters selected. A pattern recognition system is proposed by means of SVM-Recursive Feature Elimination (RFE) with the Radial Basis Function (RBF) kernel. To improve the effectiveness and robustness of the system, an optimized tuning ensemble algorithm called as SVM-RFE(C) with correlation filter was implemented to quantify feature and parameter information based on cross validation. The proposed algorithm is first demonstrated outperforming SVM-RFE on WDBC. Then the best accuracy of 94.643% and sensitivity of 94.595% were achieved when using SVM-RFE(C) to test 57 new PAT data from 19 patients. The experiment results show that the classifier constructed with SVM-RFE(C) algorithm is able to learn additional information from new data and has significant potential in ovarian cancer diagnosis.

  6. Wavelet-SVM classification and automatic recognition of unstained viable cells in phase-contrast microscopy

    International Nuclear Information System (INIS)

    Skoczylas, M.; Rakowski, W.; Cherubini, R.; Gerardi, S.

    2011-01-01

    Irradiation of individual cultured mammalian cells with a pre-selected number of ions down to one ion per single cell is a useful experimental approach to investigating the low-dose ionising radiation exposure effects and thus contributing to a more realistic human cancer risk assessment. One of the crucial tasks of all the microbeam apparatuses is the visualisation, recognition and positioning of every individual cell of the cell culture to be irradiated. Before irradiations, mammalian cells (specifically, Chinese hamster V79 cells) are seeded and grown as a monolayer on a mylar surface used as the bottom of a specially designed holder. Manual recognition of unstained cells in a bright-field microscope is a time-consuming procedure; therefore, a parallel algorithm has been conceived and developed in order to speed up this irradiation protocol step. Many technical problems have been faced to overcome the complexity of the images to be analysed: cell discrimination in an inhomogeneous background, among many disturbing bodies mainly due to the mylar surface roughness and culture medium bodies; cell shapes, depending on how they attach to the surface, which phase of the cell cycle they are in and on cell density. Preliminary results of the recognition and classification based on a method of wavelet kernels for the support vector machine classifier will be presented. (authors)

  7. 100% classification accuracy considered harmful: the normalized information transfer factor explains the accuracy paradox.

    Directory of Open Access Journals (Sweden)

    Francisco J Valverde-Albacete

    Full Text Available The most widely spread measure of performance, accuracy, suffers from a paradox: predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. Despite optimizing classification error rate, high accuracy models may fail to capture crucial information transfer in the classification task. We present evidence of this behavior by means of a combinatorial analysis where every possible contingency matrix of 2, 3 and 4 classes classifiers are depicted on the entropy triangle, a more reliable information-theoretic tool for classification assessment. Motivated by this, we develop from first principles a measure of classification performance that takes into consideration the information learned by classifiers. We are then able to obtain the entropy-modulated accuracy (EMA, a pessimistic estimate of the expected accuracy with the influence of the input distribution factored out, and the normalized information transfer factor (NIT, a measure of how efficient is the transmission of information from the input to the output set of classes. The EMA is a more natural measure of classification performance than accuracy when the heuristic to maximize is the transfer of information through the classifier instead of classification error count. The NIT factor measures the effectiveness of the learning process in classifiers and also makes it harder for them to "cheat" using techniques like specialization, while also promoting the interpretability of results. Their use is demonstrated in a mind reading task competition that aims at decoding the identity of a video stimulus based on magnetoencephalography recordings. We show how the EMA and the NIT factor reject rankings based in accuracy, choosing more meaningful and interpretable classifiers.

  8. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics.

    Science.gov (United States)

    Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai

    2017-12-26

    Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  9. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Xiaohui Lin

    2017-12-01

    Full Text Available Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  10. COMPARISON OF PERFORMANCES OF DIFFERENT SVM IMPLEMENTATIONS WHEN USED FOR AUTOMATED EVALUATION OF DESCRIPTIVE ANSWERS

    Directory of Open Access Journals (Sweden)

    C. Sunil Kumar

    2015-04-01

    Full Text Available In this paper, we studied the performances of models built using various SVM implementations during the multiclass classification task of automated evaluation of descriptive answers. The performances were evaluated on five datasets each with 900 samples and with each of the datasets treated using symmetric uncertainty feature selection filter. We quantitatively analyzed the best SVM implementation technique from amongst the 17 different SVM implementation combinations derived by using various SVM classifier libraries, SVM types and Kernel methods. Accuracy, F Score, Kappa and Area under ROC curve are used as model evaluation metrics in order to evaluate the models and rank them according to their performances. Based on the results, we derived the conclusion that SMO classifier when used with Polynomial kernel is the overall best performing classifier applicable for auto evaluation of descriptive answers.

  11. Accuracy Analysis Comparison of Supervised Classification Methods for Anomaly Detection on Levees Using SAR Imagery

    Directory of Open Access Journals (Sweden)

    Ramakalavathi Marapareddy

    2017-10-01

    Full Text Available This paper analyzes the use of a synthetic aperture radar (SAR imagery to support levee condition assessment by detecting potential slide areas in an efficient and cost-effective manner. Levees are prone to a failure in the form of internal erosion within the earthen structure and landslides (also called slough or slump slides. If not repaired, slough slides may lead to levee failures. In this paper, we compare the accuracy of the supervised classification methods minimum distance (MD using Euclidean and Mahalanobis distance, support vector machine (SVM, and maximum likelihood (ML, using SAR technology to detect slough slides on earthen levees. In this work, the effectiveness of the algorithms was demonstrated using quad-polarimetric L-band SAR imagery from the NASA Jet Propulsion Laboratory’s (JPL’s uninhabited aerial vehicle synthetic aperture radar (UAVSAR. The study area is a section of the lower Mississippi River valley in the Southern USA, where earthen flood control levees are maintained by the US Army Corps of Engineers.

  12. Predicting the Metabolic Sites by Flavin-Containing Monooxygenase on Drug Molecules Using SVM Classification on Computed Quantum Mechanics and Circular Fingerprints Molecular Descriptors.

    Directory of Open Access Journals (Sweden)

    Chien-Wei Fu

    Full Text Available As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D are computed and classified using the support vector machine (SVM for predicting some potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA- representing the nucleophilicity of central atom A and the attributes from circular fingerprints accounting the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85 and they are equally divided into the training and test sets with each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction since the available C-oxidation data was scarce. In the training process, the LibSVM package of WEKA package and the option of 10-fold cross validation are employed. The prediction performance on the test set evaluated by accuracy, Matthews correlation coefficient and area under ROC curve computed are 0.829, 0.659, and 0.877 respectively. This work reveals that the SVM model built can accurately predict the potential SOMs for drug molecules that are metabolizable by the FMO enzymes.

  13. Identification of eggs from different production systems based on hyperspectra and CS-SVM.

    Science.gov (United States)

    Sun, J; Cong, S L; Mao, H P; Zhou, X; Wu, X H; Zhang, X D

    2017-06-01

    1. To identify the origin of table eggs more accurately, a method based on hyperspectral imaging technology was studied. 2. The hyperspectral data of 200 samples of intensive and extensive eggs were collected. Standard normalised variables combined with a Savitzky-Golay were used to eliminate noise, then stepwise regression (SWR) was used for feature selection. Grid search algorithm (GS), genetic search algorithm (GA), particle swarm optimisation algorithm (PSO) and cuckoo search algorithm (CS) were applied by support vector machine (SVM) methods to establish an SVM identification model with the optimal parameters. The full spectrum data and the data after feature selection were the input of the model, while egg category was the output. 3. The SWR-CS-SVM model performed better than the other models, including SWR-GS-SVM, SWR-GA-SVM, SWR-PSO-SVM and others based on full spectral data. The training and test classification accuracy of the SWR-CS-SVM model were respectively 99.3% and 96%. 4. SWR-CS-SVM proved effective for identifying egg varieties and could also be useful for the non-destructive identification of other types of egg.

  14. Efficacy of hidden markov model over support vector machine on multiclass classification of healthy and cancerous cervical tissues

    Science.gov (United States)

    Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.

    2018-02-01

    In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.

  15. Training set extension for SVM ensemble in P300-speller with familiar face paradigm.

    Science.gov (United States)

    Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou

    2018-03-27

    P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data based on a collected small training set. A new method was developed in which two corresponding training datasets in two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.

  16. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensors. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN-SVM significantly outperforms other algorithms in terms of classification accuracy. We also show that some of these algorithms could run in real time on the prototype system. Copyright © 2013 ACM.

  17. A Power Transformers Fault Diagnosis Model Based on Three DGA Ratios and PSO Optimization SVM

    Science.gov (United States)

    Ma, Hongzhe; Zhang, Wei; Wu, Rongrong; Yang, Chunyan

    2018-03-01

    In order to make up for the shortcomings of existing transformer fault diagnosis methods in dissolved gas-in-oil analysis (DGA) feature selection and parameter optimization, a transformer fault diagnosis model based on the three DGA ratios and particle swarm optimization (PSO) optimize support vector machine (SVM) is proposed. Using transforming support vector machine to the nonlinear and multi-classification SVM, establishing the particle swarm optimization to optimize the SVM multi classification model, and conducting transformer fault diagnosis combined with the cross validation principle. The fault diagnosis results show that the average accuracy of test method is better than the standard support vector machine and genetic algorithm support vector machine, and the proposed method can effectively improve the accuracy of transformer fault diagnosis is proved.

  18. Analisis Perbandingan KNN dengan SVM untuk Klasifikasi Penyakit Diabetes Retinopati berdasarkan Citra Eksudat dan Mikroaneurisma

    Directory of Open Access Journals (Sweden)

    SUCI AULIA

    2015-01-01

    Full Text Available ABSTRAK Penelitian mengenai pengklasifikasian tingkat keparahan penyakit Diabetes Retinopati berbasis image processing masih hangat dibicarakan, citra yang biasa digunakan untuk mendeteksi jenis penyakit ini adalah citra optik disk, mikroaneurisma, eksudat, dan hemorrhages yang berasal dari citra fundus. Pada penelitian ini telah dilakukan perbandingan algoritma SVM dengan KNN untuk klasifikasi penyakit diabetes retinopati (mild, moderate, severe berdasarkan citra eksudat dan microaneurisma. Untuk proses ekstraksi ciri digunakan metode wavelet  pada masing-masing kedua metode tersebut. Pada penelitian ini digunakan 160 data uji, masing-masing 40 citra untuk kelas normal, kelas mild, kelas moderate, kelas saviere. Tingkat akurasi yang diperoleh dengan menggunakan metode KNN lebih tinggi dibandingkan SVM, yaitu 65 % dan 62%. Klasifikasi dengan algoritma KNN diperoleh hasil terbaik dengan parameter K=9 cityblock. Sedangkan klasifikasi dengan metode SVM diperoleh hasil terbaik dengan parameter One Agains All. Kata kunci: Diabetic Retinopathy, KNN , SVM, Wavelet.   ABSTRACT Research based on severity classification of the disease diabetic retinopathy by using image processing method is still hotly debated, the image is used to detect the type of this disease is an optical image of the disk, microaneurysm, exudates, and bleeding of the image of the fundus. This study was performed to compare SVM method with KNN method for classification of diabetic retinopathy disease (mild, moderate, severe based on exudate and microaneurysm image. For feature extraction uses wavelet method, and each of the two methods. This study made use of 160 test data, each of 40 images for normal class, mild class, moderate class, severe class. The accuracy obtained by KNN higher than SVM, with 65% and 62%. KNN classification method achieved the best results with the parameters K = 9, cityblock. While the classification with SVM method obtained the best results with

  19. OPTIMALISASI SUPPORT VEKTOR MACHINE (SVM UNTUK KLASIFIKASI TEMA TUGAS AKHIR BERBASIS K-MEANS

    Directory of Open Access Journals (Sweden)

    Oman Somantri

    2017-01-01

    Full Text Available The difficulty in determining the classification of students final project theme often experienced by each college. The purpose of this study is to provide a decision support for policy makers in the study program so that each student can be achieved in accordance with their own competence. From the research that has been done text mining algorithms using Support Vector Machine ( SVM and K -Means as the technology used was produced a better accuracy rate with an accuracy rate of 86.21 % when compared to the SVM without K -Means is 85 , 38 %

  20. Classification of high resolution satellite images

    OpenAIRE

    Karlsson, Anders

    2003-01-01

    In this thesis the Support Vector Machine (SVM)is applied on classification of high resolution satellite images. Sveral different measures for classification, including texture mesasures, 1st order statistics, and simple contextual information were evaluated. Additionnally, the image was segmented, using an enhanced watershed method, in order to improve the classification accuracy.

  1. IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY

    Science.gov (United States)

    Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy. Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...

  2. Research on feature extraction and classification of AE signals of fibers' tensile failure based on HHT and SVM

    Directory of Open Access Journals (Sweden)

    Yanding SHEN

    2016-10-01

    Full Text Available In order to study the feature extraction and recognition method of fibers' tensile failure, AE technology is used to collect AE signals of fiber bundle's tensile fracture of two kinds of fibers of Aramid 1313 and viscose. A transform called wavelet is used to deal with the signals to reduce noise. A method called Hilbert-Huang transform (HHT is used to extract characteristic frequencies of the signals after the noise is reduced. And a classification method called Least Squares support vector machines (LSSVM is used for the classification and recognition of characteristic frequencies of the two kinds of fibers. The results show that wavelet de-noise method can reduce some noise of the signals. Hilbert spectrum can reflect fracture circumstances of the two kinds of fibers in the time dimension to some extent. Characteristic frequencies' extraction can be done from marginal spectrum. The LSSVM can be used for the classification and recognition of characteristic frequencies. The recognition rates of Aramid 1313 and viscose reach 40%, 80% respectively, and the total recognition rate reaches 60%.

  3. GenSVM: a generalized multiclass support vector machine

    NARCIS (Netherlands)

    G.J.J. van den Burg (Gertjan); P.J.F. Groenen (Patrick)

    2016-01-01

    textabstractTraditional extensions of the binary support vector machine (SVM) to multiclass problems are either heuristics or require solving a large dual optimization problem. Here, a generalized multiclass SVM is proposed called GenSVM. In this method classification boundaries for a K-class

  4. Accurate Fluid Level Measurement in Dynamic Environment Using Ultrasonic Sensor and ν-SVM

    Directory of Open Access Journals (Sweden)

    Jenny TERZIC

    2009-10-01

    Full Text Available A fluid level measurement system based on a single Ultrasonic Sensor and Support Vector Machines (SVM based signal processing and classification system has been developed to determine the fluid level in automotive fuel tanks. The novel approach based on the ν-SVM classification method uses the Radial Basis Function (RBF to compensate for the measurement error induced by the sloshing effects in the tank caused by vehicle motion. A broad investigation on selected pre-processing filters, namely, Moving Mean, Moving Median, and Wavelet filter, has also been presented. Field drive trials were performed under normal driving conditions at various fuel volumes ranging from 5 L to 50 L to acquire sample data from the ultrasonic sensor for the training of SVM model. Further drive trials were conducted to obtain data to verify the SVM results. A comparison of the accuracy of the predicted fluid level obtained using SVM and the pre-processing filters is provided. It is demonstrated that the ν-SVM model using the RBF kernel function and the Moving Median filter has produced the most accurate outcome compared with the other signal filtration methods in terms of fluid level measurement.

  5. Land cover classification accuracy from electro-optical, X, C, and L-band Synthetic Aperture Radar data fusion

    Science.gov (United States)

    Hammann, Mark Gregory

    The fusion of electro-optical (EO) multi-spectral satellite imagery with Synthetic Aperture Radar (SAR) data was explored with the working hypothesis that the addition of multi-band SAR will increase the land-cover (LC) classification accuracy compared to EO alone. Three satellite sources for SAR imagery were used: X-band from TerraSAR-X, C-band from RADARSAT-2, and L-band from PALSAR. Images from the RapidEye satellites were the source of the EO imagery. Imagery from the GeoEye-1 and WorldView-2 satellites aided the selection of ground truth. Three study areas were chosen: Wad Medani, Sudan; Campinas, Brazil; and Fresno- Kings Counties, USA. EO imagery were radiometrically calibrated, atmospherically compensated, orthorectifed, co-registered, and clipped to a common area of interest (AOI). SAR imagery were radiometrically calibrated, and geometrically corrected for terrain and incidence angle by converting to ground range and Sigma Naught (?0). The original SAR HH data were included in the fused image stack after despeckling with a 3x3 Enhanced Lee filter. The variance and Gray-Level-Co-occurrence Matrix (GLCM) texture measures of contrast, entropy, and correlation were derived from the non-despeckled SAR HH bands. Data fusion was done with layer stacking and all data were resampled to a common spatial resolution. The Support Vector Machine (SVM) decision rule was used for the supervised classifications. Similar LC classes were identified and tested for each study area. For Wad Medani, nine classes were tested: low and medium intensity urban, sparse forest, water, barren ground, and four agriculture classes (fallow, bare agricultural ground, green crops, and orchards). For Campinas, Brazil, five generic classes were tested: urban, agriculture, forest, water, and barren ground. For the Fresno-Kings Counties location 11 classes were studied: three generic classes (urban, water, barren land), and eight specific crops. In all cases the addition of SAR to EO resulted

  6. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features; (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values ({proportional_to}87%) while reducing the number of features by up to 98%. (orig.)

  7. SVM-based multimodal classification of activities of daily living in Health Smart Homes: sensors, algorithms, and first experimental results.

    Science.gov (United States)

    Fleury, Anthony; Vacher, Michel; Noury, Norbert

    2010-03-01

    By 2050, about one third of the French population will be over 65. Our laboratory's current research focuses on the monitoring of elderly people at home, to detect a loss of autonomy as early as possible. Our aim is to quantify criteria such as the international activities of daily living (ADL) or the French Autonomie Gerontologie Groupes Iso-Ressources (AGGIR) scales, by automatically classifying the different ADL performed by the subject during the day. A Health Smart Home is used for this. Our Health Smart Home includes, in a real flat, infrared presence sensors (location), door contacts (to control the use of some facilities), temperature and hygrometry sensor in the bathroom, and microphones (sound classification and speech recognition). A wearable kinematic sensor also informs postural transitions (using pattern recognition) and walk periods (frequency analysis). This data collected from the various sensors are then used to classify each temporal frame into one of the ADL that was previously acquired (seven activities: hygiene, toilet use, eating, resting, sleeping, communication, and dressing/undressing). This is done using support vector machines. We performed a 1-h experimentation with 13 young and healthy subjects to determine the models of the different activities, and then we tested the classification algorithm (cross validation) with real data.

  8. Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification

    OpenAIRE

    Dong, Mingwen

    2018-01-01

    Music genre classification is one example of content-based analysis of music signals. Traditionally, human-engineered features were used to automatize this task and 61% accuracy has been achieved in the 10-genre classification. However, it's still below the 70% accuracy that humans could achieve in the same task. Here, we propose a new method that combines knowledge of human perception study in music genre classification and the neurophysiology of the auditory system. The method works by trai...

  9. Signal peptide discrimination and cleavage site identification using SVM and NN.

    Science.gov (United States)

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model. © 2013 Published by Elsevier Ltd.

  10. CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.

    Science.gov (United States)

    Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming

    2014-11-30

    Predication of gene regularity network (GRN) from expression data is a challenging task. There are many methods that have been developed to address this challenge ranging from supervised to unsupervised methods. Most promising methods are based on support vector machine (SVM). There is a need for comprehensive analysis on prediction accuracy of supervised method SVM using different kernels on different biological experimental conditions and network size. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRN. Using CompareSVM, we investigated and evaluated different SVM kernel methods on simulated datasets of microarray of different sizes in detail. The results obtained from CompareSVM showed that accuracy of inference method depends upon the nature of experimental condition and size of the network. For network with nodes (SVM Gaussian kernel outperform on knockout, knockdown, and multifactorial datasets compared to all the other inference methods. For network with large number of nodes (~500), choice of inference method depend upon nature of experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .

  11. ASSESSMENT OF LANDSCAPE CHARACTERISTICS ON THEMATIC IMAGE CLASSIFICATION ACCURACY

    Science.gov (United States)

    Landscape characteristics such as small patch size and land cover heterogeneity have been hypothesized to increase the likelihood of misclassifying pixels during thematic image classification. However, there has been a lack of empirical evidence, to support these hypotheses. This...

  12. [Study on application of SVM in prediction of coronary heart disease].

    Science.gov (United States)

    Zhu, Yue; Wu, Jianghua; Fang, Ying

    2013-12-01

    Base on the data of blood pressure, plasma lipid, Glu and UA by physical test, Support Vector Machine (SVM) was applied to identify coronary heart disease (CHD) in patients and non-CHD individuals in south China population for guide of further prevention and treatment of the disease. Firstly, the SVM classifier was built using radial basis kernel function, liner kernel function and polynomial kernel function, respectively. Secondly, the SVM penalty factor C and kernel parameter sigma were optimized by particle swarm optimization (PSO) and then employed to diagnose and predict the CHD. By comparison with those from artificial neural network with the back propagation (BP) model, linear discriminant analysis, logistic regression method and non-optimized SVM, the overall results of our calculation demonstrated that the classification performance of optimized RBF-SVM model could be superior to other classifier algorithm with higher accuracy rate, sensitivity and specificity, which were 94.51%, 92.31% and 96.67%, respectively. So, it is well concluded that SVM could be used as a valid method for assisting diagnosis of CHD.

  13. Application of support vector machine for classification of multispectral data

    International Nuclear Information System (INIS)

    Bahari, Nurul Iman Saiful; Ahmad, Asmala; Aboobaider, Burhanuddin Mohd

    2014-01-01

    In this paper, support vector machine (SVM) is used to classify satellite remotely sensed multispectral data. The data are recorded from a Landsat-5 TM satellite with resolution of 30x30m. SVM finds the optimal separating hyperplane between classes by focusing on the training cases. The study area of Klang Valley has more than 10 land covers and classification using SVM has been done successfully without any pixel being unclassified. The training area is determined carefully by visual interpretation and with the aid of the reference map of the study area. The result obtained is then analysed for the accuracy and visual performance. Accuracy assessment is done by determination and discussion of Kappa coefficient value, overall and producer accuracy for each class (in pixels and percentage). While, visual analysis is done by comparing the classification data with the reference map. Overall the study shows that SVM is able to classify the land covers within the study area with a high accuracy

  14. Convolutional neural network for high-accuracy functional near-infrared spectroscopy in a brain-computer interface: three-class classification of rest, right-, and left-hand motor execution.

    Science.gov (United States)

    Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong

    2018-01-01

    The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.

  15. Toward accountable land use mapping: Using geocomputation to improve classification accuracy and reveal uncertainty

    NARCIS (Netherlands)

    Beekhuizen, J.; Clarke, K.C.

    2010-01-01

    The classification of satellite imagery into land use/cover maps is a major challenge in the field of remote sensing. This research aimed at improving the classification accuracy while also revealing uncertain areas by employing a geocomputational approach. We computed numerous land use maps by

  16. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (in average) for classification of different LBPs classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBPs class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBPs classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. A Mass Spectrometric Analysis Method Based on PPCA and SVM for Early Detection of Ovarian Cancer.

    Science.gov (United States)

    Wu, Jiang; Ji, Yanju; Zhao, Ling; Ji, Mengying; Ye, Zhuang; Li, Suyi

    2016-01-01

    Background. Surfaced-enhanced laser desorption-ionization-time of flight mass spectrometry (SELDI-TOF-MS) technology plays an important role in the early diagnosis of ovarian cancer. However, the raw MS data is highly dimensional and redundant. Therefore, it is necessary to study rapid and accurate detection methods from the massive MS data. Methods. The clinical data set used in the experiments for early cancer detection consisted of 216 SELDI-TOF-MS samples. An MS analysis method based on probabilistic principal components analysis (PPCA) and support vector machine (SVM) was proposed and applied to the ovarian cancer early classification in the data set. Additionally, by the same data set, we also established a traditional PCA-SVM model. Finally we compared the two models in detection accuracy, specificity, and sensitivity. Results. Using independent training and testing experiments 10 times to evaluate the ovarian cancer detection models, the average prediction accuracy, sensitivity, and specificity of the PCA-SVM model were 83.34%, 82.70%, and 83.88%, respectively. In contrast, those of the PPCA-SVM model were 90.80%, 92.98%, and 88.97%, respectively. Conclusions. The PPCA-SVM model had better detection performance. And the model combined with the SELDI-TOF-MS technology had a prospect in early clinical detection and diagnosis of ovarian cancer.

  18. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    Science.gov (United States)

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. There were three combined input features PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4) used to test and train the SVM models. Similarly, four datasets RS90, DB433, LI1264 and SP1577 were used to develop the SVM models. These four SVM models developed were tested using three different benchmarking tests namely; (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum possible prediction accuracy of ~70% was observed in self consistency test for the SVM models of both LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features was used to test. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in independent case test, for the SVM models of above two same datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  19. A SVM-based method for sentiment analysis in Persian language

    Science.gov (United States)

    Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana

    2013-03-01

    Persian language is the official language of Iran, Tajikistan and Afghanistan. Local online users often represent their opinions and experiences on the web with written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge amount of web reviews make it difficult to give an unbiased evaluation to a product. In this paper, standard machine learning techniques SVM and naive Bayes are incorporated into the domain of online Persian Movie reviews to automatically classify user reviews as positive or negative and performance of these two classifiers is compared with each other in this language. The effects of feature presentations on classification performance are discussed. We find that accuracy is influenced by interaction between the classification models and the feature options. The SVM classifier achieves as well as or better accuracy than naive Bayes in Persian movie. Unigrams are proved better features than bigrams and trigrams in capturing Persian sentiment orientation.

  20. Gene masking - a technique to improve accuracy for cancer classification with high dimensionality in microarray data.

    Science.gov (United States)

    Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok

    2016-12-05

    High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.

  1. Estimated accuracy of classification of defects detected in welded joints by radiographic tests

    International Nuclear Information System (INIS)

    Siqueira, M.H.S.; De Silva, R.R.; De Souza, M.P.V.; Rebello, J.M.A.; Caloba, L.P.; Mery, D.

    2004-01-01

    This work is a study to estimate the accuracy of classification of the main classes of weld defects detected by radiography test, such as: undercut, lack of penetration, porosity, slag inclusion, crack or lack of fusion. To carry out this work non-linear pattern classifiers were developed, using neural networks, and the largest number of radiographic patterns as possible was used as well as statistical inference techniques of random selection of samples with and without repositioning (bootstrap) in order to estimate the accuracy of the classification. The results pointed to an estimated accuracy of around 80% for the classes of defects analyzed. (author)

  2. Estimated accuracy of classification of defects detected in welded joints by radiographic tests

    Energy Technology Data Exchange (ETDEWEB)

    Siqueira, M.H.S.; De Silva, R.R.; De Souza, M.P.V.; Rebello, J.M.A. [Federal Univ. of Rio de Janeiro, Dept., of Metallurgical and Materials Engineering, Rio de Janeiro (Brazil); Caloba, L.P. [Federal Univ. of Rio de Janeiro, Dept., of Electrical Engineering, Rio de Janeiro (Brazil); Mery, D. [Pontificia Unversidad Catolica de Chile, Escuela de Ingenieria - DCC, Dept. de Ciencia de la Computacion, Casilla, Santiago (Chile)

    2004-07-01

    This work is a study to estimate the accuracy of classification of the main classes of weld defects detected by radiography test, such as: undercut, lack of penetration, porosity, slag inclusion, crack or lack of fusion. To carry out this work non-linear pattern classifiers were developed, using neural networks, and the largest number of radiographic patterns as possible was used as well as statistical inference techniques of random selection of samples with and without repositioning (bootstrap) in order to estimate the accuracy of the classification. The results pointed to an estimated accuracy of around 80% for the classes of defects analyzed. (author)

  3. Density-based penalty parameter optimization on C-SVM.

    Science.gov (United States)

    Liu, Yun; Lian, Jie; Bartolacci, Michael R; Zeng, Qing-An

    2014-01-01

    The support vector machine (SVM) is one of the most widely used approaches for data classification and regression. SVM achieves the largest distance between the positive and negative support vectors, which neglects the remote instances away from the SVM interface. In order to avoid a position change of the SVM interface as the result of an error system outlier, C-SVM was implemented to decrease the influences of the system's outliers. Traditional C-SVM holds a uniform parameter C for both positive and negative instances; however, according to the different number proportions and the data distribution, positive and negative instances should be set with different weights for the penalty parameter of the error terms. Therefore, in this paper, we propose density-based penalty parameter optimization of C-SVM. The experiential results indicated that our proposed algorithm has outstanding performance with respect to both precision and recall.

  4. LMD Based Features for the Automatic Seizure Detection of EEG Signals Using SVM.

    Science.gov (United States)

    Zhang, Tao; Chen, Wanzhong

    2017-08-01

    Achieving the goal of detecting seizure activity automatically using electroencephalogram (EEG) signals is of great importance and significance for the treatment of epileptic seizures. To realize this aim, a newly-developed time-frequency analytical algorithm, namely local mean decomposition (LMD), is employed in the presented study. LMD is able to decompose an arbitrary signal into a series of product functions (PFs). Primarily, the raw EEG signal is decomposed into several PFs, and then the temporal statistical and non-linear features of the first five PFs are calculated. The features of each PF are fed into five classifiers, including back propagation neural network (BPNN), K-nearest neighbor (KNN), linear discriminant analysis (LDA), un-optimized support vector machine (SVM) and SVM optimized by genetic algorithm (GA-SVM), for five classification cases, respectively. Confluent features of all PFs and raw EEG are further passed into the high-performance GA-SVM for the same classification tasks. Experimental results on the international public Bonn epilepsy EEG dataset show that the average classification accuracy of the presented approach are equal to or higher than 98.10% in all the five cases, and this indicates the effectiveness of the proposed approach for automated seizure detection.

  5. A WFS-SVM Model for Soil Salinity Mapping in Keriya Oasis, Northwestern China Using Polarimetric Decomposition and Fully PolSAR Data

    Directory of Open Access Journals (Sweden)

    Ilyas Nurmemet

    2018-04-01

    Full Text Available Timely monitoring and mapping of salt-affected areas are essential for the prevention of land degradation and sustainable soil management in arid and semi-arid regions. The main objective of this study was to develop Synthetic Aperture Radar (SAR polarimetry techniques for improved soil salinity mapping in the Keriya Oasis in the Xinjiang Uyghur Autonomous Region (Xinjiang, China, where salinized soil appears to be a major threat to local agricultural productivity. Multiple polarimetric target decomposition, optimal feature subset selection (wrapper feature selector, WFS, and support vector machine (SVM algorithms were used for optimal soil salinization classification using quad-polarized PALSAR-2 data. A threefold exercise was conducted. First, 16 polarimetric decomposition methods were implemented and a wide range of polarimetric parameters and SAR discriminators were derived in order to mine hidden information in PolSAR data. Second, the optimal polarimetric feature subset that constitutes 19 polarimetric elements was selected adopting the WFS approach; optimum classification parameters were identified, and the optimal SVM classification model was obtained by employing a cross-validation method. Third, the WFS-SVM classification model was constructed, optimized, and implemented based on the optimal match of polarimetric features and optimum classification parameters. Soils with different salinization degrees (i.e., highly, moderately and slightly salinized soils were extracted. Finally, classification results were compared with the Wishart supervised classification and conventional SVM classification to examine the performance of the proposed method for salinity mapping. Detailed field investigations and ground data were used for the validation of the adopted methods. The overall accuracy and kappa coefficient of the proposed WFS-SVM model were 87.57% and 0.85, respectively that were much higher than those obtained by the Wishart supervised

  6. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.

  7. MAMMOGRAMS ANALYSIS USING SVM CLASSIFIER IN COMBINED TRANSFORMS DOMAIN

    Directory of Open Access Journals (Sweden)

    B.N. Prathibha

    2011-02-01

    Full Text Available Breast cancer is a primary cause of mortality and morbidity in women. Reports reveal that earlier the detection of abnormalities, better the improvement in survival. Digital mammograms are one of the most effective means for detecting possible breast anomalies at early stages. Digital mammograms supported with Computer Aided Diagnostic (CAD systems help the radiologists in taking reliable decisions. The proposed CAD system extracts wavelet features and spectral features for the better classification of mammograms. The Support Vector Machines classifier is used to analyze 206 mammogram images from Mias database pertaining to the severity of abnormality, i.e., benign and malign. The proposed system gives 93.14% accuracy for discrimination between normal-malign and 87.25% accuracy for normal-benign samples and 89.22% accuracy for benign-malign samples. The study reveals that features extracted in hybrid transform domain with SVM classifier proves to be a promising tool for analysis of mammograms.

  8. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.

    Science.gov (United States)

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-05-22

    significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition.

  9. The Sample Size Influence in the Accuracy of the Image Classification of the Remote Sensing

    Directory of Open Access Journals (Sweden)

    Thomaz C. e C. da Costa

    2004-12-01

    Full Text Available Landuse/landcover maps produced by classification of remote sensing images incorporate uncertainty. This uncertainty is measured by accuracy indices using reference samples. The size of the reference sample is defined by approximation by a binomial function without the use of a pilot sample. This way the accuracy are not estimated, but fixed a priori. In case of divergency between the estimated and a priori accuracy the error of the sampling will deviate from the expected error. The size using pilot sample (theorically correct procedure justify when haven´t estimate of accuracy for work area, referent the product remote sensing utility.

  10. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO for cancer feature gene selection, coupling support vector machine (SVM for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV. Finally, the BQPSO coupling SVM (BQPSO/SVM, binary PSO coupling SVM (BPSO/SVM, and genetic algorithm coupling SVM (GA/SVM are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  11. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363

  12. Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children

    Science.gov (United States)

    Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.

    2018-01-01

    Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…

  13. Study on Classification Accuracy Inspection of Land Cover Data Aided by Automatic Image Change Detection Technology

    Science.gov (United States)

    Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.

    2018-04-01

    The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.

  14. Assessing the Accuracy and Consistency of Language Proficiency Classification under Competing Measurement Models

    Science.gov (United States)

    Zhang, Bo

    2010-01-01

    This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…

  15. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    Science.gov (United States)

    Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...

  16. Automatic epileptic seizure detection in EEGs using MF-DFA, SVM based on cloud computing.

    Science.gov (United States)

    Zhang, Zhongnan; Wen, Tingxi; Huang, Wei; Wang, Meihong; Li, Chunfeng

    2017-01-01

    Epilepsy is a chronic disease with transient brain dysfunction that results from the sudden abnormal discharge of neurons in the brain. Since electroencephalogram (EEG) is a harmless and noninvasive detection method, it plays an important role in the detection of neurological diseases. However, the process of analyzing EEG to detect neurological diseases is often difficult because the brain electrical signals are random, non-stationary and nonlinear. In order to overcome such difficulty, this study aims to develop a new computer-aided scheme for automatic epileptic seizure detection in EEGs based on multi-fractal detrended fluctuation analysis (MF-DFA) and support vector machine (SVM). New scheme first extracts features from EEG by MF-DFA during the first stage. Then, the scheme applies a genetic algorithm (GA) to calculate parameters used in SVM and classify the training data according to the selected features using SVM. Finally, the trained SVM classifier is exploited to detect neurological diseases. The algorithm utilizes MLlib from library of SPARK and runs on cloud platform. Applying to a public dataset for experiment, the study results show that the new feature extraction method and scheme can detect signals with less features and the accuracy of the classification reached up to 99%. MF-DFA is a promising approach to extract features for analyzing EEG, because of its simple algorithm procedure and less parameters. The features obtained by MF-DFA can represent samples as well as traditional wavelet transform and Lyapunov exponents. GA can always find useful parameters for SVM with enough execution time. The results illustrate that the classification model can achieve comparable accuracy, which means that it is effective in epileptic seizure detection.

  17. Boosted classification trees result in minor to modest improvement in the accuracy in classifying cardiovascular outcomes compared to conventional classification trees

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S

    2011-01-01

    Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181

  18. A Support Vector Machine Hydrometeor Classification Algorithm for Dual-Polarization Radar

    Directory of Open Access Journals (Sweden)

    Nicoletta Roberto

    2017-07-01

    Full Text Available An algorithm based on a support vector machine (SVM is proposed for hydrometeor classification. The training phase is driven by the output of a fuzzy logic hydrometeor classification algorithm, i.e., the most popular approach for hydrometer classification algorithms used for ground-based weather radar. The performance of SVM is evaluated by resorting to a weather scenario, generated by a weather model; the corresponding radar measurements are obtained by simulation and by comparing results of SVM classification with those obtained by a fuzzy logic classifier. Results based on the weather model and simulations show a higher accuracy of the SVM classification. Objective comparison of the two classifiers applied to real radar data shows that SVM classification maps are spatially more homogenous (textural indices, energy, and homogeneity increases by 21% and 12% respectively and do not present non-classified data. The improvements found by SVM classifier, even though it is applied pixel-by-pixel, can be attributed to its ability to learn from the entire hyperspace of radar measurements and to the accurate training. The reliability of results and higher computing performance make SVM attractive for some challenging tasks such as its implementation in Decision Support Systems for helping pilots to make optimal decisions about changes inthe flight route caused by unexpected adverse weather.

  19. Universum Learning for Multiclass SVM

    OpenAIRE

    Dhar, Sauptik; Ramakrishnan, Naveen; Cherkassky, Vladimir; Shah, Mohak

    2016-01-01

    We introduce Universum learning for multiclass problems and propose a novel formulation for multiclass universum SVM (MU-SVM). We also propose a span bound for MU-SVM that can be used for model selection thereby avoiding resampling. Empirical results demonstrate the effectiveness of MU-SVM and the proposed bound.

  20. Impacts of land use/cover classification accuracy on regional climate simulations

    Science.gov (United States)

    Ge, Jianjun; Qi, Jiaguo; Lofgren, Brent M.; Moore, Nathan; Torbick, Nathan; Olson, Jennifer M.

    2007-03-01

    Land use/cover change has been recognized as a key component in global change. Various land cover data sets, including historically reconstructed, recently observed, and future projected, have been used in numerous climate modeling studies at regional to global scales. However, little attention has been paid to the effect of land cover classification accuracy on climate simulations, though accuracy assessment has become a routine procedure in land cover production community. In this study, we analyzed the behavior of simulated precipitation in the Regional Atmospheric Modeling System (RAMS) over a range of simulated classification accuracies over a 3 month period. This study found that land cover accuracy under 80% had a strong effect on precipitation especially when the land surface had a greater control of the atmosphere. This effect became stronger as the accuracy decreased. As shown in three follow-on experiments, the effect was further influenced by model parameterizations such as convection schemes and interior nudging, which can mitigate the strength of surface boundary forcings. In reality, land cover accuracy rarely obtains the commonly recommended 85% target. Its effect on climate simulations should therefore be considered, especially when historically reconstructed and future projected land covers are employed.

  1. Impacts of Sample Design for Validation Data on the Accuracy of Feedforward Neural Network Classification

    Directory of Open Access Journals (Sweden)

    Giles M. Foody

    2017-08-01

    Full Text Available Validation data are often used to evaluate the performance of a trained neural network and used in the selection of a network deemed optimal for the task at-hand. Optimality is commonly assessed with a measure, such as overall classification accuracy. The latter is often calculated directly from a confusion matrix showing the counts of cases in the validation set with particular labelling properties. The sample design used to form the validation set can, however, influence the estimated magnitude of the accuracy. Commonly, the validation set is formed with a stratified sample to give balanced classes, but also via random sampling, which reflects class abundance. It is suggested that if the ultimate aim is to accurately classify a dataset in which the classes do vary in abundance, a validation set formed via random, rather than stratified, sampling is preferred. This is illustrated with the classification of simulated and remotely-sensed datasets. With both datasets, statistically significant differences in the accuracy with which the data could be classified arose from the use of validation sets formed via random and stratified sampling (z = 2.7 and 1.9 for the simulated and real datasets respectively, for both p < 0.05%. The accuracy of the classifications that used a stratified sample in validation were smaller, a result of cases of an abundant class being commissioned into a rarer class. Simple means to address the issue are suggested.

  2. A Study on SVM Based on the Weighted Elitist Teaching-Learning-Based Optimization and Application in the Fault Diagnosis of Chemical Process

    Directory of Open Access Journals (Sweden)

    Cao Junxiang

    2015-01-01

    Full Text Available Teaching-Learning-Based Optimization (TLBO is a new swarm intelligence optimization algorithm that simulates the class learning process. According to such problems of the traditional TLBO as low optimizing efficiency and poor stability, this paper proposes an improved TLBO algorithm mainly by introducing the elite thought in TLBO and adopting different inertia weight decreasing strategies for elite and ordinary individuals of the teacher stage and the student stage. In this paper, the validity of the improved TLBO is verified by the optimizations of several typical test functions and the SVM optimized by the weighted elitist TLBO is used in the diagnosis and classification of common failure data of the TE chemical process. Compared with the SVM combining other traditional optimizing methods, the SVM optimized by the weighted elitist TLBO has a certain improvement in the accuracy of fault diagnosis and classification.

  3. Grouped fuzzy SVM with EM-based partition of sample space for clustered microcalcification detection.

    Science.gov (United States)

    Wang, Huiya; Feng, Jun; Wang, Hongyu

    2017-07-20

    Detection of clustered microcalcification (MC) from mammograms plays essential roles in computer-aided diagnosis for early stage breast cancer. To tackle problems associated with the diversity of data structures of MC lesions and the variability of normal breast tissues, multi-pattern sample space learning is required. In this paper, a novel grouped fuzzy Support Vector Machine (SVM) algorithm with sample space partition based on Expectation-Maximization (EM) (called G-FSVM) is proposed for clustered MC detection. The diversified pattern of training data is partitioned into several groups based on EM algorithm. Then a series of fuzzy SVM are integrated for classification with each group of samples from the MC lesions and normal breast tissues. From DDSM database, a total of 1,064 suspicious regions are selected from 239 mammography, and the measurement of Accuracy, True Positive Rate (TPR), False Positive Rate (FPR) and EVL = TPR* 1-FPR are 0.82, 0.78, 0.14 and 0.72, respectively. The proposed method incorporates the merits of fuzzy SVM and multi-pattern sample space learning, decomposing the MC detection problem into serial simple two-class classification. Experimental results from synthetic data and DDSM database demonstrate that our integrated classification framework reduces the false positive rate significantly while maintaining the true positive rate.

  4. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n(3)) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  5. Biased binomial assessment of cross-validated estimation of classification accuracies illustrated in diagnosis predictions

    Directory of Open Access Journals (Sweden)

    Quentin Noirhomme

    2014-01-01

    Full Text Available Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain–computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.

  6. Biased binomial assessment of cross-validated estimation of classification accuracies illustrated in diagnosis predictions.

    Science.gov (United States)

    Noirhomme, Quentin; Lesenfants, Damien; Gomez, Francisco; Soddu, Andrea; Schrouff, Jessica; Garraux, Gaëtan; Luxen, André; Phillips, Christophe; Laureys, Steven

    2014-01-01

    Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain-computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.

  7. Influence of different topographic correction strategies on mountain vegetation classification accuracy in the Lancang Watershed, China

    Science.gov (United States)

    Zhang, Zhiming; de Wulf, Robert R.; van Coillie, Frieke M. B.; Verbeke, Lieven P. C.; de Clercq, Eva M.; Ou, Xiaokun

    2011-01-01

    Mapping of vegetation using remote sensing in mountainous areas is considerably hampered by topographic effects on the spectral response pattern. A variety of topographic normalization techniques have been proposed to correct these illumination effects due to topography. The purpose of this study was to compare six different topographic normalization methods (Cosine correction, Minnaert correction, C-correction, Sun-canopy-sensor correction, two-stage topographic normalization, and slope matching technique) for their effectiveness in enhancing vegetation classification in mountainous environments. Since most of the vegetation classes in the rugged terrain of the Lancang Watershed (China) did not feature a normal distribution, artificial neural networks (ANNs) were employed as a classifier. Comparing the ANN classifications, none of the topographic correction methods could significantly improve ETM+ image classification overall accuracy. Nevertheless, at the class level, the accuracy of pine forest could be increased by using topographically corrected images. On the contrary, oak forest and mixed forest accuracies were significantly decreased by using corrected images. The results also showed that none of the topographic normalization strategies was satisfactorily able to correct for the topographic effects in severely shadowed areas.

  8. Comparison of SVM, RF and ELM on an Electronic Nose for the Intelligent Evaluation of Paraffin Samples

    Directory of Open Access Journals (Sweden)

    Hong Men

    2018-01-01

    Full Text Available Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA and Partial Least Squares (PLS. Support Vector Machine (SVM, Random Forest (RF, and Extreme Learning Machine (ELM were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.

  9. Discrimination between Alzheimer's Disease and Mild Cognitive Impairment Using SOM and PSO-SVM

    Directory of Open Access Journals (Sweden)

    Shih-Ting Yang

    2013-01-01

    Full Text Available In this study, an MRI-based classification framework was proposed to distinguish the patients with AD and MCI from normal participants by using multiple features and different classifiers. First, we extracted features (volume and shape from MRI data by using a series of image processing steps. Subsequently, we applied principal component analysis (PCA to convert a set of features of possibly correlated variables into a smaller set of values of linearly uncorrelated variables, decreasing the dimensions of feature space. Finally, we developed a novel data mining framework in combination with support vector machine (SVM and particle swarm optimization (PSO for the AD/MCI classification. In order to compare the hybrid method with traditional classifier, two kinds of classifiers, that is, SVM and a self-organizing map (SOM, were trained for patient classification. With the proposed framework, the classification accuracy is improved up to 82.35% and 77.78% in patients with AD and MCI. The result achieved up to 94.12% and 88.89% in AD and MCI by combining the volumetric features and shape features and using PCA. The present results suggest that novel multivariate methods of pattern matching reach a clinically relevant accuracy for the a priori prediction of the progression from MCI to AD.

  10. Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN

    Science.gov (United States)

    Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai

    2017-01-01

    Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. PMID:28737705

  11. Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN.

    Science.gov (United States)

    Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai; Zhou, Jiehan; Zhang, Weishan

    2017-07-24

    Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed.

  12. An IPSO-SVM algorithm for security state prediction of mine production logistics system

    Science.gov (United States)

    Zhang, Yanliang; Lei, Junhui; Ma, Qiuli; Chen, Xin; Bi, Runfang

    2017-06-01

    A theoretical basis for the regulation of corporate security warning and resources was provided in order to reveal the laws behind the security state in mine production logistics. Considering complex mine production logistics system and the variable is difficult to acquire, a superior security status predicting model of mine production logistics system based on the improved particle swarm optimization and support vector machine (IPSO-SVM) is proposed in this paper. Firstly, through the linear adjustments of inertia weight and learning weights, the convergence speed and search accuracy are enhanced with the aim to deal with situations associated with the changeable complexity and the data acquisition difficulty. The improved particle swarm optimization (IPSO) is then introduced to resolve the problem of parameter settings in traditional support vector machines (SVM). At the same time, security status index system is built to determine the classification standards of safety status. The feasibility and effectiveness of this method is finally verified using the experimental results.

  13. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM).

    Science.gov (United States)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-15

    In this work, the data fusion strategy of Fourier transform mid infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machine (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used for selecting an optimal combination of key wavenumbers of second derivative FT-MIR spectra, and thirteen elements were sorted with variable importance in projection (VIP) scores. Secondly, thirteen subsets of multi-elements with the best VIP score were generated and each subset was used to fuse with FT-MIR. Finally, the classification models were established by SVM, and the combination of parameter C and γ (gamma) of SVM models was calculated by the approaches of grid search (GS) and genetic algorithm (GA). The results showed that both GS-SVM and GA-SVM models achieved good performances based on the #9 subset and the prediction accuracy in calibration and validation sets of the two models were 81.40% and 90.91%, correspondingly. In conclusion, it indicated that the data fusion strategy of FT-MIR and ICP-AES coupled with the algorithm of SVM can be used as a reliable tool for accurate identification of B. edulis, and it can provide a useful way of thinking for the quality control of edible mushrooms. Copyright © 2017. Published by Elsevier B.V.

  14. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM)

    Science.gov (United States)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-01

    In this work, the data fusion strategy of Fourier transform mid infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machine (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used for selecting an optimal combination of key wavenumbers of second derivative FT-MIR spectra, and thirteen elements were sorted with variable importance in projection (VIP) scores. Secondly, thirteen subsets of multi-elements with the best VIP score were generated and each subset was used to fuse with FT-MIR. Finally, the classification models were established by SVM, and the combination of parameter C and γ (gamma) of SVM models was calculated by the approaches of grid search (GS) and genetic algorithm (GA). The results showed that both GS-SVM and GA-SVM models achieved good performances based on the #9 subset and the prediction accuracy in calibration and validation sets of the two models were 81.40% and 90.91%, correspondingly. In conclusion, it indicated that the data fusion strategy of FT-MIR and ICP-AES coupled with the algorithm of SVM can be used as a reliable tool for accurate identification of B. edulis, and it can provide a useful way of thinking for the quality control of edible mushrooms.

  15. A Novel Feature Extraction Approach Using Window Function Capturing and QPSO-SVM for Enhancing Electronic Nose Performance

    Directory of Open Access Journals (Sweden)

    Xiuzhen Guo

    2015-06-01

    Full Text Available In this paper, a novel feature extraction approach which can be referred to as moving window function capturing (MWFC has been proposed to analyze signals of an electronic nose (E-nose used for detecting types of infectious pathogens in rat wounds. Meanwhile, a quantum-behaved particle swarm optimization (QPSO algorithm is implemented in conjunction with support vector machine (SVM for realizing a synchronization optimization of the sensor array and SVM model parameters. The results prove the efficacy of the proposed method for E-nose feature extraction, which can lead to a higher classification accuracy rate compared to other established techniques. Meanwhile it is interesting to note that different classification results can be obtained by changing the types, widths or positions of windows. By selecting the optimum window function for the sensor response, the performance of an E-nose can be enhanced.

  16. An SVM-Based Solution for Fault Detection in Wind Turbines

    Directory of Open Access Journals (Sweden)

    Pedro Santos

    2015-03-01

    Full Text Available Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  17. Applications of PCA and SVM-PSO Based Real-Time Face Recognition System

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Shieh

    2014-01-01

    Full Text Available This paper incorporates principal component analysis (PCA with support vector machine-particle swarm optimization (SVM-PSO for developing real-time face recognition systems. The integrated scheme aims to adopt the SVM-PSO method to improve the validity of PCA based image recognition systems on dynamically visual perception. The face recognition for most human-robot interaction applications is accomplished by PCA based method because of its dimensionality reduction. However, PCA based systems are only suitable for processing the faces with the same face expressions and/or under the same view directions. Since the facial feature selection process can be considered as a problem of global combinatorial optimization in machine learning, the SVM-PSO is usually used as an optimal classifier of the system. In this paper, the PSO is used to implement a feature selection, and the SVMs serve as fitness functions of the PSO for classification problems. Experimental results demonstrate that the proposed method simplifies features effectively and obtains higher classification accuracy.

  18. An SVM-based solution for fault detection in wind turbines.

    Science.gov (United States)

    Santos, Pedro; Villa, Luisa F; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús

    2015-03-09

    Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  19. Integrated Features by Administering the Support Vector Machine (SVM of Translational Initiations Sites in Alternative Polymorphic Contex

    Directory of Open Access Journals (Sweden)

    Nurul Arneida Husin

    2012-04-01

    Full Text Available Many algorithms and methods have been proposed for classification problems in bioinformatics. In this study, the discriminative approach in particular support vector machines (SVM is employed to recognize the studied TIS patterns. The applied discriminative approach is used to learn about some discriminant functions of samples that have been labelled as positive or negative. After learning, the discriminant functions are employed to decide whether a new sample is true or false. In this study, support vector machines (SVM is employed to recognize the patterns for studied translational initiation sites in alternative weak context. The method has been optimized with the best parameters selected; c=100, E=10-6 and ex=2 for non linear kernel function. Results show that with top 5 features and non linear kernel, the best prediction accuracy achieved is 95.8%. J48 algorithm is applied to compare with SVM with top 15 features and the results show a good prediction accuracy of 95.8%. This indicates that the top 5 features selected by the IGR method and that are performed by SVM are sufficient to use in the prediction of TIS in weak contexts.

  20. Classification of Herbaceous Vegetation Using Airborne Hyperspectral Imagery

    Directory of Open Access Journals (Sweden)

    Péter Burai

    2015-02-01

    Full Text Available Alkali landscapes hold an extremely fine-scale mosaic of several vegetation types, thus it seems challenging to separate these classes by remote sensing. Our aim was to test the applicability of different image classification methods of hyperspectral data in this complex situation. To reach the highest classification accuracy, we tested traditional image classifiers (maximum likelihood classifier—MLC, machine learning algorithms (support vector machine—SVM, random forest—RF and feature extraction (minimum noise fraction (MNF-transformation on training datasets of different sizes. Digital images were acquired from an AISA EAGLE II hyperspectral sensor of 128 contiguous bands (400–1000 nm, a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. For the classification, we established twenty vegetation classes based on the dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF (minimum noise fraction transformed dataset with various training sample sizes between 10 and 30 pixels. In order to select the optimal number of the transformed features, we applied SVM, RF and MLC classification to 2–15 MNF transformed bands. In the case of the original bands, SVM and RF classifiers provided high accuracy irrespective of the number of the training pixels. We found that SVM and RF produced the best accuracy when using the first nine MNF transformed bands; involving further features did not increase classification accuracy. SVM and RF provided high accuracies with the transformed bands, especially in the case of the aggregated groups. Even MLC provided high accuracy with 30 training pixels (80.78%, but the use of a smaller training dataset (10 training pixels significantly reduced the accuracy of classification (52.56%. Our results suggest that in alkali landscapes, the application of SVM is a feasible solution, as it provided the highest accuracies compared to RF and MLC

  1. Application of SVM on satellite images to detect hotspots in Jharia coal field region of India

    Energy Technology Data Exchange (ETDEWEB)

    Gautam, R.S.; Singh, D.; Mittal, A.; Sajin, P. [Indian Institute for Technology, Roorkee (India)

    2008-07-01

    The present paper deals with the application of Support Vector Machine (SVM) and image analysis techniques on NOAA/AVHRR satellite image to detect hotspots on the Jharia coal field region of India. One of the major advantages of using these satellite data is that the data are free with very good temporal resolution; while, one drawback is that these have low spatial resolution (i.e., approximately 1.1 km at nadir). Therefore, it is important to do research by applying some efficient optimization techniques along with the image analysis techniques to rectify these drawbacks and use satellite images for efficient hotspot detection and monitoring. For this purpose, SVM and multi-threshold techniques are explored for hotspot detection. The multi-threshold algorithm is developed to remove the cloud coverage from the land coverage. This algorithm also highlights the hotspots or fire spots in the suspected regions. SVM has the advantage over multi-thresholding technique that it can learn patterns from the examples and therefore is used to optimize the performance by removing the false points which are highlighted in the threshold technique. Both approaches can be used separately or in combination depending on the size of the image. The RBF (Radial Basis Function) kernel is used in training of three sets of inputs: brightness temperature of channel 3, Normalized Difference Vegetation Index (NDVI) and Global Environment Monitoring Index (GEMI), respectively. This makes a classified image in the output that highlights the hotspot and non-hotspot pixels. The performance of the SVM is also compared with the performance obtained from the neural networks and SVM appears to detect hotspots more accurately (greater than 91% classification accuracy) with lesser false alarm rate. The results obtained are found to be in good agreement with the ground based observations of the hotspots.

  2. Recursive Cluster Elimination (RCE for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Full Text Available Abstract Background Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared, poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE rather than recursive feature elimination (RFE. We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs, a supervised machine learning classification method, to identify and score (rank those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA with recursive feature elimination (SVM-RFE and PDA-RFE are used to remove genes based on their individual discriminant weights. Conclusion SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  3. Classification of visualization exudates fundus images results using ...

    African Journals Online (AJOL)

    The kernel function settings; linear, polynomial, quadratic and RBF have an effect on the classification results. For SVM1, the best parameter in classifying pixels is linear kernel function. The visualization results using CAC and radar chart are classified using ts accuracy. It has proven to discriminated exudates and non ...

  4. SVM and ANFIS Models for precipitaton Modeling (Case Study: GonbadKavouse

    Directory of Open Access Journals (Sweden)

    N. Zabet Pishkhani

    2016-10-01

    Full Text Available Introduction: In recent years, according to the intelligent models increased as new techniques and tools in hydrological processes such as precipitation forecasting. ANFIS model has good ability in train, construction and classification, and also has the advantage that allows the extraction of fuzzy rules from numerical information or knowledge. Another intelligent technique in recent years has been used in various areas is support vector machine (SVM. In this paper the ability of artificial intelligence methods including support vector machine (SVM and adaptive neuro fuzzy inference system (ANFIS were analyzed in monthly precipitation prediction. Materials and Methods: The study area was the city of Gonbad in Golestan Province. The city has a temperate climate in the southern highlands and southern plains, mountains and temperate humid, semi-arid and semi-arid in the north of Gorganroud river. In total, the city's climate is temperate and humid. In the present study, monthly precipitation was modeled in Gonbad using ANFIS and SVM and two different database structures were designed. The first structure: input layer consisted of mean temperature, relative humidity, pressure and wind speed at Gonbad station. The second structure: According to Pearson coefficient, the monthly precipitation data were used from four stations: Arazkoose, Bahalke, Tamar and Aqqala which had a higher correlation with Gonbad station precipitation. In this study precipitation data was used from 1995 to 2012. 80% data were used for model training and the remaining 20% of data for validation. SVM was developed from support vector machines in the 1990s by Vapnik. SVM has been widely recognized as a powerful tool to deal with function fitting problems. An Adaptive Neuro-Fuzzy Inference System (ANFIS refers, in general, to an adaptive network which performs the function of a fuzzy inference system. The most commonly used fuzzy system in ANFIS architectures is the Sugeno model

  5. Using spectrotemporal indices to improve the fruit-tree crop classification accuracy

    Science.gov (United States)

    Peña, M. A.; Liao, R.; Brenning, A.

    2017-06-01

    This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.

  6. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  7. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Full Text Available Maximum likelihood classifier (MLC and support vector machines (SVM are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  8. An assessment of support vector machines for land cover classification

    Science.gov (United States)

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  9. Improvement of User's Accuracy Through Classification of Principal Component Images and Stacked Temporal Images

    Institute of Scientific and Technical Information of China (English)

    Nilanchal Patel; Brijesh Kumar Kaushal

    2010-01-01

    The classification accuracy of the various categories on the classified remotely sensed images are usually evaluated by two different measures of accuracy, namely, producer's accuracy (PA) and user's accuracy (UA). The PA of a category indicates to what extent the reference pixels of the category are correctly classified, whereas the UA ora category represents to what extent the other categories are less misclassified into the category in question. Therefore, the UA of the various categories determines the reliability of their interpretation on the classified image and is more important to the analyst than the PA. The present investigation has been performed in order to determine ifthere occurs improvement in the UA of the various categories on the classified image of the principal components of the original bands and on the classified image of the stacked image of two different years. We performed the analyses using the IRS LISS Ⅲ images of two different years, i.e., 1996 and 2009, that represent the different magnitude of urbanization and the stacked image of these two years pertaining to Ranchi area, Jharkhand, India, with a view to assessing the impacts of urbanization on the UA of the different categories. The results of the investigation demonstrated that there occurs significant improvement in the UA of the impervious categories in the classified image of the stacked image, which is attributable to the aggregation of the spectral information from twice the number of bands from two different years. On the other hand, the classified image of the principal components did not show any improvement in the UA as compared to the original images.

  10. Per-field crop classification in irrigated agricultural regions in middle Asia using random forest and support vector machine ensemble

    Science.gov (United States)

    Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher

    2012-10-01

    Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.

  11. Speed and accuracy of facial expression classification in avoidant personality disorder: a preliminary study.

    Science.gov (United States)

    Rosenthal, M Zachary; Kim, Kwanguk; Herr, Nathaniel R; Smoski, Moria J; Cheavens, Jennifer S; Lynch, Thomas R; Kosson, David S

    2011-10-01

    The aim of this preliminary study was to examine whether individuals with avoidant personality disorder (APD) could be characterized by deficits in the classification of dynamically presented facial emotional expressions. Using a community sample of adults with APD (n = 17) and non-APD controls (n = 16), speed and accuracy of facial emotional expression recognition was investigated in a task that morphs facial expressions from neutral to prototypical expressions (Multi-Morph Facial Affect Recognition Task; Blair, Colledge, Murray, & Mitchell, 2001). Results indicated that individuals with APD were significantly more likely than controls to make errors when classifying fully expressed fear. However, no differences were found between groups in the speed to correctly classify facial emotional expressions. The findings are some of the first to investigate facial emotional processing in a sample of individuals with APD and point to an underlying deficit in processing social cues that may be involved in the maintenance of APD.

  12. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    OpenAIRE

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better ...

  13. A COMPARISON OF HAZE REMOVAL ALGORITHMS AND THEIR IMPACTS ON CLASSIFICATION ACCURACY FOR LANDSAT IMAGERY

    Directory of Open Access Journals (Sweden)

    Yang Xiao

    Full Text Available The quality of Landsat images in humid areas is considerably degraded by haze in terms of their spectral response pattern, which limits the possibility of their application in using visible and near-infrared bands. A variety of haze removal algorithms have been proposed to correct these unsatisfactory illumination effects caused by the haze contamination. The purpose of this study was to illustrate the difference of two major algorithms (the improved homomorphic filtering (HF and the virtual cloud point (VCP for their effectiveness in solving spatially varying haze contamination, and to evaluate the impacts of haze removal on land cover classification. A case study with exploiting large quantities of Landsat TM images and climates (clear and haze in the most humid areas in China proved that these haze removal algorithms both perform well in processing Landsat images contaminated by haze. The outcome of the application of VCP appears to be more similar to the reference images compared to HF. Moreover, the Landsat image with VCP haze removal can improve the classification accuracy effectively in comparison to that without haze removal, especially in the cloudy contaminated area

  14. Accuracy of the all patient refined diagnosis related groups classification system in congenital heart surgery.

    Science.gov (United States)

    Parnell, Aimee S; Shults, Justine; Gaynor, J William; Leonard, Mary B; Dai, Dingwei; Feudtner, Chris

    2014-02-01

    Administrative data are increasingly used to evaluate clinical outcomes and quality of care in pediatric congenital heart surgery (CHS) programs. Several published analyses of large pediatric administrative data sets have relied on the All Patient Refined Diagnosis Related Groups (APR-DRG, version 24) diagnostic classification system. The accuracy of this classification system for patients undergoing CHS is unclear. We performed a retrospective cohort study of all 14,098 patients 0 to 5 years of age undergoing any of six selected congenital heart operations, ranging in complexity from isolated closure of a ventricular septal defect to single-ventricle palliation, at 40 tertiary-care pediatric centers in the Pediatric Health Information Systems database between 2007 and 2010. Assigned APR-DRGs (cardiac versus noncardiac) were compared using χ2 or Fisher's exact tests between those patients admitted during the first day of life versus later and between those receiving extracorporeal membrane oxygenation support versus those not. Recursive partitioning was used to assess the greatest determinants of APR-DRG type in the model. Every patient admitted on day 1 of life was assigned to a noncardiac APR-DRG (pDRG (pDRG experienced a significantly increased mortality (pDRG coding has systematic misclassifications, which may result in inaccurate reporting of CHS case volumes and mortality. Copyright © 2014 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.

  15. Comparison of ANN (MLP), ANFIS, SVM, and RF models for the online classification of heating value of burning municipal solid waste in circulating fluidized bed incinerators.

    Science.gov (United States)

    You, Haihui; Ma, Zengyi; Tang, Yijun; Wang, Yuelan; Yan, Jianhua; Ni, Mingjiang; Cen, Kefa; Huang, Qunxing

    2017-10-01

    The heating values, particularly lower heating values of burning municipal solid waste are critically important parameters in operating circulating fluidized bed incineration systems. However, the heating values change widely and frequently, while there is no reliable real-time instrument to measure heating values in the process of incinerating municipal solid waste. A rapid, cost-effective, and comparative methodology was proposed to evaluate the heating values of burning MSW online based on prior knowledge, expert experience, and data-mining techniques. First, selecting the input variables of the model by analyzing the operational mechanism of circulating fluidized bed incinerators, and the corresponding heating value was classified into one of nine fuzzy expressions according to expert advice. Development of prediction models by employing four different nonlinear models was undertaken, including a multilayer perceptron neural network, a support vector machine, an adaptive neuro-fuzzy inference system, and a random forest; a series of optimization schemes were implemented simultaneously in order to improve the performance of each model. Finally, a comprehensive comparison study was carried out to evaluate the performance of the models. Results indicate that the adaptive neuro-fuzzy inference system model outperforms the other three models, with the random forest model performing second-best, and the multilayer perceptron model performing at the worst level. A model with sufficient accuracy would contribute adequately to the control of circulating fluidized bed incinerator operation and provide reliable heating value signals for an automatic combustion control system. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Enhancing the Performance of LibSVM Classifier by Kernel F-Score Feature Selection

    Science.gov (United States)

    Sarojini, Balakrishnan; Ramaraj, Narayanasamy; Nickolas, Savarimuthu

    Medical Data mining is the search for relationships and patterns within the medical datasets that could provide useful knowledge for effective clinical decisions. The inclusion of irrelevant, redundant and noisy features in the process model results in poor predictive accuracy. Much research work in data mining has gone into improving the predictive accuracy of the classifiers by applying the techniques of feature selection. Feature selection in medical data mining is appreciable as the diagnosis of the disease could be done in this patient-care activity with minimum number of significant features. The objective of this work is to show that selecting the more significant features would improve the performance of the classifier. We empirically evaluate the classification effectiveness of LibSVM classifier on the reduced feature subset of diabetes dataset. The evaluations suggest that the feature subset selected improves the predictive accuracy of the classifier and reduce false negatives and false positives.

  17. Implementation of support vector machine for classification of speech marked hijaiyah letters based on Mel frequency cepstrum coefficient feature extraction

    Science.gov (United States)

    Adhi Pradana, Wisnu; Adiwijaya; Novia Wisesty, Untari

    2018-03-01

    Support Vector Machine or commonly called SVM is one method that can be used to process the classification of a data. SVM classifies data from 2 different classes with hyperplane. In this study, the system was built using SVM to develop Arabic Speech Recognition. In the development of the system, there are 2 kinds of speakers that have been tested that is dependent speakers and independent speakers. The results from this system is an accuracy of 85.32% for speaker dependent and 61.16% for independent speakers.

  18. An SVM Based Approach for the Analysis Of Mammography Images

    Science.gov (United States)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhöfel, K.; Tangaro, S.

    2007-09-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity—sensitivity. The results show a relation between the number of features used and the SVM's performance.

  19. An SVM Based Approach for the Analysis Of Mammography Images

    International Nuclear Information System (INIS)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhoefel, K.; Tangaro, S.

    2007-01-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity - sensitivity. The results show a relation between the number of features used and the SVM's performance

  20. Comparison of accuracy of fibrosis degree classifications by liver biopsy and non-invasive tests in chronic hepatitis C.

    Science.gov (United States)

    Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul

    2011-11-30

    Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.

  1. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a few number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on support vector machine recursive feature elimination (SVM-RFE) principle which is improved to solve multiclass classification, called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds single optimum hyperplane, the main objective Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71,4% of the overall test data correctly, using 100 and 1000 genes selected from multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.

  2. Classification of Stellar Spectra with Fuzzy Minimum Within-Class ...

    Indian Academy of Sciences (India)

    Liu Zhong-bao

    2017-06-19

    Jun 19, 2017 ... 2School of Information, Business College of Shanxi University, Taiyuan 030031, China. ∗ ... tor Machine (SVM) is a typical classification method, which is widely used in ... In the research of spectra classification with SVM,.

  3. Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.

    Science.gov (United States)

    Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens

    2016-01-01

    MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.

  4. Improving the Classification Accuracy for Near-Infrared Spectroscopy of Chinese Salvia miltiorrhiza Using Local Variable Selection

    Directory of Open Access Journals (Sweden)

    Lianqing Zhu

    2018-01-01

    Full Text Available In order to improve the classification accuracy of Chinese Salvia miltiorrhiza using near-infrared spectroscopy, a novel local variable selection strategy is thus proposed. Combining the strengths of the local algorithm and interval partial least squares, the spectra data have firstly been divided into several pairs of classes in sample direction and equidistant subintervals in variable direction. Then, a local classification model has been built, and the most proper spectral region has been selected based on the new evaluation criterion considering both classification error rate and best predictive ability under the leave-one-out cross validation scheme for each pair of classes. Finally, each observation can be assigned to belong to the class according to the statistical analysis of classification results of the local classification model built on selected variables. The performance of the proposed method was demonstrated through near-infrared spectra of cultivated or wild Salvia miltiorrhiza, which are collected from 8 geographical origins in 5 provinces of China. For comparison, soft independent modelling of class analogy and partial least squares discriminant analysis methods are, respectively, employed as the classification model. Experimental results showed that classification performance of the classification model with local variable selection was obvious better than that without variable selection.

  5. Measurement Properties and Classification Accuracy of Two Spanish Parent Surveys of Language Development for Preschool-Age Children

    Science.gov (United States)

    Guiberson, Mark; Rodriguez, Barbara L.

    2010-01-01

    Purpose: To describe the concurrent validity and classification accuracy of 2 Spanish parent surveys of language development, the Spanish Ages and Stages Questionnaire (ASQ; Squires, Potter, & Bricker, 1999) and the Pilot Inventario-III (Pilot INV-III; Guiberson, 2008a). Method: Forty-eight Spanish-speaking parents of preschool-age children…

  6. Learning using privileged information: SVM+ and weighted SVM.

    Science.gov (United States)

    Lapin, Maksim; Hein, Matthias; Schiele, Bernt

    2014-05-01

    Prior knowledge can be used to improve predictive performance of learning algorithms or reduce the amount of data required for training. The same goal is pursued within the learning using privileged information paradigm which was recently introduced by Vapnik et al. and is aimed at utilizing additional information available only at training time-a framework implemented by SVM+. We relate the privileged information to importance weighting and show that the prior knowledge expressible with privileged features can also be encoded by weights associated with every training example. We show that a weighted SVM can always replicate an SVM+ solution, while the converse is not true and we construct a counterexample highlighting the limitations of SVM+. Finally, we touch on the problem of choosing weights for weighted SVMs when privileged features are not available. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Point Based Emotion Classification Using SVM

    OpenAIRE

    Swinkels, Wout

    2016-01-01

    The detection of emotions is a hot topic in the area of computer vision. Emotions are based on subtle changes in the face that are intuitively detected and interpreted by humans. Detecting these subtle changes, based on mathematical models, is a great challenge in the area of computer vision. In this thesis a new method is proposed to achieve state-of-the-art emotion detection performance. This method is based on facial feature points to monitor subtle changes in the face. Therefore the c...

  8. SVM and SVM Ensembles in Breast Cancer Prediction

    OpenAIRE

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction per...

  9. Increasing accuracy of vehicle detection from conventional vehicle detectors - counts, speeds, classification, and travel time.

    Science.gov (United States)

    2014-09-01

    Vehicle classification is an important traffic parameter for transportation planning and infrastructure : management. Length-based vehicle classification from dual loop detectors is among the lowest cost : technologies commonly used for collecting th...

  10. Classification of different tomato seed cultivars by multispectral visible-near infrared spectroscopy and chemometrics

    DEFF Research Database (Denmark)

    Shrestha, Santosh; Deleuran, Lise Christina; Gislum, René

    2016-01-01

    nm were extracted from multispectral images of tomato seeds. Principal component analysis (PCA) was used for data exploration, while partial least squares discriminant analysis (PLS-DA) and support vector machine discriminant analysis (SVM-DA) were used to classify the five different tomato cultivars....... The results showed very good classification accuracy for two independent test sets ranging from 94% to 100% for all tomato cultivars irrespective of chemometric methods. The overall classification error rates were 3.2% and 0.4% for the PLS-DA and SVM-DA calibration models, respectively. The results indicate...

  11. An SVM model with hybrid kernels for hydrological time series

    Science.gov (United States)

    Wang, C.; Wang, H.; Zhao, X.; Xie, Q.

    2017-12-01

    Support Vector Machine (SVM) models have been widely applied to the forecast of climate/weather and its impact on other environmental variables such as hydrologic response to climate/weather. When using SVM, the choice of the kernel function plays the key role. Conventional SVM models mostly use one single type of kernel function, e.g., radial basis kernel function. Provided that there are several featured kernel functions available, each having its own advantages and drawbacks, a combination of these kernel functions may give more flexibility and robustness to SVM approach, making it suitable for a wide range of application scenarios. This paper presents such a linear combination of radial basis kernel and polynomial kernel for the forecast of monthly flowrate in two gaging stations using SVM approach. The results indicate significant improvement in the accuracy of predicted series compared to the approach with either individual kernel function, thus demonstrating the feasibility and advantages of such hybrid kernel approach for SVM applications.

  12. Power quality events recognition using a SVM-based method

    Energy Technology Data Exchange (ETDEWEB)

    Cerqueira, Augusto Santiago; Ferreira, Danton Diego; Ribeiro, Moises Vidal; Duque, Carlos Augusto [Department of Electrical Circuits, Federal University of Juiz de Fora, Campus Universitario, 36036 900, Juiz de Fora MG (Brazil)

    2008-09-15

    In this paper, a novel SVM-based method for power quality event classification is proposed. A simple approach for feature extraction is introduced, based on the subtraction of the fundamental component from the acquired voltage signal. The resulting signal is presented to a support vector machine for event classification. Results from simulation are presented and compared with two other methods, the OTFR and the LCEC. The proposed method shown an improved performance followed by a reasonable computational cost. (author)

  13. STUDY COMPARISON OF SVM-, K-NN- AND BACKPROPAGATION-BASED CLASSIFIER FOR IMAGE RETRIEVAL

    Directory of Open Access Journals (Sweden)

    Muhammad Athoillah

    2015-03-01

    Full Text Available Classification is a method for compiling data systematically according to the rules that have been set previously. In recent years classification method has been proven to help many people’s work, such as image classification, medical biology, traffic light, text classification etc. There are many methods to solve classification problem. This variation method makes the researchers find it difficult to determine which method is best for a problem. This framework is aimed to compare the ability of classification methods, such as Support Vector Machine (SVM, K-Nearest Neighbor (K-NN, and Backpropagation, especially in study cases of image retrieval with five category of image dataset. The result shows that K-NN has the best average result in accuracy with 82%. It is also the fastest in average computation time with 17,99 second during retrieve session for all categories class. The Backpropagation, however, is the slowest among three of them. In average it needed 883 second for training session and 41,7 second for retrieve session.

  14. A fast learning method for large scale and multi-class samples of SVM

    Science.gov (United States)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A multi-class classification SVM(Support Vector Machine) fast learning method based on binary tree is presented to solve its low learning efficiency when SVM processing large scale multi-class samples. This paper adopts bottom-up method to set up binary tree hierarchy structure, according to achieved hierarchy structure, sub-classifier learns from corresponding samples of each node. During the learning, several class clusters are generated after the first clustering of the training samples. Firstly, central points are extracted from those class clusters which just have one type of samples. For those which have two types of samples, cluster numbers of their positive and negative samples are set respectively according to their mixture degree, secondary clustering undertaken afterwards, after which, central points are extracted from achieved sub-class clusters. By learning from the reduced samples formed by the integration of extracted central points above, sub-classifiers are obtained. Simulation experiment shows that, this fast learning method, which is based on multi-level clustering, can guarantee higher classification accuracy, greatly reduce sample numbers and effectively improve learning efficiency.

  15. Automated, high accuracy classification of Parkinsonian disorders: a pattern recognition approach.

    Directory of Open Access Journals (Sweden)

    Andre F Marquand

    Full Text Available Progressive supranuclear palsy (PSP, multiple system atrophy (MSA and idiopathic Parkinson's disease (IPD can be clinically indistinguishable, especially in the early stages, despite distinct patterns of molecular pathology. Structural neuroimaging holds promise for providing objective biomarkers for discriminating these diseases at the single subject level but all studies to date have reported incomplete separation of disease groups. In this study, we employed multi-class pattern recognition to assess the value of anatomical patterns derived from a widely available structural neuroimaging sequence for automated classification of these disorders. To achieve this, 17 patients with PSP, 14 with IPD and 19 with MSA were scanned using structural MRI along with 19 healthy controls (HCs. An advanced probabilistic pattern recognition approach was employed to evaluate the diagnostic value of several pre-defined anatomical patterns for discriminating the disorders, including: (i a subcortical motor network; (ii each of its component regions and (iii the whole brain. All disease groups could be discriminated simultaneously with high accuracy using the subcortical motor network. The region providing the most accurate predictions overall was the midbrain/brainstem, which discriminated all disease groups from one another and from HCs. The subcortical network also produced more accurate predictions than the whole brain and all of its constituent regions. PSP was accurately predicted from the midbrain/brainstem, cerebellum and all basal ganglia compartments; MSA from the midbrain/brainstem and cerebellum and IPD from the midbrain/brainstem only. This study demonstrates that automated analysis of structural MRI can accurately predict diagnosis in individual patients with Parkinsonian disorders, and identifies distinct patterns of regional atrophy particularly useful for this process.

  16. Bagging Approach for Increasing Classification Accuracy of CART on Family Participation Prediction in Implementation of Elderly Family Development Program

    Directory of Open Access Journals (Sweden)

    Wisoedhanie Widi Anugrahanti

    2017-06-01

    Full Text Available Classification and Regression Tree (CART was a method of Machine Learning where data exploration was done by decision tree technique. CART was a classification technique with binary recursive reconciliation algorithms where the sorting was performed on a group of data collected in a space called a node / node into two child nodes (Lewis, 2000. The aim of this study was to predict family participation in Elderly Family Development program based on family behavior in providing physical, mental, social care for the elderly. Family involvement accuracy using Bagging CART method was calculated based on 1-APER value, sensitivity, specificity, and G-Means. Based on CART method, classification accuracy was obtained 97,41% with Apparent Error Rate value 2,59%. The most important determinant of family behavior as a sorter was society participation (100,00000, medical examination (98,95988, providing nutritious food (68.60476, establishing communication (67,19877 and worship (57,36587. To improved the stability and accuracy of CART prediction, used CART Bootstrap Aggregating (Bagging with 100% accuracy result. Bagging CART classifies a total of 590 families (84.77% were appropriately classified into implement elderly Family Development program class.

  17. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameters optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, and ROC curve, and so forth, are adopted to appraise the performances of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkable higher efficiency compared with traditional SVM algorithms.

  18. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    Science.gov (United States)

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.

  19. A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data

    Science.gov (United States)

    Gajda, Agnieszka; Wójtowicz-Nowakowska, Anna

    2013-04-01

    A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data Land cover maps are generally produced on the basis of high resolution imagery. Recently, LiDAR (Light Detection and Ranging) data have been brought into use in diverse applications including land cover mapping. In this study we attempted to assess the accuracy of land cover classification using both high resolution aerial imagery and LiDAR data (airborne laser scanning, ALS), testing two classification approaches: a pixel-based classification and object-oriented image analysis (OBIA). The study was conducted on three test areas (3 km2 each) in the administrative area of Kraków, Poland, along the course of the Vistula River. They represent three different dominating land cover types of the Vistula River valley. Test site 1 had a semi-natural vegetation, with riparian forests and shrubs, test site 2 represented a densely built-up area, and test site 3 was an industrial site. Point clouds from ALS and ortophotomaps were both captured in November 2007. Point cloud density was on average 16 pt/m2 and it contained additional information about intensity and encoded RGB values. Ortophotomaps had a spatial resolution of 10 cm. From point clouds two raster maps were generated: intensity (1) and (2) normalised Digital Surface Model (nDSM), both with the spatial resolution of 50 cm. To classify the aerial data, a supervised classification approach was selected. Pixel based classification was carried out in ERDAS Imagine software. Ortophotomaps and intensity and nDSM rasters were used in classification. 15 homogenous training areas representing each cover class were chosen. Classified pixels were clumped to avoid salt and pepper effect. Object oriented image object classification was carried out in eCognition software, which implements both the optical and ALS data. Elevation layers (intensity, firs/last reflection, etc.) were used at segmentation stage due to

  20. A simulated Linear Mixture Model to Improve Classification Accuracy of Satellite Data Utilizing Degradation of Atmospheric Effect

    Directory of Open Access Journals (Sweden)

    WIDAD Elmahboub

    2005-02-01

    Full Text Available Researchers in remote sensing have attempted to increase the accuracy of land cover information extracted from remotely sensed imagery. Factors that influence the supervised and unsupervised classification accuracy are the presence of atmospheric effect and mixed pixel information. A linear mixture simulated model experiment is generated to simulate real world data with known end member spectral sets and class cover proportions (CCP. The CCP were initially generated by a random number generator and normalized to make the sum of the class proportions equal to 1.0 using MATLAB program. Random noise was intentionally added to pixel values using different combinations of noise levels to simulate a real world data set. The atmospheric scattering error is computed for each pixel value for three generated images with SPOT data. Accuracy can either be classified or misclassified. Results portrayed great improvement in classified accuracy, for example, in image 1, misclassified pixels due to atmospheric noise is 41 %. Subsequent to the degradation of atmospheric effect, the misclassified pixels were reduced to 4 %. We can conclude that accuracy of classification can be improved by degradation of atmospheric noise.

  1. Ensemble support vector machine classification of dementia using structural MRI and mini-mental state examination.

    Science.gov (United States)

    Sørensen, Lauge; Nielsen, Mads

    2018-05-15

    The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    Full Text Available With the advent of technological era, conversion of scanned document (handwritten or printed into machine editable format has attracted many researchers. This paper deals with the problem of recognition of Gujarati handwritten numerals. Gujarati numeral recognition requires performing some specific steps as a part of preprocessing. For preprocessing digitization, segmentation, normalization and thinning are done with considering that the image have almost no noise. Further affine invariant moments based model is used for feature extraction and finally Support Vector Machine (SVM and Fuzzy classifiers are used for numeral classification. . The comparison of SVM and Fuzzy classifier is made and it can be seen that SVM procured better results as compared to Fuzzy Classifier.

  3. Identification and classification of similar looking food grains

    Science.gov (United States)

    Anami, B. S.; Biradar, Sunanda D.; Savakar, D. G.; Kulkarni, P. V.

    2013-01-01

    This paper describes the comparative study of Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers by taking a case study of identification and classification of four pairs of similar looking food grains namely, Finger Millet, Mustard, Soyabean, Pigeon Pea, Aniseed, Cumin-seeds, Split Greengram and Split Blackgram. Algorithms are developed to acquire and process color images of these grains samples. The developed algorithms are used to extract 18 colors-Hue Saturation Value (HSV), and 42 wavelet based texture features. Back Propagation Neural Network (BPNN)-based classifier is designed using three feature sets namely color - HSV, wavelet-texture and their combined model. SVM model for color- HSV model is designed for the same set of samples. The classification accuracies ranging from 93% to 96% for color-HSV, ranging from 78% to 94% for wavelet texture model and from 92% to 97% for combined model are obtained for ANN based models. The classification accuracy ranging from 80% to 90% is obtained for color-HSV based SVM model. Training time required for the SVM based model is substantially lesser than ANN for the same set of images.

  4. Comparison of accuracy of fibrosis degree classifications by liver biopsy and non-invasive tests in chronic hepatitis C

    Directory of Open Access Journals (Sweden)

    Boursier Jérôme

    2011-11-01

    Full Text Available Abstract Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts and blood tests, Metavir fibrosis (FM stage accuracy was 64.4% in local pathologists vs. 82.2% (p -3 in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p -3. In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55 and FibroMeter3G (0.14 ± 0.37, p -3 or Fibrotest (0.84 ± 0.80, p -3. In population #3 (and #4 including 458 (359 patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%, Fibroscan: 64.9% (50.7%, FibroMeter2G: 68.7% (68.2%, FibroMeter3G: 77.1% (83.4%, p -3 (p -3. Significant discrepancy (≥ 2 FM rates were, respectively: Fibrotest: 21.3% (22.2%, Fibroscan: 12.9% (12.3%, FibroMeter2G: 5.7% (6

  5. Improved classification accuracy of powdery mildew infection levels of wine grapes by spatial-spectral analysis of hyperspectral images.

    Science.gov (United States)

    Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo

    2017-01-01

    Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples

  6. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    Science.gov (United States)

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  7. A novel transmission line protection using DOST and SVM

    Directory of Open Access Journals (Sweden)

    M. Jaya Bharata Reddy

    2016-06-01

    Full Text Available This paper proposes a smart fault detection, classification and location (SFDCL methodology for transmission systems with multi-generators using discrete orthogonal Stockwell transform (DOST. The methodology is based on synchronized current measurements from remote telemetry units (RTUs installed at both ends of the transmission line. The energy coefficients extracted from the transient current signals due to occurrence of different types of faults using DOST are being utilized for real-time fault detection and classification. Support vector machine (SVM has been deployed for locating the fault distance using the extracted coefficients. A comparative study is performed for establishing the superiority of SVM over other popular computational intelligence methods, such as adaptive neuro-fuzzy inference system (ANFIS and artificial neural network (ANN, for more precise and reliable estimation of fault distance. The results corroborate the effectiveness of the suggested SFDCL algorithm for real-time transmission line fault detection, classification and localization.

  8. Progressive Classification Using Support Vector Machines

    Science.gov (United States)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user

  9. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadly diseases, according to data from WHO by 2015 there are 8.8 million more deaths caused by cancer, and this will increase every year if not resolved earlier. Microarray data has become one of the most popular cancer-identification studies in the field of health, since microarray data can be used to look at levels of gene expression in certain cell samples that serve to analyze thousands of genes simultaneously. By using data mining technique, we can classify the sample of microarray data thus it can be identified with cancer or not. In this paper we will discuss some research using some data mining techniques using microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and simulation of Random Forest algorithm with technique of reduction dimension using Relief. The result of this paper show performance measure (accuracy) from classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forets).The results in this paper show the accuracy of Random Forest algorithm higher than other classification algorithms (Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost generated from each Data Mining Classification Technique based on microarray data.

  10. Hybrid Brain–Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review

    OpenAIRE

    Hong, Keum-Shik; Khan, Muhammad Jawad

    2017-01-01

    In this article, non-invasive hybrid brain–computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spec...

  11. Classification of masses on mammograms using support vector machine

    Science.gov (United States)

    Chu, Yong; Li, Lihua; Goldgof, Dmitry B.; Qui, Yan; Clark, Robert A.

    2003-05-01

    Mammography is the most effective method for early detection of breast cancer. However, the positive predictive value for classification of malignant and benign lesion from mammographic images is not very high. Clinical studies have shown that most biopsies for cancer are very low, between 15% and 30%. It is important to increase the diagnostic accuracy by improving the positive predictive value to reduce the number of unnecessary biopsies. In this paper, a new classification method was proposed to distinguish malignant from benign masses in mammography by Support Vector Machine (SVM) method. Thirteen features were selected based on receiver operating characteristic (ROC) analysis of classification using individual feature. These features include four shape features, two gradient features and seven Laws features. With these features, SVM was used to classify the masses into two categories, benign and malignant, in which a Gaussian kernel and sequential minimal optimization learning technique are performed. The data set used in this study consists of 193 cases, in which there are 96 benign cases and 97 malignant cases. The leave-one-out evaluation of SVM classifier was taken. The results show that the positive predict value of the presented method is 81.6% with the sensitivity of 83.7% and the false-positive rate of 30.2%. It demonstrated that the SVM-based classifier is effective in mass classification.

  12. Hybrid Brain–Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review

    Science.gov (United States)

    Hong, Keum-Shik; Khan, Muhammad Jawad

    2017-01-01

    In this article, non-invasive hybrid brain–computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS), electromyography (EMG), electrooculography (EOG), and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features) relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain–computer interface (BCI) accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP) and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided. PMID:28790910

  13. Hybrid Brain-Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review.

    Science.gov (United States)

    Hong, Keum-Shik; Khan, Muhammad Jawad

    2017-01-01

    In this article, non-invasive hybrid brain-computer interface (hBCI) technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG), due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS), electromyography (EMG), electrooculography (EOG), and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features) relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain-computer interface (BCI) accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP) and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided.

  14. Hybrid Brain–Computer Interface Techniques for Improved Classification Accuracy and Increased Number of Commands: A Review

    Directory of Open Access Journals (Sweden)

    Keum-Shik Hong

    2017-07-01

    Full Text Available In this article, non-invasive hybrid brain–computer interface (hBCI technologies for improving classification accuracy and increasing the number of commands are reviewed. Hybridization combining more than two modalities is a new trend in brain imaging and prosthesis control. Electroencephalography (EEG, due to its easy use and fast temporal resolution, is most widely utilized in combination with other brain/non-brain signal acquisition modalities, for instance, functional near infrared spectroscopy (fNIRS, electromyography (EMG, electrooculography (EOG, and eye tracker. Three main purposes of hybridization are to increase the number of control commands, improve classification accuracy and reduce the signal detection time. Currently, such combinations of EEG + fNIRS and EEG + EOG are most commonly employed. Four principal components (i.e., hardware, paradigm, classifiers, and features relevant to accuracy improvement are discussed. In the case of brain signals, motor imagination/movement tasks are combined with cognitive tasks to increase active brain–computer interface (BCI accuracy. Active and reactive tasks sometimes are combined: motor imagination with steady-state evoked visual potentials (SSVEP and motor imagination with P300. In the case of reactive tasks, SSVEP is most widely combined with P300 to increase the number of commands. Passive BCIs, however, are rare. After discussing the hardware and strategies involved in the development of hBCI, the second part examines the approaches used to increase the number of control commands and to enhance classification accuracy. The future prospects and the extension of hBCI in real-time applications for daily life scenarios are provided.

  15. Hyperspectral image classification using Support Vector Machine

    International Nuclear Information System (INIS)

    Moughal, T A

    2013-01-01

    Classification of land cover hyperspectral images is a very challenging task due to the unfavourable ratio between the number of spectral bands and the number of training samples. The focus in many applications is to investigate an effective classifier in terms of accuracy. The conventional multiclass classifiers have the ability to map the class of interest but the considerable efforts and large training sets are required to fully describe the classes spectrally. Support Vector Machine (SVM) is suggested in this paper to deal with the multiclass problem of hyperspectral imagery. The attraction to this method is that it locates the optimal hyper plane between the class of interest and the rest of the classes to separate them in a new high-dimensional feature space by taking into account only the training samples that lie on the edge of the class distributions known as support vectors and the use of the kernel functions made the classifier more flexible by making it robust against the outliers. A comparative study has undertaken to find an effective classifier by comparing Support Vector Machine (SVM) to the other two well known classifiers i.e. Maximum likelihood (ML) and Spectral Angle Mapper (SAM). At first, the Minimum Noise Fraction (MNF) was applied to extract the best possible features form the hyperspectral imagery and then the resulting subset of the features was applied to the classifiers. Experimental results illustrate that the integration of MNF and SVM technique significantly reduced the classification complexity and improves the classification accuracy.

  16. Classification and Accuracy Assessment for Coarse Resolution Mapping within the Great Lakes Basin, USA

    Science.gov (United States)

    This study applied a phenology-based land-cover classification approach across the Laurentian Great Lakes Basin (GLB) using time-series data consisting of 23 Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) composite images (250 ...

  17. IMPACTS OF PATCH SIZE AND LAND COVER HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY

    Science.gov (United States)

    Landscape characteristics such as small patch size and land cover heterogeneity have been hypothesized to increase the likelihood of miss-classifying pixels during thematic image classification. However, there has been a lack of empirical evidence to support these hypotheses,...

  18. sw-SVM: sensor weighting support vector machines for EEG-based brain-computer interfaces.

    Science.gov (United States)

    Jrad, N; Congedo, M; Phlypo, R; Rousseau, S; Flamary, R; Yger, F; Rakotomamonjy, A

    2011-10-01

    In many machine learning applications, like brain-computer interfaces (BCI), high-dimensional sensor array data are available. Sensor measurements are often highly correlated and signal-to-noise ratio is not homogeneously spread across sensors. Thus, collected data are highly variable and discrimination tasks are challenging. In this work, we focus on sensor weighting as an efficient tool to improve the classification procedure. We present an approach integrating sensor weighting in the classification framework. Sensor weights are considered as hyper-parameters to be learned by a support vector machine (SVM). The resulting sensor weighting SVM (sw-SVM) is designed to satisfy a margin criterion, that is, the generalization error. Experimental studies on two data sets are presented, a P300 data set and an error-related potential (ErrP) data set. For the P300 data set (BCI competition III), for which a large number of trials is available, the sw-SVM proves to perform equivalently with respect to the ensemble SVM strategy that won the competition. For the ErrP data set, for which a small number of trials are available, the sw-SVM shows superior performances as compared to three state-of-the art approaches. Results suggest that the sw-SVM promises to be useful in event-related potentials classification, even with a small number of training trials.

  19. Impact of geometry and viewing angle on classification accuracy of 2D based analysis of dysmorphic faces.

    Science.gov (United States)

    Vollmar, Tobias; Maus, Baerbel; Wurtz, Rolf P; Gillessen-Kaesbach, Gabriele; Horsthemke, Bernhard; Wieczorek, Dagmar; Boehringer, Stefan

    2008-01-01

    Digital image analysis of faces has been demonstrated to be effective in a small number of syndromes. In this paper we investigate several aspects that help bringing these methods closer to clinical application. First, we investigate the impact of increasing the number of syndromes from 10 to 14 as compared to an earlier study. Second, we include a side-view pose into the analysis and third, we scrutinize the effect of geometry information. Picture analysis uses a Gabor wavelet transform, standardization of landmark coordinates and subsequent statistical analysis. We can demonstrate that classification accuracy drops from 76% for 10 syndromes to 70% for 14 syndromes for frontal images. Including side-views achieves an accuracy of 76% again. Geometry performs excellently with 85% for combined poses. Combination of wavelets and geometry for both poses increases accuracy to 93%. In conclusion, a larger number of syndromes can be handled effectively by means of image analysis.

  20. A support vector machine (SVM) based voltage stability classifier

    Energy Technology Data Exchange (ETDEWEB)

    Dosano, R.D.; Song, H. [Kunsan National Univ., Kunsan, Jeonbuk (Korea, Republic of); Lee, B. [Korea Univ., Seoul (Korea, Republic of)

    2007-07-01

    Power system stability has become even more complex and critical with the advent of deregulated energy markets and the growing desire to completely employ existing transmission and infrastructure. The economic pressure on electricity markets forces the operation of power systems and components to their limit of capacity and performance. System conditions can be more exposed to instability due to greater uncertainty in day to day system operations and increase in the number of potential components for system disturbances potentially resulting in voltage stability. This paper proposed a support vector machine (SVM) based power system voltage stability classifier using local measurements of voltage and active power of load. It described the procedure for fast classification of long-term voltage stability using the SVM algorithm. The application of the SVM based voltage stability classifier was presented with reference to the choice of input parameters; input data preconditioning; moving window for feature vector; determination of learning samples; and other considerations in SVM applications. The paper presented a case study with numerical examples of an 11-bus test system. The test results for the feasibility study demonstrated that the classifier could offer an excellent performance in classification with time-series measurements in terms of long-term voltage stability. 9 refs., 14 figs.

  1. Atterberg Limits Prediction Comparing SVM with ANFIS Model

    Directory of Open Access Journals (Sweden)

    Mohammad Murtaza Sherzoy

    2017-03-01

    Full Text Available Support Vector Machine (SVM and Adaptive Neuro-Fuzzy inference Systems (ANFIS both analytical methods are used to predict the values of Atterberg limits, such as the liquid limit, plastic limit and plasticity index. The main objective of this study is to make a comparison between both forecasts (SVM & ANFIS methods. All data of 54 soil samples are used and taken from the area of Peninsular Malaysian and tested for different parameters containing liquid limit, plastic limit, plasticity index and grain size distribution and were. The input parameter used in for this case are the fraction of grain size distribution which are the percentage of silt, clay and sand. The actual and predicted values of Atterberg limit which obtained from the SVM and ANFIS models are compared by using the correlation coefficient R2 and root mean squared error (RMSE value.  The outcome of the study show that the ANFIS model shows higher accuracy than SVM model for the liquid limit (R2 = 0.987, plastic limit (R2 = 0.949 and plastic index (R2 = 0966. RMSE value that obtained for both methods have shown that the ANFIS model has represent the best performance than SVM model to predict the Atterberg Limits as a whole.

  2. Improvement of the classification accuracy in discriminating diabetic retinopathy by multifocal electroretinogram analysis

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    The multifocal electroretinogram (mfERG) is a newly developed electrophysiological technique. In this paper, a classification method is proposed for early diagnosis of the diabetic retinopathy using mfERG data. MfERG records were obtained from eyes of healthy individuals and patients with diabetes at different stages. For each mfERG record, 103 local responses were extracted. Amplitude value of each point on all the mfERG local responses was looked as one potential feature to classify the experimental subjects. Feature subsets were selected from the feature space by comparing the inter-intra distance. Based on the selected feature subset, Fisher's linear classifiers were trained. And the final classification decision of the record was made by voting all the classifiers' outputs. Applying the method to classify all experimental subjects, very low error rates were achieved. Some crucial properties of the diabetic retinopathy classification method are also discussed.

  3. A DDoS Attack Detection Method Based on SVM in Software Defined Network

    Directory of Open Access Journals (Sweden)

    Jin Ye

    2018-01-01

    Full Text Available The detection of DDoS attacks is an important topic in the field of network security. The occurrence of software defined network (SDN (Zhang et al., 2018 brings up some novel methods to this topic in which some deep learning algorithm is adopted to model the attack behavior based on collecting from the SDN controller. However, the existing methods such as neural network algorithm are not practical enough to be applied. In this paper, the SDN environment by mininet and floodlight (Ning et al., 2014 simulation platform is constructed, 6-tuple characteristic values of the switch flow table is extracted, and then DDoS attack model is built by combining the SVM classification algorithms. The experiments show that average accuracy rate of our method is 95.24% with a small amount of flow collecting. Our work is of good value for the detection of DDoS attack in SDN.

  4. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for Binary Classification) of training points. However, sometimes there are some less meaningful samples amongst training points, which are corrupted by noises or misplaced in wrong side, called outliers. These outliers are affecting on margin and classification performance, and machine should better to discard them. SVM as a popular and widely used cl...

  5. The 2nu-SVM: A Cost-Sensitive Extension of the nu-SVM

    National Research Council Canada - National Science Library

    Davenport, Mark A

    2005-01-01

    .... In this report we review cost-sensitive extensions of standard support vector machines (SVMs). In particular, we describe cost-sensitive extensions of the C-SVM and the nu-SVM, which we denote the 2C-SVM and 2nu-SVM respectively...

  6. Intrusion detection model using fusion of chi-square feature selection and multi class SVM

    Directory of Open Access Journals (Sweden)

    Ikram Sumaiya Thaseen

    2017-10-01

    Full Text Available Intrusion detection is a promising area of research in the domain of security with the rapid development of internet in everyday life. Many intrusion detection systems (IDS employ a sole classifier algorithm for classifying network traffic as normal or abnormal. Due to the large amount of data, these sole classifier models fail to achieve a high attack detection rate with reduced false alarm rate. However by applying dimensionality reduction, data can be efficiently reduced to an optimal set of attributes without loss of information and then classified accurately using a multi class modeling technique for identifying the different network attacks. In this paper, we propose an intrusion detection model using chi-square feature selection and multi class support vector machine (SVM. A parameter tuning technique is adopted for optimization of Radial Basis Function kernel parameter namely gamma represented by ‘ϒ’ and over fitting constant ‘C’. These are the two important parameters required for the SVM model. The main idea behind this model is to construct a multi class SVM which has not been adopted for IDS so far to decrease the training and testing time and increase the individual classification accuracy of the network attacks. The investigational results on NSL-KDD dataset which is an enhanced version of KDDCup 1999 dataset shows that our proposed approach results in a better detection rate and reduced false alarm rate. An experimentation on the computational time required for training and testing is also carried out for usage in time critical applications.

  7. The impact of catchment source group classification on the accuracy of sediment fingerprinting outputs.

    Science.gov (United States)

    Pulley, Simon; Foster, Ian; Collins, Adrian L

    2017-06-01

    The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) was primarily attributed to the increased inter-group variability producing a far larger sediment source signal that the non-conservatism noise (1). Modified cluster analysis

  8. Classification of Autism Spectrum Disorder Using Random Support Vector Machine Cluster

    Directory of Open Access Journals (Sweden)

    Xia-an Bi

    2018-02-01

    Full Text Available Autism spectrum disorder (ASD is mainly reflected in the communication and language barriers, difficulties in social communication, and it is a kind of neurological developmental disorder. Most researches have used the machine learning method to classify patients and normal controls, among which support vector machines (SVM are widely employed. But the classification accuracy of SVM is usually low, due to the usage of a single SVM as classifier. Thus, we used multiple SVMs to classify ASD patients and typical controls (TC. Resting-state functional magnetic resonance imaging (fMRI data of 46 TC and 61 ASD patients were obtained from the Autism Brain Imaging Data Exchange (ABIDE database. Only 84 of 107 subjects are utilized in experiments because the translation or rotation of 7 TC and 16 ASD patients has surpassed ±2 mm or ±2°. Then the random SVM cluster was proposed to distinguish TC and ASD. The results show that this method has an excellent classification performance based on all the features. Furthermore, the accuracy based on the optimal feature set could reach to 96.15%. Abnormal brain regions could also be found, such as inferior frontal gyrus (IFG (orbital and opercula part, hippocampus, and precuneus. It is indicated that the method of random SVM cluster may apply to the auxiliary diagnosis of ASD.

  9. A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals.

    Science.gov (United States)

    Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian

    2014-06-27

    Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosis respiratory pathologies using respiratory sounds from R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.

  10. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    Science.gov (United States)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decisionmaking regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracy of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care

  11. Stellar Spectral Classification with Minimum Within-Class and ...

    Indian Academy of Sciences (India)

    spectral classification methods, and it is widely used in practice. But its ... Digital Sky Survey (SDSS) show that MMSVM performs better than. SVM. Key words. ... to feature extraction, and then the traditional classifier SVM is used to classify the.

  12. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    Science.gov (United States)

    Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish the pancreatic cancer classifier. Firstly, the concept of quantum and fruit fly optimal algorithm (FOA) are introduced, respectively. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of support vector machine (SVM) and the classifier is established by optimized SVM. In order to verify the effectiveness of the proposed method, SVM and other classification methods have been chosen as the comparing methods. The experimental results show that the proposed method can improve the classifier performance and cost less time. PMID:26543867

  13. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    Science.gov (United States)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.

  14. Improving ECG classification accuracy using an ensemble of neural network modules.

    Directory of Open Access Journals (Sweden)

    Mehrdad Javadi

    Full Text Available This paper illustrates the use of a combined neural network model based on Stacked Generalization method for classification of electrocardiogram (ECG beats. In conventional Stacked Generalization method, the combiner learns to map the base classifiers' outputs to the target data. We claim adding the input pattern to the base classifiers' outputs helps the combiner to obtain knowledge about the input space and as the result, performs better on the same task. Experimental results support our claim that the additional knowledge according to the input space, improves the performance of the proposed method which is called Modified Stacked Generalization. In particular, for classification of 14966 ECG beats that were not previously seen during training phase, the Modified Stacked Generalization method reduced the error rate for 12.41% in comparison with the best of ten popular classifier fusion methods including Max, Min, Average, Product, Majority Voting, Borda Count, Decision Templates, Weighted Averaging based on Particle Swarm Optimization and Stacked Generalization.

  15. Comparative analysis of Worldview-2 and Landsat 8 for coastal saltmarsh mapping accuracy assessment

    Science.gov (United States)

    Rasel, Sikdar M. M.; Chang, Hsing-Chung; Diti, Israt Jahan; Ralph, Tim; Saintilan, Neil

    2016-05-01

    Coastal saltmarsh and their constituent components and processes are of an interest scientifically due to their ecological function and services. However, heterogeneity and seasonal dynamic of the coastal wetland system makes it challenging to map saltmarshes with remotely sensed data. This study selected four important saltmarsh species Pragmitis australis, Sporobolus virginicus, Ficiona nodosa and Schoeloplectus sp. as well as a Mangrove and Pine tree species, Avecinia and Casuarina sp respectively. High Spatial Resolution Worldview-2 data and Coarse Spatial resolution Landsat 8 imagery were selected in this study. Among the selected vegetation types some patches ware fragmented and close to the spatial resolution of Worldview-2 data while and some patch were larger than the 30 meter resolution of Landsat 8 data. This study aims to test the effectiveness of different classifier for the imagery with various spatial and spectral resolutions. Three different classification algorithm, Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM) and Artificial Neural Network (ANN) were tested and compared with their mapping accuracy of the results derived from both satellite imagery. For Worldview-2 data SVM was giving the higher overall accuracy (92.12%, kappa =0.90) followed by ANN (90.82%, Kappa 0.89) and MLC (90.55%, kappa = 0.88). For Landsat 8 data, MLC (82.04%) showed the highest classification accuracy comparing to SVM (77.31%) and ANN (75.23%). The producer accuracy of the classification results were also presented in the paper.

  16. Sales Growth Rate Forecasting Using Improved PSO and SVM

    Directory of Open Access Journals (Sweden)

    Xibin Wang

    2014-01-01

    Full Text Available Accurate forecast of the sales growth rate plays a decisive role in determining the amount of advertising investment. In this study, we present a preclassification and later regression based method optimized by improved particle swarm optimization (IPSO for sales growth rate forecasting. We use support vector machine (SVM as a classification model. The nonlinear relationship in sales growth rate forecasting is efficiently represented by SVM, while IPSO is optimizing the training parameters of SVM. IPSO addresses issues of traditional PSO, such as relapsing into local optimum, slow convergence speed, and low convergence precision in the later evolution. We performed two experiments; firstly, three classic benchmark functions are used to verify the validity of the IPSO algorithm against PSO. Having shown IPSO outperform PSO in convergence speed, precision, and escaping local optima, in our second experiment, we apply IPSO to the proposed model. The sales growth rate forecasting cases are used to testify the forecasting performance of proposed model. According to the requirements and industry knowledge, the sample data was first classified to obtain types of the test samples. Next, the values of the test samples were forecast using the SVM regression algorithm. The experimental results demonstrate that the proposed model has good forecasting performance.

  17. Segmentasi Citra menggunakan Support Vector Machine (SVM dan Ellipsoid Region Search Strategy (ERSS Arimoto Entropy berdasarkan Ciri Warna dan Tekstur

    Directory of Open Access Journals (Sweden)

    Lukman Hakim

    2016-02-01

    Full Text Available Abstrak Segmentasi citra merupakan suatu metode penting dalam pengolahan citra digital yang bertujuan membagi citra menjadi beberapa region yang homogen berdasarkan kriteria kemiripan tertentu. Salah satu syarat utama yang harus dimiliki suatu metode segmentasi citra yaitu menghasilkan citra boundary yang optimal.Untuk memenuhi syarat tersebut suatu metode segmentasi membutuhkan suatu klasifikasi piksel citra yang dapat memisahkan piksel secara linier dan non-linear. Pada penelitian ini, penulis mengusulkan metode segmentasi citra menggunakan SVM dan entropi Arimoto berbasis ERSS sehingga tahan terhadap derau dan mempunyai kompleksitas yang rendah untuk menghasilkan citra boundary yang optimal. Pertama, ekstraksi ciri warna dengan local homogeneity dan ciri tekstur dengan menggunakan Gray Level Co-occurrence Matrix (GLCM yang menghasilkan beberapa fitur. Kedua, pelabelan dengan Arimoto berbasis ERSS yang digunakan sebagai kelas dalam klasifikasi. Ketiga, hasil ekstraksi fitur dan training kemudian diklasifikasi berdasarkan label dengan SVM yang telah di-training. Dari percobaan yang dilakukan menunjukkan hasil segmentasi kurang optimal dengan akurasi 69 %. Reduksi fitur perlu dilakukan untuk menghasilkan citra yang tersegmentasi dengan baik. Kata kunci: segmentasi citra, support vector machine, ERSS Arimoto Entropy, ekstraksi ciri. Abstract Image segmentation is an important tool in image processing that divides an image into homogeneous regions based on certain similarity criteria, which ideally should be meaning-full for a certain purpose. Optimal boundary is one of the main criteria that an image segmentation method should has. A classification method that can partitions pixel linearly or non-linearly is needed by an image segmentation method. We propose a color image segmentation using Support Vector Machine (SVM classification and ERSS Arimoto entropy thresholding to get optimal boundary of segmented image that noise-free and low complexity

  18. Linear Discriminant Analysis achieves high classification accuracy for the BOLD fMRI response to naturalistic movie stimuli.

    Directory of Open Access Journals (Sweden)

    Hendrik eMandelkow

    2016-03-01

    Full Text Available Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI. However, conventional fMRI analysis based on statistical parametric mapping (SPM and the general linear model (GLM is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA, have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbour (NN, Gaussian Naïve Bayes (GNB, and (regularised Linear Discriminant Analysis (LDA in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie.Results show that LDA regularised by principal component analysis (PCA achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2s apart during a 300s movie (chance level 0.7% = 2s/300s. The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these

  19. Application of ANFIS and SVM Systems in Order to Estimate Monthly Reference Crop Evapotranspiration in the Northwest of Iran

    Directory of Open Access Journals (Sweden)

    F. Ahmadi

    2016-10-01

    Full Text Available Introduction Crop evapotranspiration modeling process mainly performs with empirical methods, aerodynamic and energy balance. In these methods, the evapotranspiration is calculated based on the average values of meteorological parameters at different time steps. The linear models didn’t have a good performance in this field due to high variability of evapotranspiration and the researchers have turned to the use of nonlinear and intelligent models. For accurate estimation of this hydrologic variable, it should be spending much time and money to measure many data (19. Materials and Methods Recently the new hybrid methods have been developed by combining some of methods such as artificial neural networks, fuzzy logic and evolutionary computation, that called Soft Computing and Intelligent Systems. These soft techniques are used in various fields of engineering. A fuzzy neurosis is a hybrid system that incorporates the decision ability of fuzzy logic with the computational ability of neural network, which provides a high capability for modeling and estimating. Basically, the Fuzzy part is used to classify the input data set and determines the degree of membership (that each number can be laying between 0 and 1 and decisions for the next activity made based on a set of rules and move to the next stage. Adaptive Neuro-Fuzzy Inference Systems (ANFIS includes some parts of a typical fuzzy expert system which the calculations at each step is performed by the hidden layer neurons and the learning ability of the neural network has been created to increase the system information (9. SVM is a one of supervised learning methods which used for classification and regression affairs. This method was developed by Vapink (15 based on statistical learning theory. The SVM is a method for binary classification in an arbitrary characteristic space, so it is suitable for prediction problems (12. The SVM is originally a two-class Classifier that separates the classes

  20. Large margin classification with indefinite similarities

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2016-01-07

    Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer condition is not satisfied, or the Mercer condition is difficult to verify. Examples of such indefinite similarities in machine learning applications are ample including, for instance, the BLAST similarity score between protein sequences, human-judged similarities between concepts and words, and the tangent distance or the shape matching distance in computer vision. Nevertheless, previous works on classification with indefinite similarities are not fully satisfactory. They have either introduced sources of inconsistency in handling past and future examples using kernel approximation, settled for local-minimum solutions using non-convex optimization, or produced non-sparse solutions by learning in Krein spaces. Despite the large volume of research devoted to this subject lately, we demonstrate in this paper how an old idea, namely the 1-norm support vector machine (SVM) proposed more than 15 years ago, has several advantages over more recent work. In particular, the 1-norm SVM method is conceptually simpler, which makes it easier to implement and maintain. It is competitive, if not superior to, all other methods in terms of predictive accuracy. Moreover, it produces solutions that are often sparser than more recent methods by several orders of magnitude. In addition, we provide various theoretical justifications by relating 1-norm SVM to well-established learning algorithms such as neural networks, SVM, and nearest neighbor classifiers. Finally, we conduct a thorough experimental evaluation, which reveals that the evidence in favor of 1-norm SVM is statistically significant.

  1. SVM prediction of ligand-binding sites in bacterial lipoproteins employing shape and physio-chemical descriptors.

    Science.gov (United States)

    Kadam, Kiran; Prabhakar, Prashant; Jayaraman, V K

    2012-11-01

    Bacterial lipoproteins play critical roles in various physiological processes including the maintenance of pathogenicity and numbers of them are being considered as potential candidates for generating novel vaccines. In this work, we put forth an algorithm to identify and predict ligand-binding sites in bacterial lipoproteins. The method uses three types of pocket descriptors, namely fpocket descriptors, 3D Zernike descriptors and shell descriptors, and combines them with Support Vector Machine (SVM) method for the classification. The three types of descriptors represent shape-based properties of the pocket as well as its local physio-chemical features. All three types of descriptors, along with their hybrid combinations are evaluated with SVM and to improve classification performance, WEKA-InfoGain feature selection is applied. Results obtained in the study show that the classifier successfully differentiates between ligand-binding and non-binding pockets. For the combination of three types of descriptors, 10 fold cross-validation accuracy of 86.83% is obtained for training while the selected model achieved test Matthews Correlation Coefficient (MCC) of 0.534. Individually or in combination with new and existing methods, our model can be a very useful tool for the prediction of potential ligand-binding sites in bacterial lipoproteins.

  2. Cancer survival classification using integrated data sets and intermediate information.

    Science.gov (United States)

    Kim, Shinuk; Park, Taesung; Kon, Mark

    2014-09-01

    Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggested a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with Cox proportional hazard regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides us with intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers, first, existing ML methods, support vector machine (SVM) and random forest (RF), second, a new median-based classifier using FSCOX (FSCOX_median), and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from miRNAs and mRNAs profiles (IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM with a balanced 22 short-term and 22 long-term survivor data set. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS

  3. Hybrid Optimization of Object-Based Classification in High-Resolution Images Using Continous ANT Colony Algorithm with Emphasis on Building Detection

    Science.gov (United States)

    Tamimi, E.; Ebadi, H.; Kiani, A.

    2017-09-01

    Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object). These showed the superiority of the proposed method in terms of time and accuracy.

  4. HYBRID OPTIMIZATION OF OBJECT-BASED CLASSIFICATION IN HIGH-RESOLUTION IMAGES USING CONTINOUS ANT COLONY ALGORITHM WITH EMPHASIS ON BUILDING DETECTION

    Directory of Open Access Journals (Sweden)

    E. Tamimi

    2017-09-01

    Full Text Available Automatic building detection from High Spatial Resolution (HSR images is one of the most important issues in Remote Sensing (RS. Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object. These showed the superiority of the proposed method in terms of time and accuracy.

  5. Stationary Wavelet Transform and AdaBoost with SVM Based Pathological Brain Detection in MRI Scanning.

    Science.gov (United States)

    Nayak, Deepak Ranjan; Dash, Ratnakar; Majhi, Banshidhar

    2017-01-01

    This paper presents an automatic classification system for segregating pathological brain from normal brains in magnetic resonance imaging scanning. The proposed system employs contrast limited adaptive histogram equalization scheme to enhance the diseased region in brain MR images. Two-dimensional stationary wavelet transform is harnessed to extract features from the preprocessed images. The feature vector is constructed using the energy and entropy values, computed from the level- 2 SWT coefficients. Then, the relevant and uncorrelated features are selected using symmetric uncertainty ranking filter. Subsequently, the selected features are given input to the proposed AdaBoost with support vector machine classifier, where SVM is used as the base classifier of AdaBoost algorithm. To validate the proposed system, three standard MR image datasets, Dataset-66, Dataset-160, and Dataset- 255 have been utilized. The 5 runs of k-fold stratified cross validation results indicate the suggested scheme offers better performance than other existing schemes in terms of accuracy and number of features. The proposed system earns ideal classification over Dataset-66 and Dataset-160; whereas, for Dataset- 255, an accuracy of 99.45% is achieved. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  6. Analysis And Voice Recognition In Indonesian Language Using MFCC And SVM Method

    Directory of Open Access Journals (Sweden)

    Harvianto Harvianto

    2016-06-01

    Full Text Available Voice recognition technology is one of biometric technology. Sound is a unique part of the human being which made an individual can be easily distinguished one from another. Voice can also provide information such as gender, emotion, and identity of the speaker. This research will record human voices that pronounce digits between 0 and 9 with and without noise. Features of this sound recording will be extracted using Mel Frequency Cepstral Coefficient (MFCC. Mean, standard deviation, max, min, and the combination of them will be used to construct the feature vectors. This feature vectors then will be classified using Support Vector Machine (SVM. There will be two classification models. The first one is based on the speaker and the other one based on the digits pronounced. The classification model then will be validated by performing 10-fold cross-validation.The best average accuracy from two classification model is 91.83%. This result achieved using Mean + Standard deviation + Min + Max as features.

  7. Non-invasive classification of severe sepsis and systemic inflammatory response syndrome using a nonlinear support vector machine: a preliminary study

    International Nuclear Information System (INIS)

    Tang, Collin H H; Savkin, Andrey V; Chan, Gregory S H; Middleton, Paul M; Bishop, Sarah; Lovell, Nigel H

    2010-01-01

    Sepsis has been defined as the systemic response to infection in critically ill patients, with severe sepsis and septic shock representing increasingly severe stages of the same disease. Based on the non-invasive cardiovascular spectrum analysis, this paper presents a pilot study on the potential use of the nonlinear support vector machine (SVM) in the classification of the sepsis continuum into severe sepsis and systemic inflammatory response syndrome (SIRS) groups. 28 consecutive eligible patients attending the emergency department with presumptive diagnoses of sepsis syndrome have participated in this study. Through principal component analysis (PCA), the first three principal components were used to construct the SVM feature space. The SVM classifier with a fourth-order polynomial kernel was found to have a better overall performance compared with the other SVM classifiers, showing the following classification results: sensitivity = 94.44%, specificity = 62.50%, positive predictive value = 85.00%, negative predictive value = 83.33% and accuracy = 84.62%. Our classification results suggested that the combinatory use of cardiovascular spectrum analysis and the proposed SVM classification of autonomic neural activity is a potentially useful clinical tool to classify the sepsis continuum into two distinct pathological groups of varying sepsis severity

  8. Support vector machine for breast cancer classification using diffusion-weighted MRI histogram features: Preliminary study.

    Science.gov (United States)

    Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik

    2018-05-01

    Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with echo time / repetition time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER + tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance to account for heterogeneity. Our

  9. Face Verification using MLP and SVM

    OpenAIRE

    Cardinaux, Fabien; Marcel, Sébastien

    2002-01-01

    The performance of machine learning algorithms has steadily improved over the past few years, such as MLP or more recently SVM. In this paper, we compare two successful discriminant machine learning algorithms apply to the problem of face verification: MLP and SVM. These two algorithms are tested on a benchmark database, namely XM2VTS. Results show that a MLP is better than a SVM on this particular task.

  10. Emotional State Classification in Virtual Reality Using Wearable Electroencephalography

    Science.gov (United States)

    Suhaimi, N. S.; Teo, J.; Mountstephens, J.

    2018-03-01

    This paper presents the classification of emotions on EEG signals. One of the key issues in this research is the lack of mental classification using VR as the medium to stimulate emotion. The approach towards this research is by using K-nearest neighbor (KNN) and Support Vector Machine (SVM). Firstly, each of the participant will be required to wear the EEG headset and recording their brainwaves when they are immersed inside the VR. The data points are then marked if they showed any physical signs of emotion or by observing the brainwave pattern. Secondly, the data will then be tested and trained with KNN and SVM algorithms. The accuracy achieved from both methods were approximately 82% throughout the brainwave spectrum (α, β, γ, δ, θ). These methods showed promising results and will be further enhanced using other machine learning approaches in VR stimulus.

  11. Multi-view L2-SVM and its multi-view core vector machine.

    Science.gov (United States)

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Basic visual dysfunction allows classification of patients with schizophrenia with exceptional accuracy.

    Science.gov (United States)

    González-Hernández, J A; Pita-Alcorta, C; Padrón, A; Finalé, A; Galán, L; Martínez, E; Díaz-Comas, L; Samper-González, J A; Lencer, R; Marot, M

    2014-10-01

    Basic visual dysfunctions are commonly reported in schizophrenia; however their value as diagnostic tools remains uncertain. This study reports a novel electrophysiological approach using checkerboard visual evoked potentials (VEP). Sources of spectral resolution VEP-components C1, P1 and N1 were estimated by LORETA, and the band-effects (BSE) on these estimated sources were explored in each subject. BSEs were Z-transformed for each component and relationships with clinical variables were assessed. Clinical effects were evaluated by ROC-curves and predictive values. Forty-eight patients with schizophrenia (SZ) and 55 healthy controls participated in the study. For each of the 48 patients, the three VEP components were localized to both dorsal and ventral brain areas and also deviated from a normal distribution. P1 and N1 deviations were independent of treatment, illness chronicity or gender. Results from LORETA also suggest that deficits in thalamus, posterior cingulum, precuneus, superior parietal and medial occipitotemporal areas were associated with symptom severity. While positive symptoms were more strongly related to sensory processing deficits (P1), negative symptoms were more strongly related to perceptual processing dysfunction (N1). Clinical validation revealed positive and negative predictive values for correctly classifying SZ of 100% and 77%, respectively. Classification in an additional independent sample of 30 SZ corroborated these results. In summary, this novel approach revealed basic visual dysfunctions in all patients with schizophrenia, suggesting these visual dysfunctions represent a promising candidate as a biomarker for schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnacus (1707-1778), who is…

  14. Human Walking Pattern Recognition Based on KPCA and SVM with Ground Reflex Pressure Signal

    Directory of Open Access Journals (Sweden)

    Zhaoqin Peng

    2013-01-01

    Full Text Available Algorithms based on the ground reflex pressure (GRF signal obtained from a pair of sensing shoes for human walking pattern recognition were investigated. The dimensionality reduction algorithms based on principal component analysis (PCA and kernel principal component analysis (KPCA for walking pattern data compression were studied in order to obtain higher recognition speed. Classifiers based on support vector machine (SVM, SVM-PCA, and SVM-KPCA were designed, and the classification performances of these three kinds of algorithms were compared using data collected from a person who was wearing the sensing shoes. Experimental results showed that the algorithm fusing SVM and KPCA had better recognition performance than the other two methods. Experimental outcomes also confirmed that the sensing shoes developed in this paper can be employed for automatically recognizing human walking pattern in unlimited environments which demonstrated the potential application in the control of exoskeleton robots.

  15. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Balachandran Manavalan

    2018-03-01

    Full Text Available Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  16. Solution Path for Pin-SVM Classifiers With Positive and Negative $\\tau $ Values.

    Science.gov (United States)

    Huang, Xiaolin; Shi, Lei; Suykens, Johan A K

    2017-07-01

    Applying the pinball loss in a support vector machine (SVM) classifier results in pin-SVM. The pinball loss is characterized by a parameter τ . Its value is related to the quantile level and different τ values are suitable for different problems. In this paper, we establish an algorithm to find the entire solution path for pin-SVM with different τ values. This algorithm is based on the fact that the optimal solution to pin-SVM is continuous and piecewise linear with respect to τ . We also show that the nonnegativity constraint on τ is not necessary, i.e., τ can be extended to negative values. First, in some applications, a negative τ leads to better accuracy. Second, τ = -1 corresponds to a simple solution that links SVM and the classical kernel rule. The solution for τ = -1 can be obtained directly and then be used as a starting point of the solution path. The proposed method efficiently traverses τ values through the solution path, and then achieves good performance by a suitable τ . In particular, τ = 0 corresponds to C-SVM, meaning that the traversal algorithm can output a result at least as good as C-SVM with respect to validation error.

  17. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine.

    Science.gov (United States)

    Manavalan, Balachandran; Shin, Tae H; Lee, Gwang

    2018-01-01

    Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  18. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  19. One-Dimensional Convolutional Neural Network Land-Cover Classification of Multi-Seasonal Hyperspectral Imagery in the San Francisco Bay Area, California

    Directory of Open Access Journals (Sweden)

    Daniel Guidici

    2017-06-01

    Full Text Available In this study, a 1-D Convolutional Neural Network (CNN architecture was developed, trained and utilized to classify single (summer and three seasons (spring, summer, fall of hyperspectral imagery over the San Francisco Bay Area, California for the year 2015. For comparison, the Random Forests (RF and Support Vector Machine (SVM classifiers were trained and tested with the same data. In order to support space-based hyperspectral applications, all analyses were performed with simulated Hyperspectral Infrared Imager (HyspIRI imagery. Three-season data improved classifier overall accuracy by 2.0% (SVM, 1.9% (CNN to 3.5% (RF over single-season data. The three-season CNN provided an overall classification accuracy of 89.9%, which was comparable to overall accuracy of 89.5% for SVM. Both three-season CNN and SVM outperformed RF by over 7% overall accuracy. Analysis and visualization of the inner products for the CNN provided insight to distinctive features within the spectral-temporal domain. A method for CNN kernel tuning was presented to assess the importance of learned features. We concluded that CNN is a promising candidate for hyperspectral remote sensing applications because of the high classification accuracy and interpretability of its inner products.

  20. Improving supervised classification accuracy using non-rigid multimodal image registration: detecting prostate cancer

    Science.gov (United States)

    Chappelow, Jonathan; Viswanath, Satish; Monaco, James; Rosen, Mark; Tomaszewski, John; Feldman, Michael; Madabhushi, Anant

    2008-03-01

    Computer-aided diagnosis (CAD) systems for the detection of cancer in medical images require precise labeling of training data. For magnetic resonance (MR) imaging (MRI) of the prostate, training labels define the spatial extent of prostate cancer (CaP); the most common source for these labels is expert segmentations. When ancillary data such as whole mount histology (WMH) sections, which provide the gold standard for cancer ground truth, are available, the manual labeling of CaP can be improved by referencing WMH. However, manual segmentation is error prone, time consuming and not reproducible. Therefore, we present the use of multimodal image registration to automatically and accurately transcribe CaP from histology onto MRI following alignment of the two modalities, in order to improve the quality of training data and hence classifier performance. We quantitatively demonstrate the superiority of this registration-based methodology by comparing its results to the manual CaP annotation of expert radiologists. Five supervised CAD classifiers were trained using the labels for CaP extent on MRI obtained by the expert and 4 different registration techniques. Two of the registration methods were affi;ne schemes; one based on maximization of mutual information (MI) and the other method that we previously developed, Combined Feature Ensemble Mutual Information (COFEMI), which incorporates high-order statistical features for robust multimodal registration. Two non-rigid schemes were obtained by succeeding the two affine registration methods with an elastic deformation step using thin-plate splines (TPS). In the absence of definitive ground truth for CaP extent on MRI, classifier accuracy was evaluated against 7 ground truth surrogates obtained by different combinations of the expert and registration segmentations. For 26 multimodal MRI-WMH image pairs, all four registration methods produced a higher area under the receiver operating characteristic curve compared to that

  1. Classification Accuracy of a Wearable Activity Tracker for Assessing Sedentary Behavior and Physical Activity in 3–5-Year-Old Children

    Directory of Open Access Journals (Sweden)

    Wonwoo Byun

    2018-03-01

    Full Text Available This study examined the accuracy of the Fitbit activity tracker (FF for quantifying sedentary behavior (SB and varying intensities of physical activity (PA in 3–5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years wore the FF and were directly observed while performing a set of various unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA, moderate-to-vigorous PA (MVPA, and total PA (TPA was examined calculating Pearson correlation coefficients (r, mean absolute percent error (MAPE, Cohen’s kappa (k, sensitivity (Se, specificity (Sp, and area under the receiver operating curve (ROC-AUC. The classification accuracies of the FF (ROC-AUC were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered as a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.

  2. Application of SVM methods for mid-term load forecasting

    Directory of Open Access Journals (Sweden)

    Božić Miloš

    2011-01-01

    Full Text Available This paper presents an approach for the medium-term load forecasting using Support Vector Machines (SVMs. The proposed SVM model was employed to predict the maximum daily load demand for the period of a month. Analyses of available data were performed and the most important features for the construction of SVM model are selected. It was shown that the size and the structure of the training set may significantly affect the accuracy of predictions. The presented model was tested by applying it on real-life load data obtained from distribution company 'ED Jugoistok' for the territory of city Niš and its surroundings. Experimental results show that the proposed approach gives acceptable results for the entire period of prediction, which are in range with other solutions in this area.

  3. Extended SVM algorithms for multilevel trans-Z-source inverter

    Directory of Open Access Journals (Sweden)

    Aida Baghbany Oskouei

    2016-03-01

    Full Text Available This paper suggests extended algorithms for multilevel trans-Z-source inverter. These algorithms are based on space vector modulation (SVM, which works with high switching frequency and does not generate the mean value of the desired load voltage in every switching interval. In this topology the output voltage is not limited to dc voltage source similar to traditional cascaded multilevel inverter and can be increased with trans-Z-network shoot-through state control. Besides, it is more reliable against short circuit, and due to several number of dc sources in each phase of this topology, it is possible to use it in hybrid renewable energy. Proposed SVM algorithms include the following: Combined modulation algorithm (SVPWM and shoot-through implementation in dwell times of voltage vectors algorithm. These algorithms are compared from viewpoint of simplicity, accuracy, number of switching, and THD. Simulation and experimental results are presented to demonstrate the expected representations.

  4. An SVM-Based Classifier for Estimating the State of Various Rotating Components in Agro-Industrial Machinery with a Vibration Signal Acquired from a Single Point on the Machine Chassis

    Directory of Open Access Journals (Sweden)

    Ruben Ruiz-Gonzalez

    2014-11-01

    Full Text Available The goal of this article is to assess the feasibility of estimating the state of various rotating components in agro-industrial machinery by employing just one vibration signal acquired from a single point on the machine chassis. To do so, a Support Vector Machine (SVM-based system is employed. Experimental tests evaluated this system by acquiring vibration data from a single point of an agricultural harvester, while varying several of its working conditions. The whole process included two major steps. Initially, the vibration data were preprocessed through twelve feature extraction algorithms, after which the Exhaustive Search method selected the most suitable features. Secondly, the SVM-based system accuracy was evaluated by using Leave-One-Out cross-validation, with the selected features as the input data. The results of this study provide evidence that (i accurate estimation of the status of various rotating components in agro-industrial machinery is possible by processing the vibration signal acquired from a single point on the machine structure; (ii the vibration signal can be acquired with a uniaxial accelerometer, the orientation of which does not significantly affect the classification accuracy; and, (iii when using an SVM classifier, an 85% mean cross-validation accuracy can be reached, which only requires a maximum of seven features as its input, and no significant improvements are noted between the use of either nonlinear or linear kernels.

  5. Research the Impacts of Classification Accuracy after Orthorectification with Different Grid Density DSM/DEM%不同格网密度的DSM/DEM对影像分类精度的影响研究

    Institute of Scientific and Technical Information of China (English)

    刘晓宏; 雷兵; 谭海; 郭建华

    2017-01-01

    DSM/DEM elevation data are used as assistant data to eliminate or limit deformation of terrain in orthorectification without control points .However , the grid density of DSM/DEM has different effect on subsequent processing , such as image classification . Based on this problem , we apply ChinaDSM 15 m DSM, ASTER GDEM 30 m DEM and SRTM 90 m DEM to do orthorectification on ZY-3 image.Then, classifying the orthorectified image by support vector machines (SVM), and comparing the classification accura-cy.It is shown that the classification accuracy after ChinaDSM 15 m DSM orthorectificated , with the same resample method ,is better than ASTER GDEM 30 m DEM and SRTM 90 m DEM.%在无控制点的卫星影像正射校正中,大多采用DSM/DEM数据作为辅助数据来消除或限制因地形起伏引起的形变,然而经不同格网密度的DSM/DEM正射校正后的影像对后续处理会产生不同程度的影响,如对地物分类精度产生影响.针对这一问题,本文分别采用不同的DSM/DEM数据(ChinaDSM 15 m、ASTER GDEM 30 m和SRTM 90 m)对资源三号影像进行正射校正,然后对正射校正后影像利用支持向量机进行分类,比较正射校正后影像结果的分类精度.结果表明:在相同重采样方法下,影像经ChinaDSM 15 m DSM正射校正后结果的分类精度优于ASTER GDEM 30 m DEM和SRTM 90 m DEM.

  6. FEATURE EXTRACTION BASED WAVELET TRANSFORM IN BREAST CANCER DIAGNOSIS USING FUZZY AND NON-FUZZY CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Pelin GORGEL

    2013-01-01

    Full Text Available This study helps to provide a second eye to the expert radiologists for the classification of manually extracted breast masses taken from 60 digital mammıgrams. These mammograms have been acquired from Istanbul University Faculty of Medicine Hospital and have 78 masses. The diagnosis is implemented with pre-processing by using feature extraction based Fast Wavelet Transform (FWT. Afterwards Adaptive Neuro-Fuzzy Inference System (ANFIS based fuzzy subtractive clustering and Support Vector Machines (SVM methods are used for the classification. It is a comparative study which uses these methods respectively. According to the results of the study, ANFIS based subtractive clustering produces ??% while SVM produces ??% accuracy in malignant-benign classification. The results demonstrate that the developed system could help the radiologists for a true diagnosis and decrease the number of the missing cancerous regions or unnecessary biopsies.

  7. An artificial intelligence based improved classification of two-phase flow patterns with feature extracted from acquired images.

    Science.gov (United States)

    Shanthi, C; Pappa, N

    2017-05-01

    Flow pattern recognition is necessary to select design equations for finding operating details of the process and to perform computational simulations. Visual image processing can be used to automate the interpretation of patterns in two-phase flow. In this paper, an attempt has been made to improve the classification accuracy of the flow pattern of gas/ liquid two- phase flow using fuzzy logic and Support Vector Machine (SVM) with Principal Component Analysis (PCA). The videos of six different types of flow patterns namely, annular flow, bubble flow, churn flow, plug flow, slug flow and stratified flow are recorded for a period and converted to 2D images for processing. The textural and shape features extracted using image processing are applied as inputs to various classification schemes namely fuzzy logic, SVM and SVM with PCA in order to identify the type of flow pattern. The results obtained are compared and it is observed that SVM with features reduced using PCA gives the better classification accuracy and computationally less intensive than other two existing schemes. This study results cover industrial application needs including oil and gas and any other gas-liquid two-phase flows. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  8. Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.

    Science.gov (United States)

    Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck

    2018-04-20

    Motor oil classification is important for quality control and the identification of oil adulteration. In thiswork, we propose a simple, rapid, inexpensive and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oils according to their corresponding color histograms. For this, we applied color histogram in different color spaces such as red green blue (RGB), grayscale, and hue saturation intensity (HSI) in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and then were statistically evaluated by using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to binary classification problem using a one-against-all (OAA) approach and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In extension from binary case, despite good performances by the SVM classification model, QDA and LDA provided better results up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the numbers of independent variables for modeling, a principle component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.

  9. Asynchronous data-driven classification of weapon systems

    International Nuclear Information System (INIS)

    Jin, Xin; Mukherjee, Kushal; Gupta, Shalabh; Ray, Asok; Phoha, Shashi; Damarla, Thyagaraju

    2009-01-01

    This communication addresses real-time weapon classification by analysis of asynchronous acoustic data, collected from microphones on a sensor network. The weapon classification algorithm consists of two parts: (i) feature extraction from time-series data using symbolic dynamic filtering (SDF), and (ii) pattern classification based on the extracted features using the language measure (LM) and support vector machine (SVM). The proposed algorithm has been tested on field data, generated by firing of two types of rifles. The results of analysis demonstrate high accuracy and fast execution of the pattern classification algorithm with low memory requirements. Potential applications include simultaneous shooter localization and weapon classification with soldier-wearable networked sensors. (rapid communication)

  10. Fusion of Airborne Discrete-Return LiDAR and Hyperspectral Data for Land Cover Classification

    Directory of Open Access Journals (Sweden)

    Shezhou Luo

    2015-12-01

    Full Text Available Accurate land cover classification information is a critical variable for many applications. This study presents a method to classify land cover using the fusion data of airborne discrete return LiDAR (Light Detection and Ranging and CASI (Compact Airborne Spectrographic Imager hyperspectral data. Four LiDAR-derived images (DTM, DSM, nDSM, and intensity and CASI data (48 bands with 1 m spatial resolution were spatially resampled to 2, 4, 8, 10, 20 and 30 m resolutions using the nearest neighbor resampling method. These data were thereafter fused using the layer stacking and principal components analysis (PCA methods. Land cover was classified by commonly used supervised classifications in remote sensing images, i.e., the support vector machine (SVM and maximum likelihood (MLC classifiers. Each classifier was applied to four types of datasets (at seven different spatial resolutions: (1 the layer stacking fusion data; (2 the PCA fusion data; (3 the LiDAR data alone; and (4 the CASI data alone. In this study, the land cover category was classified into seven classes, i.e., buildings, road, water bodies, forests, grassland, cropland and barren land. A total of 56 classification results were produced, and the classification accuracies were assessed and compared. The results show that the classification accuracies produced from two fused datasets were higher than that of the single LiDAR and CASI data at all seven spatial resolutions. Moreover, we find that the layer stacking method produced higher overall classification accuracies than the PCA fusion method using both the SVM and MLC classifiers. The highest classification accuracy obtained (OA = 97.8%, kappa = 0.964 using the SVM classifier on the layer stacking fusion data at 1 m spatial resolution. Compared with the best classification results of the CASI and LiDAR data alone, the overall classification accuracies improved by 9.1% and 19.6%, respectively. Our findings also demonstrated that the

  11. Diagnostic performance of whole brain volume perfusion CT in intra-axial brain tumors: Preoperative classification accuracy and histopathologic correlation

    International Nuclear Information System (INIS)

    Xyda, Argyro; Haberland, Ulrike; Klotz, Ernst; Jung, Klaus; Bock, Hans Christoph; Schramm, Ramona; Knauth, Michael; Schramm, Peter

    2012-01-01

    Background: To evaluate the preoperative diagnostic power and classification accuracy of perfusion parameters derived from whole brain volume perfusion CT (VPCT) in patients with cerebral tumors. Methods: Sixty-three patients (31 male, 32 female; mean age 55.6 ± 13.9 years), with MRI findings suspected of cerebral lesions, underwent VPCT. Two readers independently evaluated VPCT data. Volumes of interest (VOIs) were marked circumscript around the tumor according to maximum intensity projection volumes, and then mapped automatically onto the cerebral blood volume (CBV), flow (CBF) and permeability Ktrans perfusion datasets. A second VOI was placed in the contra lateral cortex, as control. Correlations among perfusion values, tumor grade, cerebral hemisphere and VOIs were evaluated. Moreover, the diagnostic power of VPCT parameters, by means of positive and negative predictive value, was analyzed. Results: Our cohort included 32 high-grade gliomas WHO III/IV, 18 low-grade I/II, 6 primary cerebral lymphomas, 4 metastases and 3 tumor-like lesions. Ktrans demonstrated the highest sensitivity, specificity and positive predictive value, with a cut-off point of 2.21 mL/100 mL/min, for both the comparisons between high-grade versus low-grade and low-grade versus primary cerebral lymphomas. However, for the differentiation between high-grade and primary cerebral lymphomas, CBF and CBV proved to have 100% specificity and 100% positive predictive value, identifying preoperatively all the histopathologically proven high-grade gliomas. Conclusion: Volumetric perfusion data enable the hemodynamic assessment of the entire tumor extent and provide a method of preoperative differentiation among intra-axial cerebral tumors with promising diagnostic accuracy.

  12. PolSAR Land Cover Classification Based on Roll-Invariant and Selected Hidden Polarimetric Features in the Rotation Domain

    Directory of Open Access Journals (Sweden)

    Chensong Tao

    2017-07-01

    Full Text Available Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR. Target polarimetric response is strongly dependent on its orientation. Backscattering responses of the same target with different orientations to the SAR flight path may be quite different. This target orientation diversity effect hinders PolSAR image understanding and interpretation. Roll-invariant polarimetric features such as entropy, anisotropy, mean alpha angle, and total scattering power are independent of the target orientation and are commonly adopted for PolSAR image classification. On the other aspect, target orientation diversity also contains rich information which may not be sensed by roll-invariant polarimetric features. In this vein, only using the roll-invariant polarimetric features may limit the final classification accuracy. To address this problem, this work uses the recently reported uniform polarimetric matrix rotation theory and a visualization and characterization tool of polarimetric coherence pattern to investigate hidden polarimetric features in the rotation domain along the radar line of sight. Then, a feature selection scheme is established and a set of hidden polarimetric features are selected in the rotation domain. Finally, a classification method is developed using the complementary information between roll-invariant and selected hidden polarimetric features with a support vector machine (SVM/decision tree (DT classifier. Comparison experiments are carried out with NASA/JPL AIRSAR and multi-temporal UAVSAR data. For AIRSAR data, the overall classification accuracy of the proposed classification method is 95.37% (with SVM/96.38% (with DT, while that of the conventional classification method is 93.87% (with SVM/94.12% (with DT, respectively. Meanwhile, for multi-temporal UAVSAR data, the mean overall classification accuracy of the proposed method is up to 97.47% (with SVM/99.39% (with DT, which is also higher

  13. Towards understanding the influence of SVM hyperparameters

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2010-11-01

    Full Text Available corresponding drop in accuracy, even for large C. Small values of C lead to poor classification accuracy irrespective of the value of γ. −6 −3.5 −1 1.5 4 6.5 55 60 65 70 75 80 85 90 95 log(C) CV Accurac y Banana Titanic Thyroid Heart... support vector s Banana Titanic Thyroid Heart Flare Solar Diabetis Image German Fig. 4. Normalized number of support vectors vs log(C). It is clear that for small C, the algorithm does not learn much and assigns almost all points as support...

  14. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah; Claudel, Christian G.

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensors. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN

  15. The Effects of Point or Polygon Based Training Data on RandomForest Classification Accuracy of Wetlands

    Directory of Open Access Journals (Sweden)

    Jennifer Corcoran

    2015-04-01

    Full Text Available Wetlands are dynamic in space and time, providing varying ecosystem services. Field reference data for both training and assessment of wetland inventories in the State of Minnesota are typically collected as GPS points over wide geographical areas and at infrequent intervals. This status-quo makes it difficult to keep updated maps of wetlands with adequate accuracy, efficiency, and consistency to monitor change. Furthermore, point reference data may not be representative of the prevailing land cover type for an area, due to point location or heterogeneity within the ecosystem of interest. In this research, we present techniques for training a land cover classification for two study sites in different ecoregions by implementing the RandomForest classifier in three ways: (1 field and photo interpreted points; (2 fixed window surrounding the points; and (3 image objects that intersect the points. Additional assessments are made to identify the key input variables. We conclude that the image object area training method is the most accurate and the most important variables include: compound topographic index, summer season green and blue bands, and grid statistics from LiDAR point cloud data, especially those that relate to the height of the return.

  16. Gas Classification Using Deep Convolutional Neural Networks

    Science.gov (United States)

    Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin

    2018-01-01

    In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP). PMID:29316723

  17. Gas Classification Using Deep Convolutional Neural Networks.

    Science.gov (United States)

    Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin

    2018-01-08

    In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP).

  18. Static Voltage Stability Analysis by Using SVM and Neural Network

    Directory of Open Access Journals (Sweden)

    Mehdi Hajian

    2013-01-01

    Full Text Available Voltage stability is an important problem in power system networks. In this paper, in terms of static voltage stability, and application of Neural Networks (NN and Supported Vector Machine (SVM for estimating of voltage stability margin (VSM and predicting of voltage collapse has been investigated. This paper considers voltage stability in power system in two parts. The first part calculates static voltage stability margin by Radial Basis Function Neural Network (RBFNN. The advantage of the used method is high accuracy in online detecting the VSM. Whereas the second one, voltage collapse analysis of power system is performed by Probabilistic Neural Network (PNN and SVM. The obtained results in this paper indicate, that time and number of training samples of SVM, are less than NN. In this paper, a new model of training samples for detection system, using the normal distribution load curve at each load feeder, has been used. Voltage stability analysis is estimated by well-know L and VSM indexes. To demonstrate the validity of the proposed methods, IEEE 14 bus grid and the actual network of Yazd Province are used.

  19. F-SVM: Combination of Feature Transformation and SVM Learning via Convex Relaxation

    OpenAIRE

    Wu, Xiaohe; Zuo, Wangmeng; Zhu, Yuanyuan; Lin, Liang

    2015-01-01

    The generalization error bound of support vector machine (SVM) depends on the ratio of radius and margin, while standard SVM only considers the maximization of the margin but ignores the minimization of the radius. Several approaches have been proposed to integrate radius and margin for joint learning of feature transformation and SVM classifier. However, most of them either require the form of the transformation matrix to be diagonal, or are non-convex and computationally expensive. In this ...

  20. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images.

    Science.gov (United States)

    Xu, Yongzheng; Yu, Guizhen; Wang, Yunpeng; Wu, Xinkai; Ma, Yalong

    2016-08-19

    A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles' in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which sophistically integrates V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of the vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive compared with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need of image registration or additional road database, it has great potentials of field applications. Future research will be focusing on expanding the current method for detecting other transportation modes such as buses, trucks, motors, bicycles, and pedestrians.

  1. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images

    Science.gov (United States)

    Xu, Yongzheng; Yu, Guizhen; Wang, Yunpeng; Wu, Xinkai; Ma, Yalong

    2016-01-01

    A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles’ in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which sophistically integrates V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of the vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive compared with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need of image registration or additional road database, it has great potentials of field applications. Future research will be focusing on expanding the current method for detecting other transportation modes such as buses, trucks, motors, bicycles, and pedestrians. PMID:27548179

  2. Influence of multi-source and multi-temporal remotely sensed and ancillary data on the accuracy of random forest classification of wetlands in northern Minnesota

    Science.gov (United States)

    Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.

    2013-01-01

    Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.

  3. Improving urban land use and land cover classification from high-spatial-resolution hyperspectral imagery using contextual information

    Science.gov (United States)

    Yang, He; Ma, Ben; Du, Qian; Yang, Chenghai

    2010-08-01

    In this paper, we propose approaches to improve the pixel-based support vector machine (SVM) classification for urban land use and land cover (LULC) mapping from airborne hyperspectral imagery with high spatial resolution. Class spatial neighborhood relationship is used to correct the misclassified class pairs, such as roof and trail, road and roof. These classes may be difficult to be separated because they may have similar spectral signatures and their spatial features are not distinct enough to help their discrimination. In addition, misclassification incurred from within-class trivial spectral variation can be corrected by using pixel connectivity information in a local window so that spectrally homogeneous regions can be well preserved. Our experimental results demonstrate the efficiency of the proposed approaches in classification accuracy improvement. The overall performance is competitive to the object-based SVM classification.

  4. The accuracy of International Classification of Diseases coding for dental problems not associated with trauma in a hospital emergency department.

    Science.gov (United States)

    Figueiredo, Rafael L F; Singhal, Sonica; Dempster, Laura; Hwang, Stephen W; Quinonez, Carlos

    2015-01-01

    Emergency department (ED) visits for nontraumatic dental conditions (NTDCs) may be a sign of unmet need for dental care. The objective of this study was to determine the accuracy of the International Classification of Diseases codes (ICD-10-CA) for ED visits for NTDC. ED visits in 2008-2099 at one hospital in Toronto were identified if the discharge diagnosis in the administrative database system was an ICD-10-CA code for a NTDC (K00-K14). A random sample of 100 visits was selected, and the medical records for these visits were reviewed by a dentist. The description of the clinical signs and symptoms were evaluated, and a diagnosis was assigned. This diagnosis was compared with the diagnosis assigned by the physician and the code assigned to the visit. The 100 ED visits reviewed were associated with 16 different ICD-10-CA codes for NTDC. Only 2 percent of these visits were clearly caused by trauma. The code K0887 (toothache) was the most frequent diagnostic code (31 percent). We found 43.3 percent disagreement on the discharge diagnosis reported by the physician, and 58.0 percent disagreement on the code in the administrative database assigned by the abstractor, compared with what it was suggested by the dentist reviewing the chart. There are substantial discrepancies between the ICD-10-CA diagnosis assigned in administrative databases and the diagnosis assigned by a dentist reviewing the chart retrospectively. However, ICD-10-CA codes can be used to accurately identify ED visits for NTDC. © 2015 American Association of Public Health Dentistry.

  5. A novel application of wavelet based SVM to transient phenomena identification of power transformers

    International Nuclear Information System (INIS)

    Jazebi, S.; Vahidi, B.; Jannati, M.

    2011-01-01

    A novel differential protection approach is introduced in the present paper. The proposed scheme is a combination of Support Vector Machine (SVM) and wavelet transform theories. Two common transients such as magnetizing inrush current and internal fault are considered. A new wavelet feature is extracted which reduces the computational cost and enhances the discrimination accuracy of SVM. Particle swarm optimization technique (PSO) has been applied to tune SVM parameters. The suitable performance of this method is demonstrated by simulation of different faults and switching conditions on a power transformer in PSCAD/EMTDC software. The method has the advantages of high accuracy and low computational burden (less than a quarter of a cycle). The other advantage is that the method is not dependent on a specific threshold. Sympathetic and recovery inrush currents also have been simulated and investigated. Results show that the proposed method could remain stable even in noisy environments.

  6. VLSI Design of SVM-Based Seizure Detection System With On-Chip Learning Capability.

    Science.gov (United States)

    Feng, Lichen; Li, Zunchao; Wang, Yuanfa

    2018-02-01

    Portable automatic seizure detection system is very convenient for epilepsy patients to carry. In order to make the system on-chip trainable with high efficiency and attain high detection accuracy, this paper presents a very large scale integration (VLSI) design based on the nonlinear support vector machine (SVM). The proposed design mainly consists of a feature extraction (FE) module and an SVM module. The FE module performs the three-level Daubechies discrete wavelet transform to fit the physiological bands of the electroencephalogram (EEG) signal and extracts the time-frequency domain features reflecting the nonstationary signal properties. The SVM module integrates the modified sequential minimal optimization algorithm with the table-driven-based Gaussian kernel to enable efficient on-chip learning. The presented design is verified on an Altera Cyclone II field-programmable gate array and tested using the two publicly available EEG datasets. Experiment results show that the designed VLSI system improves the detection accuracy and training efficiency.

  7. Extraction of prostatic lumina and automated recognition for prostatic calculus image using PCA-SVM.

    Science.gov (United States)

    Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D Joshua

    2011-01-01

    Identification of prostatic calculi is an important basis for determining the tissue origin. Computation-assistant diagnosis of prostatic calculi may have promising potential but is currently still less studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu threshold recognition using PCA-SVM and based on the texture features of prostatic calculus. The SVM classifier showed an average time 0.1432 second, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi.

  8. Extraction of Prostatic Lumina and Automated Recognition for Prostatic Calculus Image Using PCA-SVM

    Science.gov (United States)

    Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D. Joshua

    2011-01-01

    Identification of prostatic calculi is an important basis for determining the tissue origin. Computation-assistant diagnosis of prostatic calculi may have promising potential but is currently still less studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu threshold recognition using PCA-SVM and based on the texture features of prostatic calculus. The SVM classifier showed an average time 0.1432 second, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi. PMID:21461364

  9. Hybrid NN/SVM Computational System for Optimizing Designs

    Science.gov (United States)

    Rai, Man Mohan

    2009-01-01

    A computational method and system based on a hybrid of an artificial neural network (NN) and a support vector machine (SVM) (see figure) has been conceived as a means of maximizing or minimizing an objective function, optionally subject to one or more constraints. Such maximization or minimization could be performed, for example, to optimize solve a data-regression or data-classification problem or to optimize a design associated with a response function. A response function can be considered as a subset of a response surface, which is a surface in a vector space of design and performance parameters. A typical example of a design problem that the method and system can be used to solve is that of an airfoil, for which a response function could be the spatial distribution of pressure over the airfoil. In this example, the response surface would describe the pressure distribution as a function of the operating conditions and the geometric parameters of the airfoil. The use of NNs to analyze physical objects in order to optimize their responses under specified physical conditions is well known. NN analysis is suitable for multidimensional interpolation of data that lack structure and enables the representation and optimization of a succession of numerical solutions of increasing complexity or increasing fidelity to the real world. NN analysis is especially useful in helping to satisfy multiple design objectives. Feedforward NNs can be used to make estimates based on nonlinear mathematical models. One difficulty associated with use of a feedforward NN arises from the need for nonlinear optimization to determine connection weights among input, intermediate, and output variables. It can be very expensive to train an NN in cases in which it is necessary to model large amounts of information. Less widely known (in comparison with NNs) are support vector machines (SVMs), which were originally applied in statistical learning theory. In terms that are necessarily

  10. Using a Feature Subset Selection method and Support Vector Machine to address curse of dimensionality and redundancy in Hyperion hyperspectral data classification

    Directory of Open Access Journals (Sweden)

    Amir Salimi

    2018-04-01

    Full Text Available The curse of dimensionality resulted from insufficient training samples and redundancy is considered as an important problem in the supervised classification of hyperspectral data. This problem can be handled by Feature Subset Selection (FSS methods and Support Vector Machine (SVM. The FSS methods can manage the redundancy by removing redundant spectral bands. Moreover, kernel based methods, especially SVM have a high ability to classify limited-sample data sets. This paper mainly aims to assess the capability of a FSS method and the SVM in curse of dimensional circumstances and to compare results with the Artificial Neural Network (ANN, when they are used to classify alteration zones of the Hyperion hyperspectral image acquired from the greatest Iranian porphyry copper complex. The results demonstrated that by decreasing training samples, the accuracy of SVM was just decreased 1.8% while the accuracy of ANN was highly reduced i.e. 14.01%. In addition, a hybrid FSS was applied to reduce the dimension of Hyperion. Accordingly, among the 165 useable spectral bands of Hyperion, 18 bands were only selected as the most important and informative bands. Although this dimensionality reduction could not intensively improve the performance of SVM, ANN revealed a significant improvement in the computational time and a slightly enhancement in the average accuracy. Therefore, SVM as a low-sensitive method respect to the size of training data set and feature space can be applied to classify the curse of dimensional problems. Also, the FSS methods can improve the performance of non-kernel based classifiers by eliminating redundant features. Keywords: Curse of dimensionality, Feature Subset Selection, Hydrothermal alteration, Hyperspectral, SVM

  11. Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment

    Directory of Open Access Journals (Sweden)

    Rashmi Mukherjee

    2014-01-01

    Full Text Available The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough scheme for chronic wound (CW evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM, were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793.

  12. Deteksi Penyakit Dengue Hemorrhagic Fever dengan Pendekatan One Class Classification

    Directory of Open Access Journals (Sweden)

    Zida Ziyan Azkiya

    2017-10-01

    Full Text Available Two class classification problem maps input into two target classes. In certain cases, training data is available only in the form of a single class, as in the case of Dengue Hemorrhagic Fever (DHF patients, where only data of positive patients is available. In this paper, we report our experiment in building a classification model for detecting DHF infection using One Class Classification (OCC approach. Data from this study is sourced from laboratory tests of patients with dengue fever. The OCC methods compared are One-Class Support Vector Machine and One-Class K-Means. The result shows SVM method obtained precision value = 1.0, recall = 0.993, f-1 score = 0.997, and accuracy of 99.7% while the K-Means method obtained precision value = 0.901, recall = 0.973, f- 1 score = 0.936, and accuracy of 93.3%. This indicates that the SVM method is slightly superior to K-Means for One-Class Classification of DHF patients.

  13. Enhanced land use/cover classification of heterogeneous tropical landscapes using support vector machines and textural homogeneity

    Science.gov (United States)

    Paneque-Gálvez, Jaime; Mas, Jean-François; Moré, Gerard; Cristóbal, Jordi; Orta-Martínez, Martí; Luz, Ana Catarina; Guèze, Maximilien; Macía, Manuel J.; Reyes-García, Victoria

    2013-08-01

    Land use/cover classification is a key research field in remote sensing and land change science as thematic maps derived from remotely sensed data have become the basis for analyzing many socio-ecological issues. However, land use/cover classification remains a difficult task and it is especially challenging in heterogeneous tropical landscapes where nonetheless such maps are of great importance. The present study aims at establishing an efficient classification approach to accurately map all broad land use/cover classes in a large, heterogeneous tropical area, as a basis for further studies (e.g., land use/cover change, deforestation and forest degradation). Specifically, we first compare the performance of parametric (maximum likelihood), non-parametric (k-nearest neighbor and four different support vector machines - SVM), and hybrid (unsupervised-supervised) classifiers, using hard and soft (fuzzy) accuracy assessments. We then assess, using the maximum likelihood algorithm, what textural indices from the gray-level co-occurrence matrix lead to greater classification improvements at the spatial resolution of Landsat imagery (30 m), and rank them accordingly. Finally, we use the textural index that provides the most accurate classification results to evaluate whether its usefulness varies significantly with the classifier used. We classified imagery corresponding to dry and wet seasons and found that SVM classifiers outperformed all the rest. We also found that the use of some textural indices, but particularly homogeneity and entropy, can significantly improve classifications. We focused on the use of the homogeneity index, which has so far been neglected in land use/cover classification efforts, and found that this index along with reflectance bands significantly increased the overall accuracy of all the classifiers, but particularly of SVM. We observed that improvements in producer's and user's accuracies through the inclusion of homogeneity were different

  14. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom [Department of Radiology, University of Ulsan College of Medicine, 388-1 Pungnap2-dong, Songpa-gu, Seoul 138-736 (Korea, Republic of); Lynch, David A. [Department of Radiology, National Jewish Medical and Research Center, Denver, Colorado 80206 (United States)

    2013-05-15

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For

  15. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    International Nuclear Information System (INIS)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom; Lynch, David A.

    2013-01-01

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs—normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI

  16. Statistical Fractal Models Based on GND-PCA and Its Application on Classification of Liver Diseases

    Directory of Open Access Journals (Sweden)

    Huiyan Jiang

    2013-01-01

    Full Text Available A new method is proposed to establish the statistical fractal model for liver diseases classification. Firstly, the fractal theory is used to construct the high-order tensor, and then Generalized -dimensional Principal Component Analysis (GND-PCA is used to establish the statistical fractal model and select the feature from the region of liver; at the same time different features have different weights, and finally, Support Vector Machine Optimized Ant Colony (ACO-SVM algorithm is used to establish the classifier for the recognition of liver disease. In order to verify the effectiveness of the proposed method, PCA eigenface method and normal SVM method are chosen as the contrast methods. The experimental results show that the proposed method can reconstruct liver volume better and improve the classification accuracy of liver diseases.

  17. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. A Spectral-Texture Kernel-Based Classification Method for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Yi Wang

    2016-11-01

    Full Text Available Classification of hyperspectral images always suffers from high dimensionality and very limited labeled samples. Recently, the spectral-spatial classification has attracted considerable attention and can achieve higher classification accuracy and smoother classification maps. In this paper, a novel spectral-spatial classification method for hyperspectral images by using kernel methods is investigated. For a given hyperspectral image, the principle component analysis (PCA transform is first performed. Then, the first principle component of the input image is segmented into non-overlapping homogeneous regions by using the entropy rate superpixel (ERS algorithm. Next, the local spectral histogram model is applied to each homogeneous region to obtain the corresponding texture features. Because this step is performed within each homogenous region, instead of within a fixed-size image window, the obtained local texture features in the image are more accurate, which can effectively benefit the improvement of classification accuracy. In the following step, a contextual spectral-texture kernel is constructed by combining spectral information in the image and the extracted texture information using the linearity property of the kernel methods. Finally, the classification map is achieved by the support vector machines (SVM classifier using the proposed spectral-texture kernel. Experiments on two benchmark airborne hyperspectral datasets demonstrate that our method can effectively improve classification accuracies, even though only a very limited training sample is available. Specifically, our method can achieve from 8.26% to 15.1% higher in terms of overall accuracy than the traditional SVM classifier. The performance of our method was further compared to several state-of-the-art classification methods of hyperspectral images using objective quantitative measures and a visual qualitative evaluation.

  19. Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood.

    Science.gov (United States)

    Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee

    2013-01-01

    Breast cancer is worldwide the second most common type of cancer after lung cancer. Traditional mammography and Tissue Microarray has been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast cancer. This can be a challenge due to a number of factors and logistics. First, obtaining tissue biopsies can be difficult. Second, mammography may not detect small tumors, and is often unsatisfactory for younger women who typically have dense breast tissue. Lastly, breast cancer is not a single homogeneous disease but consists of multiple disease states, each arising from a distinct molecular mechanism and having a distinct clinical progression path which makes the disease difficult to detect and predict in early stages. In the paper, we present a Support Vector Machine based on Recursive Feature Elimination and Cross Validation (SVM-RFE-CV) algorithm for early detection of breast cancer in peripheral blood and show how to use SVM-RFE-CV to model the classification and prediction problem of early detection of breast cancer in peripheral blood.The training set which consists of 32 health and 33 cancer samples and the testing set consisting of 31 health and 34 cancer samples were randomly separated from a dataset of peripheral blood of breast cancer that is downloaded from Gene Express Omnibus. First, we identified the 42 differentially expressed biomarkers between "normal" and "cancer". Then, with the SVM-RFE-CV we extracted 15 biomarkers that yield zero cross validation score. Lastly, we compared the classification and prediction performance of SVM-RFE-CV with that of SVM and SVM Recursive Feature Elimination (SVM-RFE). We found that 1) the SVM-RFE-CV is suitable for analyzing noisy high-throughput microarray data, 2) it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features, and 3) it can improve the prediction performance (Area Under

  20. SVM classifier to predict genes important for self-renewal and pluripotency of mouse embryonic stem cells

    Directory of Open Access Journals (Sweden)

    Xu Huilei

    2010-12-01

    Full Text Available Abstract Background Mouse embryonic stem cells (mESCs are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in-vitro. Their distinct features are their ability to self-renew and to differentiate to all adult cell types. Genes that maintain mESCs self-renewal and pluripotency identity are of interest to stem cell biologists. Although significant steps have been made toward the identification and characterization of such genes, the list is still incomplete and controversial. For example, the overlap among candidate self-renewal and pluripotency genes across different RNAi screens is surprisingly small. Meanwhile, machine learning approaches have been used to analyze multi-dimensional experimental data and integrate results from many studies, yet they have not been applied to specifically tackle the task of predicting and classifying self-renewal and pluripotency gene membership. Results For this study we developed a classifier, a supervised machine learning framework for predicting self-renewal and pluripotency mESCs stemness membership genes (MSMG using support vector machines (SVM. The data used to train the classifier was derived from mESCs-related studies using mRNA microarrays, measuring gene expression in various stages of early differentiation, as well as ChIP-seq studies applied to mESCs profiling genome-wide binding of key transcription factors, such as Nanog, Oct4, and Sox2, to the regulatory regions of other genes. Comparison to other classification methods using the leave-one-out cross-validation method was employed to evaluate the accuracy and generality of the classification. Finally, two sets of candidate genes from genome-wide RNA interference screens are used to test the generality and potential application of the classifier. Conclusions Our results reveal that an SVM approach can be useful for prioritizing genes for functional validation experiments and complement the analyses of high

  1. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.

  2. A DWT and SVM based method for rolling element bearing fault diagnosis and its comparison with Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Sunil Tyagi

    2017-04-01

    Full Text Available A classification technique using Support Vector Machine (SVM classifier for detection of rolling element bearing fault is presented here.  The SVM was fed from features that were extracted from of vibration signals obtained from experimental setup consisting of rotating driveline that was mounted on rolling element bearings which were run in normal and with artificially faults induced conditions. The time-domain vibration signals were divided into 40 segments and simple features such as peaks in time domain and spectrum along with statistical features such as standard deviation, skewness, kurtosis etc. were extracted. Effectiveness of SVM classifier was compared with the performance of Artificial Neural Network (ANN classifier and it was found that the performance of SVM classifier is superior to that of ANN. The effect of pre-processing of the vibration signal by Discreet Wavelet Transform (DWT prior to feature extraction is also studied and it is shown that pre-processing of vibration signal with DWT enhances the effectiveness of both ANN and SVM classifiers. It has been demonstrated from experiment results that performance of SVM classifier is better than ANN in detection of bearing condition and pre-processing the vibration signal with DWT improves the performance of SVM classifier.

  3. gkmSVM: an R package for gapped-kmer SVM.

    Science.gov (United States)

    Ghandi, Mahmoud; Mohammad-Noori, Morteza; Ghareghani, Narges; Lee, Dongwon; Garraway, Levi; Beer, Michael A

    2016-07-15

    We present a new R package for training gapped-kmer SVM classifiers for DNA and protein sequences. We describe an improved algorithm for kernel matrix calculation that speeds run time by about 2 to 5-fold over our original gkmSVM algorithm. This package supports several sequence kernels, including: gkmSVM, kmer-SVM, mismatch kernel and wildcard kernel. gkmSVM package is freely available through the Comprehensive R Archive Network (CRAN), for Linux, Mac OS and Windows platforms. The C ++ implementation is available at www.beerlab.org/gkmsvm mghandi@gmail.com or mbeer@jhu.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Hyperspectral Image Enhancement and Mixture Deep-Learning Classification of Corneal Epithelium Injuries.

    Science.gov (United States)

    Noor, Siti Salwa Md; Michael, Kaleena; Marshall, Stephen; Ren, Jinchang

    2017-11-16

    In our preliminary study, the reflectance signatures obtained from hyperspectral imaging (HSI) of normal and abnormal corneal epithelium tissues of porcine show similar morphology with subtle differences. Here we present image enhancement algorithms that can be used to improve the interpretability of data into clinically relevant information to facilitate diagnostics. A total of 25 corneal epithelium images without the application of eye staining were used. Three image feature extraction approaches were applied for image classification: (i) image feature classification from histogram using a support vector machine with a Gaussian radial basis function (SVM-GRBF); (ii) physical image feature classification using deep-learning Convolutional Neural Networks (CNNs) only; and (iii) the combined classification of CNNs and SVM-Linear. The performance results indicate that our chosen image features from the histogram and length-scale parameter were able to classify with up to 100% accuracy; particularly, at CNNs and CNNs-SVM, by employing 80% of the data sample for training and 20% for testing. Thus, in the assessment of corneal epithelium injuries, HSI has high potential as a method that could surpass current technologies regarding speed, objectivity, and reliability.

  5. Stellar Spectral Classification with Minimum Within-Class and ...

    Indian Academy of Sciences (India)

    Support Vector Machine (SVM) is one of the important stellar spectral classification methods, and it is widely used in practice. But its classification efficiencies cannot be greatly improved because it does not take the class distribution into consideration. In view of this, a modified SVM-named Minimum within-class and ...

  6. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons.

    Science.gov (United States)

    Long, Yi; Du, Zhi-Jiang; Wang, Wei-Dong; Zhao, Guang-Yu; Xu, Guo-Qiang; He, Long; Mao, Xi-Wang; Dong, Wei

    2016-09-02

    Locomotion mode identification is essential for the control of a robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM) optimized by particle swarm optimization (PSO) to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS) attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz), a three-layer wavelet packet analysis (WPA) is used for feature extraction, after which, the kernel principal component analysis (kPCA) is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two types of different sensors, the normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adapted to train the classifier, which prevents the classifier over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA) is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.

  7. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons

    Directory of Open Access Journals (Sweden)

    Yi Long

    2016-09-01

    Full Text Available Locomotion mode identification is essential for the control of a robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM optimized by particle swarm optimization (PSO to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz, a three-layer wavelet packet analysis (WPA is used for feature extraction, after which, the kernel principal component analysis (kPCA is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two types of different sensors, the normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adapted to train the classifier, which prevents the classifier over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.

  8. A Method for Aileron Actuator Fault Diagnosis Based on PCA and PGC-SVM

    Directory of Open Access Journals (Sweden)

    Wei-Li Qin

    2016-01-01

    Full Text Available Aileron actuators are pivotal components for aircraft flight control system. Thus, the fault diagnosis of aileron actuators is vital in the enhancement of the reliability and fault tolerant capability. This paper presents an aileron actuator fault diagnosis approach combining principal component analysis (PCA, grid search (GS, 10-fold cross validation (CV, and one-versus-one support vector machine (SVM. This method is referred to as PGC-SVM and utilizes the direct drive valve input, force motor current, and displacement feedback signal to realize fault detection and location. First, several common faults of aileron actuators, which include force motor coil break, sensor coil break, cylinder leakage, and amplifier gain reduction, are extracted from the fault quadrantal diagram; the corresponding fault mechanisms are analyzed. Second, the data feature extraction is performed with dimension reduction using PCA. Finally, the GS and CV algorithms are employed to train a one-versus-one SVM for fault classification, thus obtaining the optimal model parameters and assuring the generalization of the trained SVM, respectively. To verify the effectiveness of the proposed approach, four types of faults are introduced into the simulation model established by AMESim and Simulink. The results demonstrate its desirable diagnostic performance which outperforms that of the traditional SVM by comparison.

  9. Uav-Based Crops Classification with Joint Features from Orthoimage and Dsm Data

    Science.gov (United States)

    Liu, B.; Shi, Y.; Duan, Y.; Wu, W.

    2018-04-01

    Accurate crops classification remains a challenging task due to the same crop with different spectra and different crops with same spectrum phenomenon. Recently, UAV-based remote sensing approach gains popularity not only for its high spatial and temporal resolution, but also for its ability to obtain spectraand spatial data at the same time. This paper focus on how to take full advantages of spatial and spectrum features to improve crops classification accuracy, based on an UAV platform equipped with a general digital camera. Texture and spatial features extracted from the RGB orthoimage and the digital surface model of the monitoring area are analysed and integrated within a SVM classification framework. Extensive experiences results indicate that the overall classification accuracy is drastically improved from 72.9 % to 94.5 % when the spatial features are combined together, which verified the feasibility and effectiveness of the proposed method.

  10. Multiple kernel boosting framework based on information measure for classification

    International Nuclear Information System (INIS)

    Qi, Chengming; Wang, Yuping; Tian, Wenjie; Wang, Qun

    2016-01-01

    The performance of kernel-based method, such as support vector machine (SVM), is greatly affected by the choice of kernel function. Multiple kernel learning (MKL) is a promising family of machine learning algorithms and has attracted many attentions in recent years. MKL combines multiple sub-kernels to seek better results compared to single kernel learning. In order to improve the efficiency of SVM and MKL, in this paper, the Kullback–Leibler kernel function is derived to develop SVM. The proposed method employs an improved ensemble learning framework, named KLMKB, which applies Adaboost to learning multiple kernel-based classifier. In the experiment for hyperspectral remote sensing image classification, we employ feature selected through Optional Index Factor (OIF) to classify the satellite image. We extensively examine the performance of our approach in comparison to some relevant and state-of-the-art algorithms on a number of benchmark classification data sets and hyperspectral remote sensing image data set. Experimental results show that our method has a stable behavior and a noticeable accuracy for different data set.

  11. Intelligent Agent-Based Intrusion Detection System Using Enhanced Multiclass SVM

    Science.gov (United States)

    Ganapathy, S.; Yogesh, P.; Kannan, A.

    2012-01-01

    Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect the intruders only with high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm are proposed for detecting the intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with low false alarm rate and high-detection rate when tested with KDD Cup 99 data set. PMID:23056036

  12. [Identification of varieties of cashmere by Vis/NIR spectroscopy technology based on PCA-SVM].

    Science.gov (United States)

    Wu, Gui-Fang; He, Yong

    2009-06-01

    One mixed algorithm was presented to discriminate cashmere varieties with principal component analysis (PCA) and support vector machine (SVM). Cashmere fiber has such characteristics as threadlike, softness, glossiness and high tensile strength. The quality characters and economic value of each breed of cashmere are very different. In order to safeguard the consumer's rights and guarantee the quality of cashmere product, quickly, efficiently and correctly identifying cashmere has significant meaning to the production and transaction of cashmere material. The present research adopts Vis/NIRS spectroscopy diffuse techniques to collect the spectral data of cashmere. The near infrared fingerprint of cashmere was acquired by principal component analysis (PCA), and support vector machine (SVM) methods were used to further identify the cashmere material. The result of PCA indicated that the score map made by the scores of PC1, PC2 and PC3 was used, and 10 principal components (PCs) were selected as the input of support vector machine (SVM) based on the reliabilities of PCs of 99.99%. One hundred cashmere samples were used for calibration and the remaining 75 cashmere samples were used for validation. A one-against-all multi-class SVM model was built, the capabilities of SVM with different kernel function were comparatively analyzed, and the result showed that SVM possessing with the Gaussian kernel function has the best identification capabilities with the accuracy of 100%. This research indicated that the data mining method of PCA-SVM has a good identification effect, and can work as a new method for rapid identification of cashmere material varieties.

  13. Predictive Manufacturing: A Classification Strategy to Predict Product Failures

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Kulahci, Murat

    2018-01-01

    manufacturing analytics model that employs a big data approach to predicting product failures; third, we illustrate the issue of high dimensionality, along with statistically redundant information; and, finally, our proposed method will be compared against the well-known classification methods (SVM, K......-nearest neighbor, artificial neural networks). The results from real data show that our predictive manufacturing analytics approach, using genetic algorithms and Voronoi tessellations, is capable of predicting product failure with reasonable accuracy. The potential application of this method contributes...... to accurately predicting product failures, which would enable manufacturers to reduce production costs without compromising product quality....

  14. A Classification Method for Seed Viability Assessment with Infrared Thermography

    Directory of Open Access Journals (Sweden)

    Sen Men

    2017-04-01

    Full Text Available This paper presents a viability assessment method for Pisum sativum L. seeds based on the infrared thermography technique. In this work, different artificial treatments were conducted to prepare seeds samples with different viability. Thermal images and visible images were recorded every five minutes during the standard five day germination test. After the test, the root length of each sample was measured, which can be used as the viability index of that seed. Each individual seed area in the visible images was segmented with an edge detection method, and the average temperature of the corresponding area in the infrared images was calculated as the representative temperature for this seed at that time. The temperature curve of each seed during germination was plotted. Thirteen characteristic parameters extracted from the temperature curve were analyzed to show the difference of the temperature fluctuations between the seeds samples with different viability. With above parameters, support vector machine (SVM was used to classify the seed samples into three categories: viable, aged and dead according to the root length, the classification accuracy rate was 95%. On this basis, with the temperature data of only the first three hours during the germination, another SVM model was proposed to classify the seed samples, and the accuracy rate was about 91.67%. From these experimental results, it can be seen that infrared thermography can be applied for the prediction of seed viability, based on the SVM algorithm.

  15. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages.

    Science.gov (United States)

    Yao, Yiqing; Xu, Xiaosu

    2017-02-24

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  16. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages

    Directory of Open Access Journals (Sweden)

    Yiqing Yao

    2017-02-01

    Full Text Available In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS outages, a novel robust least squares support vector machine (LS-SVM-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS. The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  17. An intelligent framework for medical image retrieval using MDCT and multi SVM.

    Science.gov (United States)

    Balan, J A Alex Rajju; Rajan, S Edward

    2014-01-01

    Volumes of medical images are rapidly generated in medical field and to manage them effectively has become a great challenge. This paper studies the development of innovative medical image retrieval based on texture features and accuracy. The objective of the paper is to analyze the image retrieval based on diagnosis of healthcare management systems. This paper traces the development of innovative medical image retrieval to estimate both the image texture features and accuracy. The texture features of medical images are extracted using MDCT and multi SVM. Both the theoretical approach and the simulation results revealed interesting observations and they were corroborated using MDCT coefficients and SVM methodology. All attempts to extract the data about the image in response to the query has been computed successfully and perfect image retrieval performance has been obtained. Experimental results on a database of 100 trademark medical images show that an integrated texture feature representation results in 98% of the images being retrieved using MDCT and multi SVM. Thus we have studied a multiclassification technique based on SVM which is prior suitable for medical images. The results show the retrieval accuracy of 98%, 99% for different sets of medical images with respect to the class of image.

  18. DCS-SVM: a novel semi-automated method for human brain MR image segmentation.

    Science.gov (United States)

    Ahmadvand, Ali; Daliri, Mohammad Reza; Hajiali, Mohammadtaghi

    2017-11-27

    In this paper, a novel method is proposed which appropriately segments magnetic resonance (MR) brain images into three main tissues. This paper proposes an extension of our previous work in which we suggested a combination of multiple classifiers (CMC)-based methods named dynamic classifier selection-dynamic local training local Tanimoto index (DCS-DLTLTI) for MR brain image segmentation into three main cerebral tissues. This idea is used here and a novel method is developed that tries to use more complex and accurate classifiers like support vector machine (SVM) in the ensemble. This work is challenging because the CMC-based methods are time consuming, especially on huge datasets like three-dimensional (3D) brain MR images. Moreover, SVM is a powerful method that is used for modeling datasets with complex feature space, but it also has huge computational cost for big datasets, especially those with strong interclass variability problems and with more than two classes such as 3D brain images; therefore, we cannot use SVM in DCS-DLTLTI. Therefore, we propose a novel approach named "DCS-SVM" to use SVM in DCS-DLTLTI to improve the accuracy of segmentation results. The proposed method is applied on well-known datasets of the Internet Brain Segmentation Repository (IBSR) and promising results are obtained.

  19. Application of texture analysis method for mammogram density classification

    Science.gov (United States)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  20. Supervised learning methods for pathological arterial pulse wave differentiation: A SVM and neural networks approach.

    Science.gov (United States)

    Paiva, Joana S; Cardoso, João; Pereira, Tânia

    2018-01-01

    The main goal of this study was to develop an automatic method based on supervised learning methods, able to distinguish healthy from pathologic arterial pulse wave (APW), and those two from noisy waveforms (non-relevant segments of the signal), from the data acquired during a clinical examination with a novel optical system. The APW dataset analysed was composed by signals acquired in a clinical environment from a total of 213 subjects, including healthy volunteers and non-healthy patients. The signals were parameterised by means of 39pulse features: morphologic, time domain statistics, cross-correlation features, wavelet features. Multiclass Support Vector Machine Recursive Feature Elimination (SVM RFE) method was used to select the most relevant features. A comparative study was performed in order to evaluate the performance of the two classifiers: Support Vector Machine (SVM) and Artificial Neural Network (ANN). SVM achieved a statistically significant better performance for this problem with an average accuracy of 0.9917±0.0024 and a F-Measure of 0.9925±0.0019, in comparison with ANN, which reached the values of 0.9847±0.0032 and 0.9852±0.0031 for Accuracy and F-Measure, respectively. A significant difference was observed between the performances obtained with SVM classifier using a different number of features from the original set available. The comparison between SVM and NN allowed reassert the higher performance of SVM. The results obtained in this study showed the potential of the proposed method to differentiate those three important signal outcomes (healthy, pathologic and noise) and to reduce bias associated with clinical diagnosis of cardiovascular disease using APW. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Assessment of fatty degeneration of the gluteal muscles in patients with THA using MRI: reliability and accuracy of the Goutallier and quartile classification systems.

    Science.gov (United States)

    Engelken, Florian; Wassilew, Georgi I; Köhlitz, Torsten; Brockhaus, Sebastian; Hamm, Bernd; Perka, Carsten; Diederichs, und Gerd

    2014-01-01

    The purpose of this study was to quantify the performance of the Goutallier classification for assessing fatty degeneration of the gluteus muscles from magnetic resonance (MR) images and to compare its performance to a newly proposed system. Eighty-four hips with clinical signs of gluteal insufficiency and 50 hips from asymptomatic controls were analyzed using a standard classification system (Goutallier) and a new scoring system (Quartile). Interobserver reliability and intraobserver repeatability were determined, and accuracy was assessed by comparing readers' scores with quantitative estimates of the proportion of intramuscular fat based on MR signal intensities (gold standard). The existing Goutallier classification system and the new Quartile system performed equally well in assessing fatty degeneration of the gluteus muscles, both showing excellent levels of interrater and intrarater agreement. While the Goutallier classification system has the advantage of being widely known, the benefit of the Quartile system is that it is based on more clearly defined grades of fatty degeneration. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Towards automatic lithological classification from remote sensing data using support vector machines

    Science.gov (United States)

    Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael

    2010-05-01

    principal component bands, 14 independent component bands, 3 band ratios, 3 DEM derivatives: slope/curvatureroughness and 2 aeromagnetic derivatives: mean and variance of susceptibility) extracted from the ASTER, DEM and aeromagnetic data, in order to determine the optimal inputs that provide the highest classification accuracy. It was found that a combination of ASTER-derived independent components, principal components and band ratios, DEM-derived slope, curvature and roughness, and aeromagnetic-derived mean and variance of magnetic susceptibility provide the highest classification accuracy of 93.4% on independent test samples. A comparison of the classification results of the SVM with those of maximum likelihood (84.9%) and minimum distance (38.4%) classifiers clearly show that the SVM algorithm returns much higher classification accuracy. Therefore, the SVM method can be used to produce quick and reliable geological maps from scarce geological information, which is still the case with many under-developed frontier regions of the world.

  3. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM, bagging, and random forest on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression increase the prediction accuracy as compared to using gene expression alone.

  4. Forecasting Dry Bulk Freight Index with Improved SVM

    Directory of Open Access Journals (Sweden)

    Qianqian Han

    2014-01-01

    Full Text Available An improved SVM model is presented to forecast dry bulk freight index (BDI in this paper, which is a powerful tool for operators and investors to manage the market trend and avoid price risking shipping industry. The BDI is influenced by many factors, especially the random incidents in dry bulk market, inducing the difficulty in forecasting of BDI. Therefore, to eliminate the impact of random incidents in dry bulk market, wavelet transform is adopted to denoise the BDI data series. Hence, the combined model of wavelet transform and support vector machine is developed to forecast BDI in this paper. Lastly, the BDI data in 2005 to 2012 are presented to test the proposed model. The 84 prior consecutive monthly BDI data are the inputs of the model, and the last 12 monthly BDI data are the outputs of model. The parameters of the model are optimized by genetic algorithm and the final model is conformed through SVM training. This paper compares the forecasting result of proposed method and three other forecasting methods. The result shows that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.

  5. Penerapan Support Vector Machine (SVM untuk Pengkategorian Penelitian

    Directory of Open Access Journals (Sweden)

    Fithri Selva Jumeilah

    2017-07-01

    Full Text Available Research every college will continue to grow. Research will be stored in softcopy and hardcopy. The preparation of the research should be categorized in order to facilitate the search for people who need reference. To categorize the research, we need a method for text mining, one of them is with the implementation of Support Vector Machines (SVM. The data used to recognize the characteristics of each category then it takes secondary data which is a collection of abstracts of research. The data will be pre-processed with several stages: case folding converts all the letters into lowercase, stop words removal removal of very common words, tokenizing discard punctuation, and stemming searching for root words by removing the prefix and suffix. Further data that has undergone preprocessing will be converted into a numerical form with for the term weighting stage that is the weighting contribution of each word. From the results of term weighting then obtained data that can be used for data training and test data. The training process is done by providing input in the form of text data that is known to the class or category. Then by using the Support Vector Machines algorithm, the input data is transformed into a rule, function, or knowledge model that can be used in the prediction process. From the results of this study obtained that the categorization of research produced by SVM has been very good. This is proven by the results of the test which resulted in an accuracy of 90%.

  6. Real-Time Subject-Independent Pattern Classification of Overt and Covert Movements from fNIRS Signals.

    Directory of Open Access Journals (Sweden)

    Neethu Robinson

    Full Text Available Recently, studies have reported the use of Near Infrared Spectroscopy (NIRS for developing Brain-Computer Interface (BCI by applying online pattern classification of brain states from subject-specific fNIRS signals. The purpose of the present study was to develop and test a real-time method for subject-specific and subject-independent classification of multi-channel fNIRS signals using support-vector machines (SVM, so as to determine its feasibility as an online neurofeedback system. Towards this goal, we used left versus right hand movement execution and movement imagery as study paradigms in a series of experiments. In the first two experiments, activations in the motor cortex during movement execution and movement imagery were used to develop subject-dependent models that obtained high classification accuracies thereby indicating the robustness of our classification method. In the third experiment, a generalized classifier-model was developed from the first two experimental data, which was then applied for subject-independent neurofeedback training. Application of this method in new participants showed mean classification accuracy of 63% for movement imagery tasks and 80% for movement execution tasks. These results, and their corresponding offline analysis reported in this study demonstrate that SVM based real-time subject-independent classification of fNIRS signals is feasible. This method has important applications in the field of hemodynamic BCIs, and neuro-rehabilitation where patients can be trained to learn spatio-temporal patterns of healthy brain activity.

  7. Analyzing the diagnostic accuracy of the causes of spinal pain at neurology hospital in accordance with the International Classification of Diseases

    Directory of Open Access Journals (Sweden)

    I. G. Mikhailyuk

    2014-01-01

    Full Text Available Spinal pain is of great socioeconomic significance as it is widely prevalent and a common cause of disability. However, the diagnosis of its true causes frequently leads to problems. A study has been conducted to evaluate the accuracy of a clinical diagnosis and its coding in conformity with the International Classification of Diseases. The diagnosis of vertebral osteochondrosis and the hypodiagnosis of nonspecific and nonvertebrogenic pain syndromes have been found to be unreasonably widely used. Ways to solve these problems have been proposed, by applying approaches to diagnosing the causes of spinal pain in accordance with international practice.

  8. Testing the Potential of Vegetation Indices for Land Use/cover Classification Using High Resolution Data

    Science.gov (United States)

    Karakacan Kuzucu, A.; Bektas Balcik, F.

    2017-11-01

    Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. But this information still remains a challenge especially in heterogeneous landscapes covering urban and rural areas due to spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of spectral vegetation indices combination with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO), the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and support vector machine (SVM) algorithm were applied to classify SPOT 7 image. Catalca is selected region located in the north west of the Istanbul in Turkey, which has complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and kappa coefficient. The results indicated that the incorporation of these three different vegetation indices decrease the classification accuracy for the MLC and SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.

  9. Using support vector machines with tract-based spatial statistics for automated classification of Tourette syndrome children

    Science.gov (United States)

    Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang

    2016-03-01

    Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood yet and TS has a long-term prognosis that is difficult to accurately estimate. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms in order to automate the classification of healthy children and TS children. Here we apply Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age and gender matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with support vector machine (SVM). We use a nested cross validation to yield an unbiased assessment of the classification method and prevent overestimation. The accuracy (88.04%), sensitivity (88.64%) and specificity (87.50%) were achieved in our method as peak performance of the SVM classifier was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results support that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.

  10. Pre-cancer risk assessment in habitual smokers from DIC images of oral exfoliative cells using active contour and SVM analysis.

    Science.gov (United States)

    Dey, Susmita; Sarkar, Ripon; Chatterjee, Kabita; Datta, Pallab; Barui, Ananya; Maity, Santi P

    2017-04-01

    Habitual smokers are known to be at higher risk for developing oral cancer, which is increasing at an alarming rate globally. Conventionally, oral cancer is associated with high mortality rates, although recent reports show the improved survival outcomes by early diagnosis of disease. An effective prediction system which will enable to identify the probability of cancer development amongst the habitual smokers, is thus expected to benefit sizable number of populations. Present work describes a non-invasive, integrated method for early detection of cellular abnormalities based on analysis of different cyto-morphological features of exfoliative oral epithelial cells. Differential interference contrast (DIC) microscopy provides a potential optical tool as this mode provides a pseudo three dimensional (3-D) image with detailed morphological and textural features obtained from noninvasive, label free epithelial cells. For segmentation of DIC images, gradient vector flow snake model active contour process has been adopted. To evaluate cellular abnormalities amongst habitual smokers, the selected morphological and textural features of epithelial cells are compared with the non-smoker (-ve control group) group and clinically diagnosed pre-cancer patients (+ve control group) using support vector machine (SVM) classifier. Accuracy of the developed SVM based classification has been found to be 86% with 80% sensitivity and 89% specificity in classifying the features from the volunteers having smoking habit. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Abnormal Gait Behavior Detection for Elderly Based on Enhanced Wigner-Ville Analysis and Cloud Incremental SVM Learning

    Directory of Open Access Journals (Sweden)

    Jian Luo

    2016-01-01

    Full Text Available A cloud based health care system is proposed in this paper for the elderly by providing abnormal gait behavior detection, classification, online diagnosis, and remote aid service. Intelligent mobile terminals with triaxial acceleration sensor embedded are used to capture the movement and ambulation information of elderly. The collected signals are first enhanced by a Kalman filter. And the magnitude of signal vector features is then extracted and decomposed into a linear combination of enhanced Gabor atoms. The Wigner-Ville analysis method is introduced and the problem is studied by joint time-frequency analysis. In order to solve the large-scale abnormal behavior data lacking problem in training process, a cloud based incremental SVM (CI-SVM learning method is proposed. The original abnormal behavior data are first used to get the initial SVM classifier. And the larger abnormal behavior data of elderly collected by mobile devices are then gathered in cloud platform to conduct incremental training and get the new SVM classifier. By the CI-SVM learning method, the knowledge of SVM classifier could be accumulated due to the dynamic incremental learning. Experimental results demonstrate that the proposed method is feasible and can be applied to aged care, emergency aid, and related fields.

  12. Classification of High-Mountain Vegetation Communities within a Diverse Giant Mountains Ecosystem Using Airborne APEX Hyperspectral Imagery

    Directory of Open Access Journals (Sweden)

    Adriana Marcinkowska-Ochtyra

    2018-04-01

    Full Text Available Mapping plant communities is a difficult and time consuming endeavor. Methods relying on field surveys deliver high quality data but are usually limited to relatively small areas. In this paper we apply airborne hyperspectral data to vegetation mapping in remote and hard to reach areas. We classified 22 vegetation communities in the Giant Mountains on 3.12-m Airborne Prism Experiment (APEX hyperspectral images, registered in 288 spectral bands (10 September 2012. As the classification algorithm, Support Vector Machines (SVM was used. APEX data were corrected geometrically and atmospherically, and three dimensionality reduction methods were performed to select the best dataset. As reference we used a non-forest vegetation map containing vegetation communities of Polish Karkonosze National Park from 2002, orthophotomap and field surveys data from 2013 to 2014. We obtained the post-classification maps of 22 vegetation communities, lakes and areas without any vegetation. Iterative accuracy assessment repeated 100 times was used to obtain the most objective results for individual communities. The median value of overall accuracy (OA was 84%. Fourteen out of twenty-four classes were classified of more than 80% of producer accuracy (PA and sixteen out of twenty-four of user accuracy (UA. APEX data and SVM with the use of iterative accuracy assessment are useful for the mountain communities classification. This can support both Polish and Czech national parks management by giving the information about diversity of communities in the whole transboundary area, helping with identification especially in changing environment caused by humans.

  13. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Mustafa Serter Uzer

    2013-01-01

    Full Text Available This paper offers a hybrid approach that uses the artificial bee colony (ABC algorithm for feature selection and support vector machines for classification. The purpose of this paper is to test the effect of elimination of the unimportant and obsolete features of the datasets on the success of the classification, using the SVM classifier. The developed approach conventionally used in liver diseases and diabetes diagnostics, which are commonly observed and reduce the quality of life, is developed. For the diagnosis of these diseases, hepatitis, liver disorders and diabetes datasets from the UCI database were used, and the proposed system reached a classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. For these datasets, the classification accuracies were obtained by the help of the 10-fold cross-validation method. The results show that the performance of the method is highly successful compared to other results attained and seems very promising for pattern recognition applications.

  14. Incorporation of support vector machines in the LIBS toolbox for sensitive and robust classification amidst unexpected sample and system variability.

    Science.gov (United States)

    Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P; Kumar Gundawar, Manoj

    2012-03-20

    Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a nonlinear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that the application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), because of its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data-highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples, as well as in related areas of forensic and biological sample analysis.

  15. Classification Accuracy of MMPI-2 Validity Scales in the Detection of Pain-Related Malingering: A Known-Groups Study

    Science.gov (United States)

    Bianchini, Kevin J.; Etherton, Joseph L.; Greve, Kevin W.; Heinly, Matthew T.; Meyers, John E.

    2008-01-01

    The purpose of this study was to determine the accuracy of "Minnesota Multiphasic Personality Inventory" 2nd edition (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) validity indicators in the detection of malingering in clinical patients with chronic pain using a hybrid clinical-known groups/simulator design. The…

  16. Can we improve accuracy and reliability of MRI interpretation in children with optic pathway glioma? Proposal for a reproducible imaging classification

    Energy Technology Data Exchange (ETDEWEB)

    Lambron, Julien; Frampas, Eric; Toulgoat, Frederique [University Hospital, Department of Radiology, Nantes (France); Rakotonjanahary, Josue [University Hospital, Department of Pediatric Oncology, Angers (France); University Paris Diderot, INSERM CIE5 Robert Debre Hospital, Assistance Publique-Hopitaux de Paris (AP-HP), Paris (France); Loisel, Didier [University Hospital, Department of Radiology, Angers (France); Carli, Emilie de; Rialland, Xavier [University Hospital, Department of Pediatric Oncology, Angers (France); Delion, Matthieu [University Hospital, Department of Neurosurgery, Angers (France)

    2016-02-15

    Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm{sup 3} (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach. (orig.)

  17. CNN-SVM for Microvascular Morphological Type Recognition with Data Augmentation.

    Science.gov (United States)

    Xue, Di-Xiu; Zhang, Rong; Feng, Hui; Wang, Ya-Lei

    2016-01-01

    This paper focuses on the problem of feature extraction and the classification of microvascular morphological types to aid esophageal cancer detection. We present a patch-based system with a hybrid SVM model with data augmentation for intraepithelial papillary capillary loop recognition. A greedy patch-generating algorithm and a specialized CNN named NBI-Net are designed to extract hierarchical features from patches. We investigate a series of data augmentation techniques to progressively improve the prediction invariance of image scaling and rotation. For classifier boosting, SVM is used as an alternative to softmax to enhance generalization ability. The effectiveness of CNN feature representation ability is discussed for a set of widely used CNN models, including AlexNet, VGG-16, and GoogLeNet. Experiments are conducted on the NBI-ME dataset. The recognition rate is up to 92.74% on the patch level with data augmentation and classifier boosting. The results show that the combined CNN-SVM model beats models of traditional features with SVM as well as the original CNN with softmax. The synthesis results indicate that our system is able to assist clinical diagnosis to a certain extent.

  18. Classification of fibroglandular tissue distribution in the breast based on radiotherapy planning CT

    International Nuclear Information System (INIS)

    Juneja, Prabhjot; Evans, Philip; Windridge, David; Harris, Emma

    2016-01-01

    Accurate segmentation of breast tissues is required for a number of applications such as model based deformable registration in breast radiotherapy. The accuracy of breast tissue segmentation is affected by the spatial distribution (or pattern) of fibroglandular tissue (FT). The goal of this study was to develop and evaluate texture features, determined from planning computed tomography (CT) data, to classify the spatial distribution of FT in the breast. Planning CT data of 23 patients were evaluated in this study. Texture features were derived from the radial glandular fraction (RGF), which described the distribution of FT within three breast regions (posterior, middle, and anterior). Using visual assessment, experts grouped patients according to FT spatial distribution: sparse or non-sparse. Differences in the features between the two groups were investigated using the Wilcoxon rank test. Classification performance of the features was evaluated for a range of support vector machine (SVM) classifiers. Experts found eight patients and 15 patients had sparse and non-sparse spatial distribution of FT, respectively. A large proportion of features (>9 of 13) from the individual breast regions had significant differences (p <0.05) between the sparse and non-sparse group. The features from middle region had most significant differences and gave the highest classification accuracy for all the SVM kernels investigated. Overall, the features from middle breast region achieved highest accuracy (91 %) with the linear SVM kernel. This study found that features based on radial glandular fraction provide a means for discriminating between fibroglandular tissue distributions and could achieve a classification accuracy of 91 %

  19. The efficacy of support vector machines (SVM)

    Indian Academy of Sciences (India)

    (2006) by applying an SVM statistical learning machine on the time-scale wavelet decomposition methods. We used the data of 108 events in central Japan with magnitude ranging from 3 to 7.4 recorded at KiK-net network stations, for a source–receiver distance of up to 150 km during the period 1998–2011. We applied a ...

  20. Study on specificity of colon carcinoma-associated serum markers and establishment of SVM prediction model

    Directory of Open Access Journals (Sweden)

    Lu Li

    2017-03-01

    Full Text Available We aimed to evaluate the specificity of 12 tumor markers related to colon carcinoma and identify the most sensitive index. Logistic regression and Bhattacharyya distance were used to evaluate the index. Then, different index combinations were used to establish a support vector machine (SVM diagnosis model of malignant colon carcinoma. The accuracy of the model was checked. High accuracy was assumed to indicate the high specificity of the index. Through Logistic regression, three indexes, CEA, HSP60 and CA199, were screened out. Using Bhattacharyya distance, four indexes with the largest Bhattacharyya distance were screened out, including CEA, NSE, AFP, and CA724. The specificity of the combination of the above six indexes was higher than that of other combinations, so did the accuracy of the established SVM identification model. Using Logistic regression and Bhattacharyya distance for detection and establishing an SVM model based on different serum marker combinations can increase diagnostic accuracy, providing a theoretical basis for application of mathematical models in cancer diagnosis.

  1. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    Science.gov (United States)

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study, however it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify the lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, which is evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting the lncRNAs, and it is quite useful for current lncRNA study.

  2. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  3. Hyperspectral Image Classification Based on the Combination of Spatial-spectral Feature and Sparse Representation

    Directory of Open Access Journals (Sweden)

    YANG Zhaoxia

    2015-07-01

    Full Text Available In order to avoid the problem of being over-dependent on high-dimensional spectral feature in the traditional hyperspectral image classification, a novel approach based on the combination of spatial-spectral feature and sparse representation is proposed in this paper. Firstly, we extract the spatial-spectral feature by reorganizing the local image patch with the first d principal components(PCs into a vector representation, followed by a sorting scheme to make the vector invariant to local image rotation. Secondly, we learn the dictionary through a supervised method, and use it to code the features from test samples afterwards. Finally, we embed the resulting sparse feature coding into the support vector machine(SVM for hyperspectral image classification. Experiments using three hyperspectral data show that the proposed method can effectively improve the classification accuracy comparing with traditional classification methods.

  4. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    Directory of Open Access Journals (Sweden)

    Qi Wang

    2008-11-01

    Full Text Available Abstract: This paper presents a new vehicle classification and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The basic principle of this approach is based on measuring the dynamic strain caused by vehicles across pavement to obtain the corresponding vehicle parameters – wheelbase and number of axles – to then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. According to the special arrangement of the sensors and the different time a vehicle arrived at the sensors one can estimate the vehicle’s speed accurately, corresponding to the estimated vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Using the machine learning pattern recognition method to deal with this problem is believed as one of the most effective tools. In this study, support vector machines (SVMs were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments will be introduced in this paper. Two support vector machines classification algorithms (one-against-all, one-against-one are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves

  5. Retrospective assessment of interobserver agreement and accuracy in classifications and measurements in subsolid nodules with solid components less than 8mm: which window setting is better?

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, Roh-Eul [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University Medical Research Center, Institute of Radiation Medicine, Seoul (Korea, Republic of); Goo, Jin Mo; Park, Chang Min [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Seoul National University College of Medicine, Cancer Research Institute, Seoul (Korea, Republic of); Hwang, Eui Jin; Yoon, Soon Ho; Lee, Chang Hyun [Seoul National University College of Medicine, Department of Radiology, Seoul (Korea, Republic of); Ahn, Soyeon [Seoul National University Bundang Hospital, Medical Research Collaborating Center, Seongnam-si (Korea, Republic of)

    2017-04-15

    To compare interobserver agreements among multiple readers and accuracy for the assessment of solid components in subsolid nodules between the lung and mediastinal window settings. Seventy-seven surgically resected nodules with solid components smaller than 8 mm were included in this study. In both lung and mediastinal windows, five readers independently assessed the presence and size of solid component. Bootstrapping was used to compare the interobserver agreement between the two window settings. Imaging-pathology correlation was performed to evaluate the accuracy. There were no significant differences in the interobserver agreements between the two windows for both identification (lung windows, k = 0.51; mediastinal windows, k = 0.57) and measurements (lung windows, ICC = 0.70; mediastinal windows, ICC = 0.69) of solid components. The incidence of false negative results for the presence of invasive components and the median absolute difference between the solid component size and the invasive component size were significantly higher on mediastinal windows than on lung windows (P < 0.001 and P < 0.001, respectively). The lung window setting had a comparable reproducibility but a higher accuracy than the mediastinal window setting for nodule classifications and solid component measurements in subsolid nodules. (orig.)

  6. Structural Health Monitoring of Tall Buildings with Numerical Integrator and Convex-Concave Hull Classification

    Directory of Open Access Journals (Sweden)

    Suresh Thenozhi

    2012-01-01

    Full Text Available An important objective of health monitoring systems for tall buildings is to diagnose the state of the building and to evaluate its possible damage. In this paper, we use our prototype to evaluate our data-mining approach for the fault monitoring. The offset cancellation and high-pass filtering techniques are combined effectively to solve common problems in numerical integration of acceleration signals in real-time applications. The integration accuracy is improved compared with other numerical integrators. Then we introduce a novel method for support vector machine (SVM classification, called convex-concave hull. We use the Jarvis march method to decide the concave (nonconvex hull for the inseparable points. Finally the vertices of the convex-concave hull are applied for SVM training.

  7. Modulation transfer function (MTF) measurement method based on support vector machine (SVM)

    Science.gov (United States)

    Zhang, Zheng; Chen, Yueting; Feng, Huajun; Xu, Zhihai; Li, Qi

    2016-03-01

    An imaging system's spatial quality can be expressed by the system's modulation spread function (MTF) as a function of spatial frequency in terms of the linear response theory. Methods have been proposed to assess the MTF of an imaging system using point, slit or edge techniques. The edge method is widely used for the low requirement of targets. However, the traditional edge methods are limited by the edge angle. Besides, image noise will impair the measurement accuracy, making the measurement result unstable. In this paper, a novel measurement method based on the support vector machine (SVM) is proposed. Image patches with different edge angles and MTF levels are generated as the training set. Parameters related with MTF and image structure are extracted from the edge images. Trained with image parameters and the corresponding MTF, the SVM classifier can assess the MTF of any edge image. The result shows that the proposed method has an excellent performance on measuring accuracy and stability.

  8. CLASSIFICATION AND RECOGNITION OF TOMB INFORMATION IN HYPERSPECTRAL IMAGE

    Directory of Open Access Journals (Sweden)

    M. Gu

    2018-04-01

    Full Text Available There are a large number of materials with important historical information in ancient tombs. However, in many cases, these substances could become obscure and indistinguishable by human naked eye or true colour camera. In order to classify and identify materials in ancient tomb effectively, this paper applied hyperspectral imaging technology to archaeological research of ancient tomb in Shanxi province. Firstly, the feature bands including the main information at the bottom of the ancient tomb are selected by the Principal Component Analysis (PCA transformation to realize the data dimension. Then, the image classification was performed using Support Vector Machine (SVM based on feature bands. Finally, the material at the bottom of ancient tomb is identified by spectral analysis and spectral matching. The results show that SVM based on feature bands can not only ensure the classification accuracy, but also shorten the data processing time and improve the classification efficiency. In the material identification, it is found that the same matter identified in the visible light is actually two different substances. This research result provides a new reference and research idea for archaeological work.

  9. Machine Learning Classification of Buildings for Map Generalization

    Directory of Open Access Journals (Sweden)

    Jaeeun Lee

    2017-10-01

    Full Text Available A critical problem in mapping data is the frequent updating of large data sets. To solve this problem, the updating of small-scale data based on large-scale data is very effective. Various map generalization techniques, such as simplification, displacement, typification, elimination, and aggregation, must therefore be applied. In this study, we focused on the elimination and aggregation of the building layer, for which each building in a large scale was classified as “0-eliminated,” “1-retained,” or “2-aggregated.” Machine-learning classification algorithms were then used for classifying the buildings. The data of 1:1000 scale and 1:25,000 scale digital maps obtained from the National Geographic Information Institute were used. We applied to these data various machine-learning classification algorithms, including naive Bayes (NB, decision tree (DT, k-nearest neighbor (k-NN, and support vector machine (SVM. The overall accuracies of each algorithm were satisfactory: DT, 88.96%; k-NN, 88.27%; SVM, 87.57%; and NB, 79.50%. Although elimination is a direct part of the proposed process, generalization operations, such as simplification and aggregation of polygons, must still be performed for buildings classified as retained and aggregated. Thus, these algorithms can be used for building classification and can serve as preparatory steps for building generalization.

  10. CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

    Directory of Open Access Journals (Sweden)

    V. A. Ayma

    2015-03-01

    Full Text Available Since many years ago, the scientific community is concerned about how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP, which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA’s machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM. The results of an experimental analysis using a SVM classifier on data sets of different sizes for different cluster configurations demonstrates the potential of the tool, as well as aspects that affect its performance.

  11. Support Vector Machine Classification of Drunk Driving Behaviour.

    Science.gov (United States)

    Chen, Huiqin; Chen, Lei

    2017-01-23

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R-R intervals (SDNN), the root mean square value of the difference of the adjacent R-R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  12. Support Vector Machine Classification of Drunk Driving Behaviour

    Directory of Open Access Journals (Sweden)

    Huiqin Chen

    2017-01-01

    Full Text Available Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN, the root mean square value of the difference of the adjacent R–R interval series (RMSSD, low frequency (LF, high frequency (HF, the ratio of the low and high frequencies (LF/HF, and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  13. Classification and Recognition of Tomb Information in Hyperspectral Image

    Science.gov (United States)

    Gu, M.; Lyu, S.; Hou, M.; Ma, S.; Gao, Z.; Bai, S.; Zhou, P.

    2018-04-01

    There are a large number of materials with important historical information in ancient tombs. However, in many cases, these substances could become obscure and indistinguishable by human naked eye or true colour camera. In order to classify and identify materials in ancient tomb effectively, this paper applied hyperspectral imaging technology to archaeological research of ancient tomb in Shanxi province. Firstly, the feature bands including the main information at the bottom of the ancient tomb are selected by the Principal Component Analysis (PCA) transformation to realize the data dimension. Then, the image classification was performed using Support Vector Machine (SVM) based on feature bands. Finally, the material at the bottom of ancient tomb is identified by spectral analysis and spectral matching. The results show that SVM based on feature bands can not only ensure the classification accuracy, but also shorten the data processing time and improve the classification efficiency. In the material identification, it is found that the same matter identified in the visible light is actually two different substances. This research result provides a new reference and research idea for archaeological work.

  14. A method of distributed avionics data processing based on SVM classifier

    Science.gov (United States)

    Guo, Hangyu; Wang, Jinyan; Kang, Minyang; Xu, Guojing

    2018-03-01

    Under the environment of system combat, in order to solve the problem on management and analysis of the massive heterogeneous data on multi-platform avionics system, this paper proposes a management solution which called avionics "resource cloud" based on big data technology, and designs an aided decision classifier based on SVM algorithm. We design an experiment with STK simulation, the result shows that this method has a high accuracy and a broad application prospect.

  15. A Method to Integrate GMM, SVM and DTW for Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2014-01-01

    Full Text Available This paper develops an effective and efficient scheme to integrate Gaussian mixture model (GMM, support vector machine (SVM, and dynamic time wrapping (DTW for automatic speaker recognition. GMM and SVM are two popular classifiers for speaker recognition applications. DTW is a fast and simple template matching method, and it is frequently seen in applications of speech recognition. In this work, DTW does not play a role to perform speech recognition, and it will be employed to be a verifier for verification of valid speakers. The proposed combination scheme of GMM, SVM and DTW, called SVMGMM-DTW, for speaker recognition in this study is a two-phase verification process task including GMM-SVM verification of the first phase and DTW verification of the second phase. By providing a double check to verify the identity of a speaker, it will be difficult for imposters to try to pass the security protection; therefore, the safety degree of speaker recognition systems will be largely increased. A series of experiments designed on door access control applications demonstrated that the superiority of the developed SVMGMM-DTW on speaker recognition accuracy.

  16. An up-to-date comparison of state-of-the-art classification algorithms

    KAUST Repository

    Zhang, Chongsheng

    2017-04-05

    Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

  17. An up-to-date comparison of state-of-the-art classification algorithms

    KAUST Repository

    Zhang, Chongsheng; Liu, Changchang; Zhang, Xiangliang; Almpanidis, George

    2017-01-01

    Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

  18. Age and gender estimation using Region-SIFT and multi-layered SVM

    Science.gov (United States)

    Kim, Hyunduk; Lee, Sang-Heon; Sohn, Myoung-Kyu; Hwang, Byunghun

    2018-04-01

    In this paper, we propose an age and gender estimation framework using the region-SIFT feature and multi-layered SVM classifier. The suggested framework entails three processes. The first step is landmark based face alignment. The second step is the feature extraction step. In this step, we introduce the region-SIFT feature extraction method based on facial landmarks. First, we define sub-regions of the face. We then extract SIFT features from each sub-region. In order to reduce the dimensions of features we employ a Principal Component Analysis (PCA) and a Linear Discriminant Analysis (LDA). Finally, we classify age and gender using a multi-layered Support Vector Machines (SVM) for efficient classification. Rather than performing gender estimation and age estimation independently, the use of the multi-layered SVM can improve the classification rate by constructing a classifier that estimate the age according to gender. Moreover, we collect a dataset of face images, called by DGIST_C, from the internet. A performance evaluation of proposed method was performed with the FERET database, CACD database, and DGIST_C database. The experimental results demonstrate that the proposed approach classifies age and performs gender estimation very efficiently and accurately.

  19. Evaluation of feature selection algorithms for classification in temporal lobe epilepsy based on MR images

    Science.gov (United States)

    Lai, Chunren; Guo, Shengwen; Cheng, Lina; Wang, Wensheng; Wu, Kai

    2017-02-01

    It's very important to differentiate the temporal lobe epilepsy (TLE) patients from healthy people and localize the abnormal brain regions of the TLE patients. The cortical features and changes can reveal the unique anatomical patterns of brain regions from the structural MR images. In this study, structural MR images from 28 normal controls (NC), 18 left TLE (LTLE), and 21 right TLE (RTLE) were acquired, and four types of cortical feature, namely cortical thickness (CTh), cortical surface area (CSA), gray matter volume (GMV), and mean curvature (MCu), were explored for discriminative analysis. Three feature selection methods, the independent sample t-test filtering, the sparse-constrained dimensionality reduction model (SCDRM), and the support vector machine-recursive feature elimination (SVM-RFE), were investigated to extract dominant regions with significant differences among the compared groups for classification using the SVM classifier. The results showed that the SVM-REF achieved the highest performance (most classifications with more than 92% accuracy), followed by the SCDRM, and the t-test. Especially, the surface area and gray volume matter exhibited prominent discriminative ability, and the performance of the SVM was improved significantly when the four cortical features were combined. Additionally, the dominant regions with higher classification weights were mainly located in temporal and frontal lobe, including the inferior temporal, entorhinal cortex, fusiform, parahippocampal cortex, middle frontal and frontal pole. It was demonstrated that the cortical features provided effective information to determine the abnormal anatomical pattern and the proposed method has the potential to improve the clinical diagnosis of the TLE.

  20. Radar target classification method with high accuracy and decision speed performance using MUSIC spectrum vectors and PCA projection

    Science.gov (United States)

    Secmen, Mustafa

    2011-10-01

    This paper introduces the performance of an electromagnetic target recognition method in resonance scattering region, which includes pseudo spectrum Multiple Signal Classification (MUSIC) algorithm and principal component analysis (PCA) technique. The aim of this method is to classify an "unknown" target as one of the "known" targets in an aspect-independent manner. The suggested method initially collects the late-time portion of noise-free time-scattered signals obtained from different reference aspect angles of known targets. Afterward, these signals are used to obtain MUSIC spectrums in real frequency domain having super-resolution ability and noise resistant feature. In the final step, PCA technique is applied to these spectrums in order to reduce dimensionality and obtain only one feature vector per known target. In the decision stage, noise-free or noisy scattered signal of an unknown (test) target from an unknown aspect angle is initially obtained. Subsequently, MUSIC algorithm is processed for this test signal and resulting test vector is compared with feature vectors of known targets one by one. Finally, the highest correlation gives the type of test target. The method is applied to wire models of airplane targets, and it is shown that it can tolerate considerable noise levels although it has a few different reference aspect angles. Besides, the runtime of the method for a test target is sufficiently low, which makes the method suitable for real-time applications.

  1. Comparing SVM and ANN based Machine Learning Methods for Species Identification of Food Contaminating Beetles.

    Science.gov (United States)

    Bisgin, Halil; Bera, Tanmay; Ding, Hongjian; Semey, Howard G; Wu, Leihong; Liu, Zhichao; Barnes, Amy E; Langley, Darryl A; Pava-Ripoll, Monica; Vyas, Himansu J; Tong, Weida; Xu, Joshua

    2018-04-25

    Insect pests, such as pantry beetles, are often associated with food contaminations and public health risks. Machine learning has the potential to provide a more accurate and efficient solution in detecting their presence in food products, which is currently done manually. In our previous research, we demonstrated such feasibility where Artificial Neural Network (ANN) based pattern recognition techniques could be implemented for species identification in the context of food safety. In this study, we present a Support Vector Machine (SVM) model which improved the average accuracy up to 85%. Contrary to this, the ANN method yielded ~80% accuracy after extensive parameter optimization. Both methods showed excellent genus level identification, but SVM showed slightly better accuracy  for most species. Highly accurate species level identification remains a challenge, especially in distinguishing between species from the same genus which may require improvements in both imaging and machine learning techniques. In summary, our work does illustrate a new SVM based technique and provides a good comparison with the ANN model in our context. We believe such insights will pave better way forward for the application of machine learning towards species identification and food safety.

  2. Use of the Diabetes Prevention Trial-Type 1 Risk Score (DPTRS) for improving the accuracy of the risk classification of type 1 diabetes.

    Science.gov (United States)

    Sosenko, Jay M; Skyler, Jay S; Mahon, Jeffrey; Krischer, Jeffrey P; Greenbaum, Carla J; Rafkin, Lisa E; Beam, Craig A; Boulware, David C; Matheson, Della; Cuthbertson, David; Herold, Kevan C; Eisenbarth, George; Palmer, Jerry P

    2014-04-01

    OBJECTIVE We studied the utility of the Diabetes Prevention Trial-Type 1 Risk Score (DPTRS) for improving the accuracy of type 1 diabetes (T1D) risk classification in TrialNet Natural History Study (TNNHS) participants. RESEARCH DESIGN AND METHODS The cumulative incidence of T1D was compared between normoglycemic individuals with DPTRS values >7.00 and dysglycemic individuals in the TNNHS (n = 991). It was also compared between individuals with DPTRS values 7.00 among those with dysglycemia and those with multiple autoantibodies in the TNNHS. DPTRS values >7.00 were compared with dysglycemia for characterizing risk in Diabetes Prevention Trial-Type 1 (DPT-1) (n = 670) and TNNHS participants. The reliability of DPTRS values >7.00 was compared with dysglycemia in the TNNHS. RESULTS The cumulative incidence of T1D for normoglycemic TNNHS participants with DPTRS values >7.00 was comparable to those with dysglycemia. Among those with dysglycemia, the cumulative incidence was much higher (P 7.00 than for those with values 7.00). Dysglycemic individuals in DPT-1 were at much higher risk for T1D than those with dysglycemia in the TNNHS (P 7.00. The proportion in the TNNHS reverting from dysglycemia to normoglycemia at the next visit was higher than the proportion reverting from DPTRS values >7.00 to values <7.00 (36 vs. 23%). CONCLUSIONS DPTRS thresholds can improve T1D risk classification accuracy by identifying high-risk normoglycemic and low-risk dysglycemic individuals. The 7.00 DPTRS threshold characterizes risk more consistently between populations and has greater reliability than dysglycemia.

  3. Electrode replacement does not affect classification accuracy in dual-session use of a passive brain-computer interface for assessing cognitive workload

    Directory of Open Access Journals (Sweden)

    Justin Ronald Estepp

    2015-03-01

    Full Text Available The passive brain-computer interface (pBCI framework has been shown to be a very promising construct for assessing cognitive and affective state in both individuals and teams. There is a growing body of work that focuses on solving the challenges of transitioning pBCI systems from the research laboratory environment to practical, everyday use. An interesting issue is what impact methodological variability may have on the ability to reliably identify (neurophysiological patterns that are useful for state assessment. This work aimed at quantifying the effects of methodological variability in a pBCI design for detecting changes in cognitive workload. Specific focus was directed toward the effects of replacing electrodes over dual sessions (thus inducing changes in placement, electromechanical properties, and/or impedance between the electrode and skin surface on the accuracy of several machine learning approaches in a binary classification problem. In investigating these methodological variables, it was determined that the removal and replacement of the electrode suite between sessions does not impact the accuracy of a number of learning approaches when trained on one session and tested on a second. This finding was confirmed by comparing to a control group for which the electrode suite was not replaced between sessions. This result suggests that sensors (both neurological and peripheral may be removed and replaced over the course of many interactions with a pBCI system without affecting its performance. Future work on multi-session and multi-day pBCI system use should seek to replicate this (lack of effect between sessions in other tasks, temporal time courses, and data analytic approaches while also focusing on non-stationarity and variable classification performance due to intrinsic factors.

  4. Electrode replacement does not affect classification accuracy in dual-session use of a passive brain-computer interface for assessing cognitive workload.

    Science.gov (United States)

    Estepp, Justin R; Christensen, James C

    2015-01-01

    The passive brain-computer interface (pBCI) framework has been shown to be a very promising construct for assessing cognitive and affective state in both individuals and teams. There is a growing body of work that focuses on solving the challenges of transitioning pBCI systems from the research laboratory environment to practical, everyday use. An interesting issue is what impact methodological variability may have on the ability to reliably identify (neuro)physiological patterns that are useful for state assessment. This work aimed at quantifying the effects of methodological variability in a pBCI design for detecting changes in cognitive workload. Specific focus was directed toward the effects of replacing electrodes over dual sessions (thus inducing changes in placement, electromechanical properties, and/or impedance between the electrode and skin surface) on the accuracy of several machine learning approaches in a binary classification problem. In investigating these methodological variables, it was determined that the removal and replacement of the electrode suite between sessions does not impact the accuracy of a number of learning approaches when trained on one session and tested on a second. This finding was confirmed by comparing to a control group for which the electrode suite was not replaced between sessions. This result suggests that sensors (both neurological and peripheral) may be removed and replaced over the course of many interactions with a pBCI system without affecting its performance. Future work on multi-session and multi-day pBCI system use should seek to replicate this (lack of) effect between sessions in other tasks, temporal time courses, and data analytic approaches while also focusing on non-stationarity and variable classification performance due to intrinsic factors.

  5. SPECTRAL RECONSTRUCTION BASED ON SVM FOR CROSS CALIBRATION

    Directory of Open Access Journals (Sweden)

    H. Gao

    2017-05-01

    Full Text Available Chinese HY-1C/1D satellites will use a 5nm/10nm-resolutional visible-near infrared(VNIR hyperspectral sensor with the solar calibrator to cross-calibrate with other sensors. The hyperspectral radiance data are composed of average radiance in the sensor’s passbands and bear a spectral smoothing effect, a transform from the hyperspectral radiance data to the 1-nm-resolution apparent spectral radiance by spectral reconstruction need to be implemented. In order to solve the problem of noise cumulation and deterioration after several times of iteration by the iterative algorithm, a novel regression method based on SVM is proposed, which can approach arbitrary complex non-linear relationship closely and provide with better generalization capability by learning. In the opinion of system, the relationship between the apparent radiance and equivalent radiance is nonlinear mapping introduced by spectral response function(SRF, SVM transform the low-dimensional non-linear question into high-dimensional linear question though kernel function, obtaining global optimal solution by virtue of quadratic form. The experiment is performed using 6S-simulated spectrums considering the SRF and SNR of the hyperspectral sensor, measured reflectance spectrums of water body and different atmosphere conditions. The contrastive result shows: firstly, the proposed method is with more reconstructed accuracy especially to the high-frequency signal; secondly, while the spectral resolution of the hyperspectral sensor reduces, the proposed method performs better than the iterative method; finally, the root mean square relative error(RMSRE which is used to evaluate the difference of the reconstructed spectrum and the real spectrum over the whole spectral range is calculated, it decreses by one time at least by proposed method.

  6. Uniform design based SVM model selection for face recognition

    Science.gov (United States)

    Li, Weihong; Liu, Lijuan; Gong, Weiguo

    2010-02-01

    Support vector machine (SVM) has been proved to be a powerful tool for face recognition. The generalization capacity of SVM depends on the model with optimal hyperparameters. The computational cost of SVM model selection results in application difficulty in face recognition. In order to overcome the shortcoming, we utilize the advantage of uniform design--space filling designs and uniformly scattering theory to seek for optimal SVM hyperparameters. Then we propose a face recognition scheme based on SVM with optimal model which obtained by replacing the grid and gradient-based method with uniform design. The experimental results on Yale and PIE face databases show that the proposed method significantly improves the efficiency of SVM model selection.

  7. Comparing writing style feature-based classification methods for estimating user reputations in social media.

    Science.gov (United States)

    Suh, Jong Hwan

    2016-01-01

    In recent years, the anonymous nature of the Internet has made it difficult to detect manipulated user reputations in social media, as well as to ensure the qualities of users and their posts. To deal with this, this study designs and examines an automatic approach that adopts writing style features to estimate user reputations in social media. Under varying ways of defining Good and Bad classes of user reputations based on the collected data, it evaluates the classification performance of the state-of-art methods: four writing style features, i.e. lexical, syntactic, structural, and content-specific, and eight classification techniques, i.e. four base learners-C4.5, Neural Network (NN), Support Vector Machine (SVM), and Naïve Bayes (NB)-and four Random Subspace (RS) ensemble methods based on the four base learners. When South Korea's Web forum, Daum Agora, was selected as a test bed, the experimental results show that the configuration of the full feature set containing content-specific features and RS-SVM combining RS and SVM gives the best accuracy for classification if the test bed poster reputations are segmented strictly into Good and Bad classes by portfolio approach. Pairwise t tests on accuracy confirm two expectations coming from the literature reviews: first, the feature set adding content-specific features outperform the others; second, ensemble learning methods are more viable than base learners. Moreover, among the four ways on defining the classes of user reputations, i.e. like, dislike, sum, and portfolio, the results show that the portfolio approach gives the highest accuracy.

  8. Computer-aided diagnosis of lung cancer: the effect of training data sets on classification accuracy of lung nodules

    Science.gov (United States)

    Gong, Jing; Liu, Ji-Yu; Sun, Xi-Wen; Zheng, Bin; Nie, Sheng-Dong

    2018-02-01

    This study aims to develop a computer-aided diagnosis (CADx) scheme for classification between malignant and benign lung nodules, and also assess whether CADx performance changes in detecting nodules associated with early and advanced stage lung cancer. The study involves 243 biopsy-confirmed pulmonary nodules. Among them, 76 are benign, 81 are stage I and 86 are stage III malignant nodules. The cases are separated into three data sets involving: (1) all nodules, (2) benign and stage I malignant nodules, and (3) benign and stage III malignant nodules. A CADx scheme is applied to segment lung nodules depicted on computed tomography images and we initially computed 66 3D image features. Then, three machine learning models namely, a support vector machine, naïve Bayes classifier and linear discriminant analysis, are separately trained and tested by using three data sets and a leave-one-case-out cross-validation method embedded with a Relief-F feature selection algorithm. When separately using three data sets to train and test three classifiers, the average areas under receiver operating characteristic curves (AUC) are 0.94, 0.90 and 0.99, respectively. When using the classifiers trained using data sets with all nodules, average AUC values are 0.88 and 0.99 for detecting early and advanced stage nodules, respectively. AUC values computed from three classifiers trained using the same data set are consistent without statistically significant difference (p  >  0.05). This study demonstrates (1) the feasibility of applying a CADx scheme to accurately distinguish between benign and malignant lung nodules, and (2) a positive trend between CADx performance and cancer progression stage. Thus, in order to increase CADx performance in detecting subtle and early cancer, training data sets should include more diverse early stage cancer cases.

  9. A Statistical Parameter Analysis and SVM Based Fault Diagnosis Strategy for Dynamically Tuned Gyroscopes

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Gyro's fault diagnosis plays a critical role in inertia navigation systems for higher reliability and precision. A new fault diagnosis strategy based on the statistical parameter analysis (SPA) and support vector machine(SVM) classification model was proposed for dynamically tuned gyroscopes (DTG). The SPA, a kind of time domain analysis approach, was introduced to compute a set of statistical parameters of vibration signal as the state features of DTG, with which the SVM model, a novel learning machine based on statistical learning theory (SLT), was applied and constructed to train and identify the working state of DTG. The experimental results verify that the proposed diagnostic strategy can simply and effectively extract the state features of DTG, and it outperforms the radial-basis function (RBF) neural network based diagnostic method and can more reliably and accurately diagnose the working state of DTG.

  10. Intelligent Recognition of Lung Nodule Combining Rule-based and C-SVM Classifiers

    Directory of Open Access Journals (Sweden)

    Bin Li

    2011-10-01

    Full Text Available Computer-aided detection(CAD system for lung nodules plays the important role in the diagnosis of lung cancer. In this paper, an improved intelligent recognition method of lung nodule in HRCT combing rule-based and costsensitive support vector machine(C-SVM classifiers is proposed for detecting both solid nodules and ground-glass opacity(GGO nodules(part solid and nonsolid. This method consists of several steps. Firstly, segmentation of regions of interest(ROIs, including pulmonary parenchyma and lung nodule candidates, is a difficult task. On one side, the presence of noise lowers the visibility of low-contrast objects. On the other side, different types of nodules, including small nodules, nodules connecting to vasculature or other structures, part-solid or nonsolid nodules, are complex, noisy, weak edge or difficult to define the boundary. In order to overcome the difficulties of obvious boundary-leak and slow evolvement speed problem in segmentatioin of weak edge, an overall segmentation method is proposed, they are: the lung parenchyma is extracted based on threshold and morphologic segmentation method; the image denoising and enhancing is realized by nonlinear anisotropic diffusion filtering(NADF method;candidate pulmonary nodules are segmented by the improved C-V level set method, in which the segmentation result of EM-based fuzzy threshold method is used as the initial contour of active contour model and a constrained energy term is added into the PDE of level set function. Then, lung nodules are classified by using the intelligent classifiers combining rules and C-SVM. Rule-based classification is first used to remove easily dismissible nonnodule objects, then C-SVM classification are used to further classify nodule candidates and reduce the number of false positive(FP objects. In order to increase the efficiency of SVM, an improved training method is used to train SVM, which uses the grid search method to search the optimal parameters

  11. Intelligent Recognition of Lung Nodule Combining Rule-based and C-SVM Classifiers

    Directory of Open Access Journals (Sweden)

    Bin Li

    2012-02-01

    Full Text Available Computer-aided detection(CAD system for lung nodules plays the important role in the diagnosis of lung cancer. In this paper, an improved intelligent recognition method of lung nodule in HRCT combing rule-based and cost-sensitive support vector machine(C-SVM classifiers is proposed for detecting both solid nodules and ground-glass opacity(GGO nodules(part solid and nonsolid. This method consists of several steps. Firstly, segmentation of regions of interest(ROIs, including pulmonary parenchyma and lung nodule candidates, is a difficult task. On one side, the presence of noise lowers the visibility of low-contrast objects. On the other side, different types of nodules, including small nodules, nodules connecting to vasculature or other structures, part-solid or nonsolid nodules, are complex, noisy, weak edge or difficult to define the boundary. In order to overcome the difficulties of obvious boundary-leak and slow evolvement speed problem in segmentatioin of weak edge, an overall segmentation method is proposed, they are: the lung parenchyma is extracted based on threshold and morphologic segmentation method; the image denoising and enhancing is realized by nonlinear anisotropic diffusion filtering(NADF method; candidate pulmonary nodules are segmented by the improved C-V level set method, in which the segmentation result of EM-based fuzzy threshold method is used as the initial contour of active contour model and a constrained energy term is added into the PDE of level set function. Then, lung nodules are classified by using the intelligent classifiers combining rules and C-SVM. Rule-based classification is first used to remove easily dismissible nonnodule objects, then C-SVM classification are used to further classify nodule candidates and reduce the number of false positive(FP objects. In order to increase the efficiency of SVM, an improved training method is used to train SVM, which uses the grid search method to search the optimal

  12. Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading.

    Science.gov (United States)

    Sahran, Shahnorbanun; Albashish, Dheeb; Abdullah, Azizi; Shukor, Nordashima Abd; Hayati Md Pauzi, Suria

    2018-04-18

    Feature selection (FS) methods are widely used in grading and diagnosing prostate histopathological images. In this context, FS is based on the texture features obtained from the lumen, nuclei, cytoplasm and stroma, all of which are important tissue components. However, it is difficult to represent the high-dimensional textures of these tissue components. To solve this problem, we propose a new FS method that enables the selection of features with minimal redundancy in the tissue components. We categorise tissue images based on the texture of individual tissue components via the construction of a single classifier and also construct an ensemble learning model by merging the values obtained by each classifier. Another issue that arises is overfitting due to the high-dimensional texture of individual tissue components. We propose a new FS method, SVM-RFE(AC), that integrates a Support Vector Machine-Recursive Feature Elimination (SVM-RFE) embedded procedure with an absolute cosine (AC) filter method to prevent redundancy in the selected features of the SV-RFE and an unoptimised classifier in the AC. We conducted experiments on H&E histopathological prostate and colon cancer images with respect to three prostate classifications, namely benign vs. grade 3, benign vs. grade 4 and grade 3 vs. grade 4. The colon benchmark dataset requires a distinction between grades 1 and 2, which are the most difficult cases to distinguish in the colon domain. The results obtained by both the single and ensemble classification models (which uses the product rule as its merging method) confirm that the proposed SVM-RFE(AC) is superior to the other SVM and SVM-RFE-based methods. We developed an FS method based on SVM-RFE and AC and successfully showed that its use enabled the identification of the most crucial texture feature of each tissue component. Thus, it makes possible the distinction between multiple Gleason grades (e.g. grade 3 vs. grade 4) and its performance is far superior to

  13. Mapping of High Value Crops Through AN Object-Based Svm Model Using LIDAR Data and Orthophoto in Agusan del Norte Philippines

    Science.gov (United States)

    Candare, Rudolph Joshua; Japitana, Michelle; Cubillas, James Earl; Ramirez, Cherry Bryan

    2016-06-01

    This research describes the methods involved in the mapping of different high value crops in Agusan del Norte Philippines using LiDAR. This project is part of the Phil-LiDAR 2 Program which aims to conduct a nationwide resource assessment using LiDAR. Because of the high resolution data involved, the methodology described here utilizes object-based image analysis and the use of optimal features from LiDAR data and Orthophoto. Object-based classification was primarily done by developing rule-sets in eCognition. Several features from the LiDAR data and Orthophotos were used in the development of rule-sets for classification. Generally, classes of objects can't be separated by simple thresholds from different features making it difficult to develop a rule-set. To resolve this problem, the image-objects were subjected to Support Vector Machine learning. SVMs have gained popularity because of their ability to generalize well given a limited number of training samples. However, SVMs also suffer from parameter assignment issues that can significantly affect the classification results. More specifically, the regularization parameter C in linear SVM has to be optimized through cross validation to increase the overall accuracy. After performing the segmentation in eCognition, the optimization procedure as well as the extraction of the equations of the hyper-planes was done in Matlab. The learned hyper-planes separating one class from another in the multi-dimensional feature-space can be thought of as super-features which were then used in developing the classifier rule set in eCognition. In this study, we report an overall classification accuracy of greater than 90% in different areas.

  14. Hybrid PSO–SVM-based method for forecasting of the remaining useful life for aircraft engines and evaluation of its reliability

    International Nuclear Information System (INIS)

    García Nieto, P.J.; García-Gonzalo, E.; Sánchez Lasheras, F.; Cos Juez, F.J. de

    2015-01-01

    The present paper describes a hybrid PSO–SVM-based model for the prediction of the remaining useful life of aircraft engines. The proposed hybrid model combines support vector machines (SVMs), which have been successfully adopted for regression problems, with the particle swarm optimization (PSO) technique. This optimization technique involves kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not been yet widely explored. Bearing this in mind, remaining useful life values have been predicted here by using the hybrid PSO–SVM-based model from the remaining measured parameters (input variables) for aircraft engines with success. A coefficient of determination equal to 0.9034 was obtained when this hybrid PSO–RBF–SVM-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. One of the main advantages of this predictive model is that it does not require information about the previous operation states of the engine. Finally, the main conclusions of this study are exposed. - Highlights: • A hybrid PSO–SVM-based model is built as a predictive model of the RUL values for aircraft engines. • The remaining physical–chemical variables in this process are studied in depth. • The obtained regression accuracy of our method is about 95%. • The results show that PSO–SVM-based model can assist in the diagnosis of the RUL values with accuracy

  15. Cerebral 18F-FDG PET in macrophagic myofasciitis: An individual SVM-based approach.

    Science.gov (United States)

    Blanc-Durand, Paul; Van Der Gucht, Axel; Guedj, Eric; Abulizi, Mukedaisi; Aoun-Sebaiti, Mehdi; Lerman, Lionel; Verger, Antoine; Authier, François-Jérôme; Itti, Emmanuel

    2017-01-01

    Macrophagic myofasciitis (MMF) is an emerging condition with highly specific myopathological alterations. A peculiar spatial pattern of a cerebral glucose hypometabolism involving occipito-temporal cortex and cerebellum have been reported in patients with MMF; however, the full pattern is not systematically present in routine interpretation of scans, and with varying degrees of severity depending on the cognitive profile of patients. Aim was to generate and evaluate a support vector machine (SVM) procedure to classify patients between healthy or MMF 18F-FDG brain profiles. 18F-FDG PET brain images of 119 patients with MMF and 64 healthy subjects were retrospectively analyzed. The whole-population was divided into two groups; a training set (100 MMF, 44 healthy subjects) and a testing set (19 MMF, 20 healthy subjects). Dimensionality reduction was performed using a t-map from statistical parametric mapping (SPM) and a SVM with a linear kernel was trained on the training set. To evaluate the performance of the SVM classifier, values of sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV) and accuracy (Acc) were calculated. The SPM12 analysis on the training set exhibited the already reported hypometabolism pattern involving occipito-temporal and fronto-parietal cortices, limbic system and cerebellum. The SVM procedure, based on the t-test mask generated from the training set, correctly classified MMF patients of the testing set with following Se, Sp, PPV, NPV and Acc: 89%, 85%, 85%, 89%, and 87%. We developed an original and individual approach including a SVM to classify patients between healthy or MMF metabolic brain profiles using 18F-FDG-PET. Machine learning algorithms are promising for computer-aided diagnosis but will need further validation in prospective cohorts.

  16. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    Directory of Open Access Journals (Sweden)

    Santana Isabel

    2011-08-01

    Full Text Available Abstract Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI, but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing.

  17. Statistical analysis of textural features for improved classification of oral histopathological images.

    Science.gov (United States)

    Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K

    2012-04-01

    The objective of this paper is to provide an improved technique, which can assist oncopathologists in correct screening of oral precancerous conditions specially oral submucous fibrosis (OSF) with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibres segmentation, its textural feature extraction and selection, screening perfomance enhancement under Gaussian transformation and finally classification. In this study, collagen fibres are segmented on R,G,B color channels using back-probagation neural network from 60 normal and 59 OSF histological images followed by histogram specification for reducing the stain intensity variation. Henceforth, textural features of collgen area are extracted using fractal approaches viz., differential box counting and brownian motion curve . Feature selection is done using Kullback-Leibler (KL) divergence criterion and the screening performance is evaluated based on various statistical tests to conform Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers viz., Bayesian classification and support vector machines (SVM) to classify normal and OSF. It is observed that SVM with linear kernel function provides better classification accuracy (91.64%) as compared to Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves Bayesian classifier's performance from 80.69% to 90.75%. Results are here studied and discussed.

  18. Frequency Optimization for Enhancement of Surface Defect Classification Using the Eddy Current Technique

    Science.gov (United States)

    Fan, Mengbao; Wang, Qi; Cao, Binghua; Ye, Bo; Sunny, Ali Imam; Tian, Guiyun

    2016-01-01

    Eddy current testing is quite a popular non-contact and cost-effective method for nondestructive evaluation of product quality and structural integrity. Excitation frequency is one of the key performance factors for defect characterization. In the literature, there are many interesting papers dealing with wide spectral content and optimal frequency in terms of detection sensitivity. However, research activity on frequency optimization with respect to characterization performances is lacking. In this paper, an investigation into optimum excitation frequency has been conducted to enhance surface defect classification performance. The influences of excitation frequency for a group of defects were revealed in terms of detection sensitivity, contrast between defect features, and classification accuracy using kernel principal component analysis (KPCA) and a support vector machine (SVM). It is observed that probe signals are the most sensitive on the whole for a group of defects when excitation frequency is set near the frequency at which maximum probe signals are retrieved for the largest defect. After the use of KPCA, the margins between the defect features are optimum from the perspective of the SVM, which adopts optimal hyperplanes for structure risk minimization. As a result, the best classification accuracy is obtained. The main contribution is that the influences of excitation frequency on defect characterization are interpreted, and experiment-based procedures are proposed to determine the optimal excitation frequency for a group of defects rather than a single defect with respect to optimal characterization performances. PMID:27164112

  19. [Object-oriented segmentation and classification of forest gap based on QuickBird remote sensing image.

    Science.gov (United States)

    Mao, Xue Gang; Du, Zi Han; Liu, Jia Qian; Chen, Shu Xin; Hou, Ji Yu

    2018-01-01

    Traditional field investigation and artificial interpretation could not satisfy the need of forest gaps extraction at regional scale. High spatial resolution remote sensing image provides the possibility for regional forest gaps extraction. In this study, we used object-oriented classification method to segment and classify forest gaps based on QuickBird high resolution optical remote sensing image in Jiangle National Forestry Farm of Fujian Province. In the process of object-oriented classification, 10 scales (10-100, with a step length of 10) were adopted to segment QuickBird remote sensing image; and the intersection area of reference object (RA or ) and intersection area of segmented object (RA os ) were adopted to evaluate the segmentation result at each scale. For segmentation result at each scale, 16 spectral characteristics and support vector machine classifier (SVM) were further used to classify forest gaps, non-forest gaps and others. The results showed that the optimal segmentation scale was 40 when RA or was equal to RA os . The accuracy difference between the maximum and minimum at different segmentation scales was 22%. At optimal scale, the overall classification accuracy was 88% (Kappa=0.82) based on SVM classifier. Combining high resolution remote sensing image data with object-oriented classification method could replace the traditional field investigation and artificial interpretation method to identify and classify forest gaps at regional scale.

  20. A New Hybrid Model FPA-SVM Considering Cointegration for Particular Matter Concentration Forecasting: A Case Study of Kunming and Yuxi, China.

    Science.gov (United States)

    Li, Weide; Kong, Demeng; Wu, Jinran

    2017-01-01

    Air pollution in China is becoming more serious especially for the particular matter (PM) because of rapid economic growth and fast expansion of urbanization. To solve the growing environment problems, daily PM2.5 and PM10 concentration data form January 1, 2015, to August 23, 2016, in Kunming and Yuxi (two important cities in Yunnan Province, China) are used to present a new hybrid model CI-FPA-SVM to forecast air PM2.5 and PM10 concentration in this paper. The proposed model involves two parts. Firstly, due to its deficiency to assess the possible correlation between different variables, the cointegration theory is introduced to get the input-output relationship and then obtain the nonlinear dynamical system with support vector machine (SVM), in which the parameters c and g are optimized by flower pollination algorithm (FPA). Six benchmark models, including FPA-SVM, CI-SVM, CI-GA-SVM, CI-PSO-SVM, CI-FPA-NN, and multiple linear regression model, are considered to verify the superiority of the proposed hybrid model. The empirical study results demonstrate that the proposed model CI-FPA-SVM is remarkably superior to all considered benchmark models for its high prediction accuracy, and the application of the model for forecasting can give effective monitoring and management of further air quality.

  1. A New Hybrid Model FPA-SVM Considering Cointegration for Particular Matter Concentration Forecasting: A Case Study of Kunming and Yuxi, China

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-01-01

    Full Text Available Air pollution in China is becoming more serious especially for the particular matter (PM because of rapid economic growth and fast expansion of urbanization. To solve the growing environment problems, daily PM2.5 and PM10 concentration data form January 1, 2015, to August 23, 2016, in Kunming and Yuxi (two important cities in Yunnan Province, China are used to present a new hybrid model CI-FPA-SVM to forecast air PM2.5 and PM10 concentration in this paper. The proposed model involves two parts. Firstly, due to its deficiency to assess the possible correlation between different variables, the cointegration theory is introduced to get the input-output relationship and then obtain the nonlinear dynamical system with support vector machine (SVM, in which the parameters c and g are optimized by flower pollination algorithm (FPA. Six benchmark models, including FPA-SVM, CI-SVM, CI-GA-SVM, CI-PSO-SVM, CI-FPA-NN, and multiple linear regression model, are considered to verify the superiority of the proposed hybrid model. The empirical study results demonstrate that the proposed model CI-FPA-SVM is remarkably superior to all considered benchmark models for its high prediction accuracy, and the application of the model for forecasting can give effective monitoring and management of further air quality.

  2. Improving Classification of Airborne Laser Scanning Echoes in the Forest-Tundra Ecotone Using Geostatistical and Statistical Measures

    Directory of Open Access Journals (Sweden)

    Nadja Stumberg

    2014-05-01

    Full Text Available The vegetation in the forest-tundra ecotone zone is expected to be highly affected by climate change and requires effective monitoring techniques. Airborne laser scanning (ALS has been proposed as a tool for the detection of small pioneer trees for such vast areas using laser height and intensity data. The main objective of the present study was to assess a possible improvement in the performance of classifying tree and nontree laser echoes from high-density ALS data. The data were collected along a 1000 km long transect stretching from southern to northern Norway. Different geostatistical and statistical measures derived from laser height and intensity values were used to extent and potentially improve more simple models ignoring the spatial context. Generalised linear models (GLM and support vector machines (SVM were employed as classification methods. Total accuracies and Cohen’s kappa coefficients were calculated and compared to those of simpler models from a previous study. For both classification methods, all models revealed total accuracies similar to the results of the simpler models. Concerning classification performance, however, the comparison of the kappa coefficients indicated a significant improvement for some models both using GLM and SVM, with classification accuracies >94%.

  3. Predicting the Types of Ion Channel-Targeted Conotoxins Based on AVC-SVM Model.

    Science.gov (United States)

    Xianfang, Wang; Junmei, Wang; Xiaolei, Wang; Yue, Zhang

    2017-01-01

    The conotoxin proteins are disulfide-rich small peptides. Predicting the types of ion channel-targeted conotoxins has great value in the treatment of chronic diseases, epilepsy, and cardiovascular diseases. To solve the problem of information redundancy existing when using current methods, a new model is presented to predict the types of ion channel-targeted conotoxins based on AVC (Analysis of Variance and Correlation) and SVM (Support Vector Machine). First, the F value is used to measure the significance level of the feature for the result, and the attribute with smaller F value is filtered by rough selection. Secondly, redundancy degree is calculated by Pearson Correlation Coefficient. And the threshold is set to filter attributes with weak independence to get the result of the refinement. Finally, SVM is used to predict the types of ion channel-targeted conotoxins. The experimental results show the proposed AVC-SVM model reaches an overall accuracy of 91.98%, an average accuracy of 92.17%, and the total number of parameters of 68. The proposed model provides highly useful information for further experimental research. The prediction model will be accessed free of charge at our web server.

  4. Fault diagnosis method based on FFT-RPCA-SVM for Cascaded-Multilevel Inverter.

    Science.gov (United States)

    Wang, Tianzhen; Qi, Jie; Xu, Hao; Wang, Yide; Liu, Lei; Gao, Diju

    2016-01-01

    Thanks to reduced switch stress, high quality of load wave, easy packaging and good extensibility, the cascaded H-bridge multilevel inverter is widely used in wind power system. To guarantee stable operation of system, a new fault diagnosis method, based on Fast Fourier Transform (FFT), Relative Principle Component Analysis (RPCA) and Support Vector Machine (SVM), is proposed for H-bridge multilevel inverter. To avoid the influence of load variation on fault diagnosis, the output voltages of the inverter is chosen as the fault characteristic signals. To shorten the time of diagnosis and improve the diagnostic accuracy, the main features of the fault characteristic signals are extracted by FFT. To further reduce the training time of SVM, the feature vector is reduced based on RPCA that can get a lower dimensional feature space. The fault classifier is constructed via SVM. An experimental prototype of the inverter is built to test the proposed method. Compared to other fault diagnosis methods, the experimental results demonstrate the high accuracy and efficiency of the proposed method. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.

  5. Classification of ECG beats using deep belief network and active learning.

    Science.gov (United States)

    G, Sayantan; T, Kien P; V, Kadambari K

    2018-04-12

    A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram signals (ECG) is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of medical instrumentation (AAMI) standards and consists of three phases. In phase I, feature representation of ECG is learnt using Gaussian-Bernoulli deep belief network followed by a linear support vector machine (SVM) training in the consecutive phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert to label few beats to improve accuracy and sensitivity. The proposed approach depicts significant improvement in accuracy with minimal queries posed to the expert and fast online training as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V " versus all classifications (VEB) on MIT-BIH Arrhythmia Database. In a similar manner, it is attributed that an accuracy of 97.5% for SVEB and 98.6% for VEB on SVDB database is achieved respectively. Graphical Abstract Reply- Deep belief network augmented by active learning for efficient prediction of arrhythmia.

  6. SUPPORT VECTOR MACHINE CLASSIFICATION OF OBJECT-BASED DATA FOR CROP MAPPING, USING MULTI-TEMPORAL LANDSAT IMAGERY

    Directory of Open Access Journals (Sweden)

    R. Devadas

    2012-07-01

    Full Text Available Crop mapping and time series analysis of agronomic cycles are critical for monitoring land use and land management practices, and analysing the issues of agro-environmental impacts and climate change. Multi-temporal Landsat data can be used to analyse decadal changes in cropping patterns at field level, owing to its medium spatial resolution and historical availability. This study attempts to develop robust remote sensing techniques, applicable across a large geographic extent, for state-wide mapping of cropping history in Queensland, Australia. In this context, traditional pixel-based classification was analysed in comparison with image object-based classification using advanced supervised machine-learning algorithms such as Support Vector Machine (SVM. For the Darling Downs region of southern Queensland we gathered a set of Landsat TM images from the 2010–2011 cropping season. Landsat data, along with the vegetation index images, were subjected to multiresolution segmentation to obtain polygon objects. Object-based methods enabled the analysis of aggregated sets of pixels, and exploited shape-related and textural variation, as well as spectral characteristics. SVM models were chosen after examining three shape-based parameters, twenty-three textural parameters and ten spectral parameters of the objects. We found that the object-based methods were superior to the pixel-based methods for classifying 4 major landuse/land cover classes, considering the complexities of within field spectral heterogeneity and spectral mixing. Comparative analysis clearly revealed that higher overall classification accuracy (95% was observed in the object-based SVM compared with that of traditional pixel-based classification (89% using maximum likelihood classifier (MLC. Object-based classification also resulted speckle-free images. Further, object-based SVM models were used to classify different broadacre crop types for summer and winter seasons. The influence of

  7. A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method.

    Science.gov (United States)

    Zhou, Shu; Li, Guo-Bo; Huang, Lu-Yi; Xie, Huan-Zhang; Zhao, Ying-Lan; Chen, Yu-Zong; Li, Lin-Li; Yang, Sheng-Yong

    2014-08-01

    Drug-induced ototoxicity, as a toxic side effect, is an important issue needed to be considered in drug discovery. Nevertheless, current experimental methods used to evaluate drug-induced ototoxicity are often time-consuming and expensive, indicating that they are not suitable for a large-scale evaluation of drug-induced ototoxicity in the early stage of drug discovery. We thus, in this investigation, established an effective computational prediction model of drug-induced ototoxicity using an optimal support vector machine (SVM) method, GA-CG-SVM. Three GA-CG-SVM models were developed based on three training sets containing agents bearing different risk levels of drug-induced ototoxicity. For comparison, models based on naïve Bayesian (NB) and recursive partitioning (RP) methods were also used on the same training sets. Among all the prediction models, the GA-CG-SVM model II showed the best performance, which offered prediction accuracies of 85.33% and 83.05% for two independent test sets, respectively. Overall, the good performance of the GA-CG-SVM model II indicates that it could be used for the prediction of drug-induced ototoxicity in the early stage of drug discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. A Realistic Seizure Prediction Study Based on Multiclass SVM.

    Science.gov (United States)

    Direito, Bruno; Teixeira, César A; Sales, Francisco; Castelo-Branco, Miguel; Dourado, António

    2017-05-01

    A patient-specific algorithm, for epileptic seizure prediction, based on multiclass support-vector machines (SVM) and using multi-channel high-dimensional feature sets, is presented. The feature sets, combined with multiclass classification and post-processing schemes aim at the generation of alarms and reduced influence of false positives. This study considers 216 patients from the European Epilepsy Database, and includes 185 patients with scalp EEG recordings and 31 with intracranial data. The strategy was tested over a total of 16,729.80[Formula: see text]h of inter-ictal data, including 1206 seizures. We found an overall sensitivity of 38.47% and a false positive rate per hour of 0.20. The performance of the method achieved statistical significance in 24 patients (11% of the patients). Despite the encouraging results previously reported in specific datasets, the prospective demonstration on long-term EEG recording has been limited. Our study presents a prospective analysis of a large heterogeneous, multicentric dataset. The statistical framework based on conservative assumptions, reflects a realistic approach compared to constrained datasets, and/or in-sample evaluations. The improvement of these results, with the definition of an appropriate set of features able to improve the distinction between the pre-ictal and nonpre-ictal states, hence minimizing the effect of confounding variables, remains a key aspect.

  9. Spectral-spatial classification of hyperspectral data with mutual information based segmented stacked autoencoder approach

    Science.gov (United States)

    Paul, Subir; Nagesh Kumar, D.

    2018-04-01

    Hyperspectral (HS) data comprises of continuous spectral responses of hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which offer feature identification and classification with high accuracy. In the present study, Mutual Information (MI) based Segmented Stacked Autoencoder (S-SAE) approach for spectral-spatial classification of the HS data is proposed to reduce the complexity and computational time compared to Stacked Autoencoder (SAE) based feature extraction. A non-parametric dependency measure (MI) based spectral segmentation is proposed instead of linear and parametric dependency measure to take care of both linear and nonlinear inter-band dependency for spectral segmentation of the HS bands. Then morphological profiles are created corresponding to segmented spectral features to assimilate the spatial information in the spectral-spatial classification approach. Two non-parametric classifiers, Support Vector Machine (SVM) with Gaussian kernel and Random Forest (RF) are used for classification of the three most popularly used HS datasets. Results of the numerical experiments carried out in this study have shown that SVM with a Gaussian kernel is providing better results for the Pavia University and Botswana datasets whereas RF is performing better for Indian Pines dataset. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.

  10. Predicting enhancer activity and variant impact using gkm-SVM.

    Science.gov (United States)

    Beer, Michael A

    2017-09-01

    We participated in the Critical Assessment of Genome Interpretation eQTL challenge to further test computational models of regulatory variant impact and their association with human disease. Our prediction model is based on a discriminative gapped-kmer SVM (gkm-SVM) trained on genome-wide chromatin accessibility data in the cell type of interest. The comparisons with massively parallel reporter assays (MPRA) in lymphoblasts show that gkm-SVM is among the most accurate prediction models even though all other models used the MPRA data for model training, and gkm-SVM did not. In addition, we compare gkm-SVM with other MPRA datasets and show that gkm-SVM is a reliable predictor of expression and that deltaSVM is a reliable predictor of variant impact in K562 cells and mouse retina. We further show that DHS (DNase-I hypersensitive sites) and ATAC-seq (assay for transposase-accessible chromatin using sequencing) data are equally predictive substrates for training gkm-SVM, and that DHS regions flanked by H3K27Ac and H3K4me1 marks are more predictive than DHS regions alone. © 2017 Wiley Periodicals, Inc.

  11. Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information

    Science.gov (United States)

    Jamshidpour, N.; Homayouni, S.; Safari, A.

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attentions. For example, the lack of enough labeled samples and the high dimensionality problem are two most important issues which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues by the contribution of unlabeled samples, which are available in an enormous amount. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce.When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.

  12. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Full Text Available Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attentions. For example, the lack of enough labeled samples and the high dimensionality problem are two most important issues which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues by the contribution of unlabeled samples, which are available in an enormous amount. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce.When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.

  13. A comprehensive simulation study on classification of RNA-Seq data.

    Directory of Open Access Journals (Sweden)

    Gökmen Zararsız

    Full Text Available RNA sequencing (RNA-Seq is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (eg. microarray data or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM, classification and regression trees (CART, and random forests (RF. We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. Similar with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count

  14. The accuracy of echocardiography versus surgical and pathological classification of patients with ruptured mitral chordae tendineae: a large study in a Chinese cardiovascular center

    Science.gov (United States)

    2011-01-01

    Background The accuracy of echocardiography versus surgical and pathological classification of patients with ruptured mitral chordae tendineae (RMCT) has not yet been investigated with a large study. Methods Clinical, hemodynamic, surgical, and pathological findings were reviewed for 242 patients with a preoperative diagnosis of RMCT that required mitral valvular surgery. Subjects were consecutive in-patients at Fuwai Hospital in 2002-2008. Patients were evaluated by thoracic echocardiography (TTE) and transesophageal echocardiography (TEE). RMCT cases were classified by location as anterior or posterior, and classified by degree as partial or complete RMCT, according to surgical findings. RMCT cases were also classified by pathology into four groups: myxomatous degeneration, chronic rheumatic valvulitis (CRV), infective endocarditis and others. Results Echocardiography showed that most patients had a flail mitral valve, moderate to severe mitral regurgitation, a dilated heart chamber, mild to moderate pulmonary artery hypertension and good heart function. The diagnostic accuracy for RMCT was 96.7% for TTE and 100% for TEE compared with surgical findings. Preliminary experiments demonstrated that the sensitivity and specificity of diagnosing anterior, posterior and partial RMCT were high, but the sensitivity of diagnosing complete RMCT was low. Surgical procedures for RMCT depended on the location of ruptured chordae tendineae, with no relationship between surgical procedure and complete or partial RMCT. The echocardiographic characteristics of RMCT included valvular thickening, extended subvalvular chordae, echo enhancement, abnormal echo or vegetation, combined with aortic valve damage in the four groups classified by pathology. The incidence of extended subvalvular chordae in the myxomatous group was higher than that in the other groups, and valve thickening in combination with AV damage in the CRV group was higher than that in the other groups. Infective

  15. Alexnet Feature Extraction and Multi-Kernel Learning for Objectoriented Classification

    Science.gov (United States)

    Ding, L.; Li, H.; Hu, C.; Zhang, W.; Wang, S.

    2018-04-01

    In view of the fact that the deep convolutional neural network has stronger ability of feature learning and feature expression, an exploratory research is done on feature extraction and classification for high resolution remote sensing images. Taking the Google image with 0.3 meter spatial resolution in Ludian area of Yunnan Province as an example, the image segmentation object was taken as the basic unit, and the pre-trained AlexNet deep convolution neural network model was used for feature extraction. And the spectral features, AlexNet features and GLCM texture features are combined with multi-kernel learning and SVM classifier, finally the classification results were compared and analyzed. The results show that the deep convolution neural network can extract more accurate remote sensing image features, and significantly improve the overall accuracy of classification, and provide a reference value for earthquake disaster investigation and remote sensing disaster evaluation.

  16. ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECTORIENTED CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    L. Ding

    2018-04-01

    Full Text Available In view of the fact that the deep convolutional neural network has stronger ability of feature learning and feature expression, an exploratory research is done on feature extraction and classification for high resolution remote sensing images. Taking the Google image with 0.3 meter spatial resolution in Ludian area of Yunnan Province as an example, the image segmentation object was taken as the basic unit, and the pre-trained AlexNet deep convolution neural network model was used for feature extraction. And the spectral features, AlexNet features and GLCM texture features are combined with multi-kernel learning and SVM classifier, finally the classification results were compared and analyzed. The results show that the deep convolution neural network can extract more accurate remote sensing image features, and significantly improve the overall accuracy of classification, and provide a reference value for earthquake disaster investigation and remote sensing disaster evaluation.

  17. LaSVM-based big data learning system for dynamic prediction of air pollution in Tehran.

    Science.gov (United States)

    Ghaemi, Z; Alimohammadi, A; Farnaghi, M

    2018-04-20

    Due to critical impacts of air pollution, prediction and monitoring of air quality in urban areas are important tasks. However, because of the dynamic nature and high spatio-temporal variability, prediction of the air pollutant concentrations is a complex spatio-temporal problem. Distribution of pollutant concentration is influenced by various factors such as the historical pollution data and weather conditions. Conventional methods such as the support vector machine (SVM) or artificial neural networks (ANN) show some deficiencies when huge amount of streaming data have to be analyzed for urban air pollution prediction. In order to overcome the limitations of the conventional methods and improve the performance of urban air pollution prediction in Tehran, a spatio-temporal system is designed using a LaSVM-based online algorithm. Pollutant concentration and meteorological data along with geographical parameters are continually fed to the developed online forecasting system. Performance of the system is evaluated by comparing the prediction results of the Air Quality Index (AQI) with those of a traditional SVM algorithm. Results show an outstanding increase of speed by the online algorithm while preserving the accuracy of the SVM classifier. Comparison of the hourly predictions for next coming 24 h, with those of the measured pollution data in Tehran pollution monitoring stations shows an overall accuracy of 0.71, root mean square error of 0.54 and coefficient of determination of 0.81. These results are indicators of the practical usefulness of the online algorithm for real-time spatial and temporal prediction of the urban air quality.

  18. SVM models for analysing the headstreams of mine water inrush

    Energy Technology Data Exchange (ETDEWEB)

    Yan Zhi-gang; Du Pei-jun; Guo Da-zhi [China University of Science and Technology, Xuzhou (China). School of Environmental Science and Spatial Informatics

    2007-08-15

    The support vector machine (SVM) model was introduced to analyse the headstrean of water inrush in a coal mine. The SVM model, based on a hydrogeochemical method, was constructed for recognising two kinds of headstreams and the H-SVMs model was constructed for recognising multi- headstreams. The SVM method was applied to analyse the conditions of two mixed headstreams and the value of the SVM decision function was investigated as a means of denoting the hydrogeochemical abnormality. The experimental results show that the SVM is based on a strict mathematical theory, has a simple structure and a good overall performance. Moreover the parameter W in the decision function can describe the weights of discrimination indices of the headstream of water inrush. The value of the decision function can denote hydrogeochemistry abnormality, which is significant in the prevention of water inrush in a coal mine. 9 refs., 1 fig., 7 tabs.

  19. Administrative database concerns: accuracy of International Classification of Diseases, Ninth Revision coding is poor for preoperative anemia in patients undergoing spinal fusion.

    Science.gov (United States)

    Golinvaux, Nicholas S; Bohl, Daniel D; Basques, Bryce A; Grauer, Jonathan N

    2014-11-15

    Cross-sectional study. To objectively evaluate the ability of International Classification of Diseases, Ninth Revision (ICD-9) codes, which are used as the foundation for administratively coded national databases, to identify preoperative anemia in patients undergoing spinal fusion. National database research in spine surgery continues to rise. However, the validity of studies based on administratively coded data, such as the Nationwide Inpatient Sample, are dependent on the accuracy of ICD-9 coding. Such coding has previously been found to have poor sensitivity to conditions such as obesity and infection. A cross-sectional study was performed at an academic medical center. Hospital-reported anemia ICD-9 codes (those used for administratively coded databases) were directly compared with the chart-documented preoperative hematocrits (true laboratory values). A patient was deemed to have preoperative anemia if the preoperative hematocrit was less than the lower end of the normal range (36.0% for females and 41.0% for males). The study included 260 patients. Of these, 37 patients (14.2%) were anemic; however, only 10 patients (3.8%) received an "anemia" ICD-9 code. Of the 10 patients coded as anemic, 7 were anemic by definition, whereas 3 were not, and thus were miscoded. This equates to an ICD-9 code sensitivity of 0.19, with a specificity of 0.99, and positive and negative predictive values of 0.70 and 0.88, respectively. This study uses preoperative anemia to demonstrate the potential inaccuracies of ICD-9 coding. These results have implications for publications using databases that are compiled from ICD-9 coding data. Furthermore, the findings of the current investigation raise concerns regarding the accuracy of additional comorbidities. Although administrative databases are powerful resources that provide large sample sizes, it is crucial that we further consider the quality of the data source relative to its intended purpose.

  20. PCA-MLP SVM distinction of salivary Raman spectra of dengue fever infection.

    Science.gov (United States)

    Radzol, A R M; Lee, Khuan Y; Mansor, W; Wong, P S; Looi, I

    2017-07-01

    Dengue fever (DF) is a disease of major concern caused by flavivirus infection. Delayed diagnosis leads to severe stages, which could be deadly. Of recent, non-structural protein (NS1) has been acknowledged as a biomarker, alternative to immunoglobulins for early detection of dengue in blood. Further, non-invasive detection of NS1 in saliva makes the approach more appealing. However, since its concentration in saliva is less than blood, a sensitive and specific technique, Surface Enhanced Raman Spectroscopy (SERS), is employed. Our work here intends to define an optimal PCA-SVM (Principal Component Analysis-Support Vector Machine) with Multilayer Layer Perceptron (MLP) kernel model to distinct between positive and negative NS1 infected samples from salivary SERS spectra, which, to the best of our knowledge, has never been explored. Salivary samples of DF positive and negative subjects were collected, pre-processed and analyzed. PCA and SVM classifier were then used to differentiate the SERS analyzed spectra. Since performance of the model depends on the PCA criterion and MLP parameters, both are examined in tandem. Its performance is also compared to our previous works on simulated NS1 salivary samples. It is found that the best PCA-SVM (MLP) model can be defined by 95 PCs from CPV criterion with P1 and P2 values of 0.01 and -0.2 respectively. A classification performance of [76.88%, 85.92%, 67.83%] is achieved.

  1. Relative significance of heat transfer processes to quantify tradeoffs between complexity and accuracy of energy simulations with a building energy use patterns classification

    Science.gov (United States)

    Heidarinejad, Mohammad

    This dissertation develops rapid and accurate building energy simulations based on a building classification that identifies and focuses modeling efforts on most significant heat transfer processes. The building classification identifies energy use patterns and their contributing parameters for a portfolio of buildings. The dissertation hypothesis is "Building classification can provide minimal required inputs for rapid and accurate energy simulations for a large number of buildings". The critical literature review indicated there is lack of studies to (1) Consider synoptic point of view rather than the case study approach, (2) Analyze influence of different granularities of energy use, (3) Identify key variables based on the heat transfer processes, and (4) Automate the procedure to quantify model complexity with accuracy. Therefore, three dissertation objectives are designed to test out the dissertation hypothesis: (1) Develop different classes of buildings based on their energy use patterns, (2) Develop different building energy simulation approaches for the identified classes of buildings to quantify tradeoffs between model accuracy and complexity, (3) Demonstrate building simulation approaches for case studies. Penn State's and Harvard's campus buildings as well as high performance LEED NC office buildings are test beds for this study to develop different classes of buildings. The campus buildings include detailed chilled water, electricity, and steam data, enabling to classify buildings into externally-load, internally-load, or mixed-load dominated. The energy use of the internally-load buildings is primarily a function of the internal loads and their schedules. Externally-load dominated buildings tend to have an energy use pattern that is a function of building construction materials and outdoor weather conditions. However, most of the commercial medium-sized office buildings have a mixed-load pattern, meaning the HVAC system and operation schedule dictate

  2. Unraveling the linguistic nature of specific autobiographical memories using a computerized classification algorithm.

    Science.gov (United States)

    Takano, Keisuke; Ueno, Mayumi; Moriya, Jun; Mori, Masaki; Nishiguchi, Yuki; Raes, Filip

    2017-06-01

    In the present study, we explored the linguistic nature of specific memories generated with the Autobiographical Memory Test (AMT) by developing a computerized classifier that distinguishes between specific and nonspecific memories. The AMT is regarded as one of the most important assessment tools to study memory dysfunctions (e.g., difficulty recalling the specific details of memories) in psychopathology. In Study 1, we utilized the Japanese corpus data of 12,400 cue-recalled memories tagged with observer-rated specificity. We extracted linguistic features of particular relevance to memory specificity, such as past tense, negation, and adverbial words and phrases pertaining to time and location. On the basis of these features, a support vector machine (SVM) was trained to classify the memories into specific and nonspecific categories, which achieved an area under the curve (AUC) of .92 in a performance test. In Study 2, the trained SVM was tested in terms of its robustness in classifying novel memories (n = 8,478) that were retrieved in response to cue words that were different from those used in Study 1. The SVM showed an AUC of .89 in classifying the new memories. In Study 3, we extended the binary SVM to a five-class classification of the AMT, which achieved 64%-65% classification accuracy, against the chance level (20%) in the performance tests. Our data suggest that memory specificity can be identified with a relatively small number of words, capturing the universal linguistic features of memory specificity across memories in diverse contents.

  3. CLASSIFICATION OF WATER SURFACES USING AIRBORNE TOPOGRAPHIC LIDAR DATA

    Directory of Open Access Journals (Sweden)

    J. Smeeckaert

    2013-05-01

    Full Text Available Accurate Digital Terrain Models (DTM are inevitable inputs for mapping areas subject to natural hazards. Topographic airborne laser scanning has become an established technique to characterize the Earth surface: lidar provides 3D point clouds allowing a fine reconstruction of the topography. For flood hazard modeling, the key step before terrain modeling is the discrimination of land and water surfaces within the delivered point clouds. Therefore, instantaneous shoreline, river borders, inland waters can be extracted as a basis for more reliable DTM generation. This paper presents an automatic, efficient, and versatile workflow for land/water classification of airborne topographic lidar data. For that purpose, a classification framework based on Support Vector Machines (SVM is designed. First, a restricted set of features, based only 3D lidar point coordinates and flightline information, is defined. Then, the SVM learning step is performed on small but well-targeted areas thanks to an automatic region growing strategy. Finally, label probabilities given by the SVM are merged during a probabilistic relaxation step in order to remove pixel-wise misclassification. Results show that survey of millions of points are labelled with high accuracy (>95% in most cases for coastal areas, and >89% for rivers and that small natural and anthropic features of interest are still well classified though we work at low point densities (0.5–4 pts/m2. Our approach is valid for coasts and rivers, and provides a strong basis for further discrimination of land-cover classes and coastal habitats.

  4. Data classification based on the hybrid intellectual technology

    Directory of Open Access Journals (Sweden)

    Demidova Liliya

    2018-01-01

    Full Text Available In this paper the data classification technique, implying the consistent application of the SVM and Parzen classifiers, has been suggested. The Parser classifier applies to data which can be both correctly and erroneously classified using the SVM classifier, and are located in the experimentally defined subareas near the hyperplane which separates the classes. A herewith, the SVM classifier is used with the default parameters values, and the optimal parameters values of the Parser classifier are determined using the genetic algorithm. The experimental results confirming the effectiveness of the proposed hybrid intellectual data classification technology have been presented.

  5. Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

    Science.gov (United States)

    Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim

    2015-07-30

    Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Optimizing Multiple Kernel Learning for the Classification of UAV Data

    Directory of Open Access Journals (Sweden)

    Caroline M. Gevaert

    2016-12-01

    Full Text Available Unmanned Aerial Vehicles (UAVs are capable of providing high-quality orthoimagery and 3D information in the form of point clouds at a relatively low cost. Their increasing popularity stresses the necessity of understanding which algorithms are especially suited for processing the data obtained from UAVs. The features that are extracted from the point cloud and imagery have different statistical characteristics and can be considered as heterogeneous, which motivates the use of Multiple Kernel Learning (MKL for classification problems. In this paper, we illustrate the utility of applying MKL for the classification of heterogeneous features obtained from UAV data through a case study of an informal settlement in Kigali, Rwanda. Results indicate that MKL can achieve a classification accuracy of 90.6%, a 5.2% increase over a standard single-kernel Support Vector Machine (SVM. A comparison of seven MKL methods indicates that linearly-weighted kernel combinations based on simple heuristics are competitive with respect to computationally-complex, non-linear kernel combination methods. We further underline the importance of utilizing appropriate feature grouping strategies for MKL, which has not been directly addressed in the literature, and we propose a novel, automated feature grouping method that achieves a high classification accuracy for various MKL methods.

  7. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. Experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimen (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification usually is based on morphometric features of nuclei. Therefore, training and validation patches were selected using Support Vector Machine (SVM) so that suitable amount of cell material was depicted. Neural classifiers were tuned using GPU accelerated implementation of gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by GoogLeNet model. We observed that more misclassified patches belong to malignant cases.

  8. Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods.

    Science.gov (United States)

    Tuo, Youlin; An, Ning; Zhang, Ming

    2018-03-01

    The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non‑metastasis samples were screened under the threshold of PSVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non‑metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin‑dependent kinase 2 (CDK2), myelocytomatosis proto‑oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non‑ATPase 2 and telomeric repeat binding factor 2. The cyclin‑dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non‑metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non‑metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle associated functions and pathways were the most significant terms of the 30 feature genes. A SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several

  9. An Automated and Intelligent Medical Decision Support System for Brain MRI Scans Classification.

    Directory of Open Access Journals (Sweden)

    Muhammad Faisal Siddiqui

    Full Text Available A wide interest has been observed in the medical health care applications that interpret neuroimaging scans by machine learning systems. This research proposes an intelligent, automatic, accurate, and robust classification technique to classify the human brain magnetic resonance image (MRI as normal or abnormal, to cater down the human error during identifying the diseases in brain MRIs. In this study, fast discrete wavelet transform (DWT, principal component analysis (PCA, and least squares support vector machine (LS-SVM are used as basic components. Firstly, fast DWT is employed to extract the salient features of brain MRI, followed by PCA, which reduces the dimensions of the features. These reduced feature vectors also shrink the memory storage consumption by 99.5%. At last, an advanced classification technique based on LS-SVM is applied to brain MR image classification using reduced features. For improving the efficiency, LS-SVM is used with non-linear radial basis function (RBF kernel. The proposed algorithm intelligently determines the optimized values of the hyper-parameters of the RBF kernel and also applied k-fold stratified cross validation to enhance the generalization of the system. The method was tested by 340 patients' benchmark datasets of T1-weighted and T2-weighted scans. From the analysis of experimental results and performance comparisons, it is observed that the proposed medical decision support system outperformed all other modern classifiers and achieves 100% accuracy rate (specificity/sensitivity 100%/100%. Furthermore, in terms of computation time, the proposed technique is significantly faster than the recent well-known methods, and it improves the efficiency by 71%, 3%, and 4% on feature extraction stage, feature reduction stage, and classification stage, respectively. These results indicate that the proposed well-trained machine learning system has the potential to make accurate predictions about brain abnormalities

  10. Seeing It All: Evaluating Supervised Machine Learning Methods for the Classification of Diverse Otariid Behaviours.

    Directory of Open Access Journals (Sweden)

    Monique A Ladds

    Full Text Available Constructing activity budgets for marine animals when they are at sea and cannot be directly observed is challenging, but recent advances in bio-logging technology offer solutions to this problem. Accelerometers can potentially identify a wide range of behaviours for animals based on unique patterns of acceleration. However, when analysing data derived from accelerometers, there are many statistical techniques available which when applied to different data sets produce different classification accuracies. We investigated a selection of supervised machine learning methods for interpreting behavioural data from captive otariids (fur seals and sea lions. We conducted controlled experiments with 12 seals, where their behaviours were filmed while they were wearing 3-axis accelerometers. From video we identified 26 behaviours that could be grouped into one of four categories (foraging, resting, travelling and grooming representing key behaviour states for wild seals. We used data from 10 seals to train four predictive classification models: stochastic gradient boosting (GBM, random forests, support vector machine using four different kernels and a baseline model: penalised logistic regression. We then took the best parameters from each model and cross-validated the results on the two seals unseen so far. We also investigated the influence of feature statistics (describing some characteristic of the seal, testing the models both with and without these. Cross-validation accuracies were lower than training accuracy, but the SVM with a polynomial kernel was still able to classify seal behaviour with high accuracy (>70%. Adding feature statistics improved accuracies across all models tested. Most categories of behaviour -resting, grooming and feeding-were all predicted with reasonable accuracy (52-81% by the SVM while travelling was poorly categorised (31-41%. These results show that model selection is important when classifying behaviour and that by using

  11. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition

    Science.gov (United States)

    Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

    2017-01-01

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition. PMID:28937987

  12. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Science.gov (United States)

    Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

    2017-01-01

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition.

  13. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Directory of Open Access Journals (Sweden)

    QingJun Song

    Full Text Available Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB algorithm plus Support vector machine (SVM is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition.

  14. Predication of Crane Condition Parameters Based on SVM and AR

    International Nuclear Information System (INIS)

    Xu Xiuzhong; Hu Xiong; Zhou Congxiao

    2011-01-01

    Through statistic analysis of vibration signals of motor on the container crane hoisting mechanism in a port, the feature vectors with vibration are obtained. Through data preprocessing and training data, Training models of condition parameters based on support vector machine (SVM) are established. The testing data of condition monitoring parameters can be predicted by the training models. During training the models, the penalty parameter and kernel function of model are optimized by cross validation. In order to analysis the accurate of SVM model, autoregressive model is used to predict the trend of vibration. The research showed the predicted results of model using SVM are better than the results by autoregressive (AR) modeling.

  15. Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images

    Science.gov (United States)

    Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.

    2017-10-01

    Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown as a technique that enhances the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1Ddiscrete wavelet transform (DWT) applied in the spectral dimension followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D-DWT filters are applied to reconstruct the noise free original component. The computational cost of the classifiers as well as the cost of the whole classification chain is high but it is reduced achieving real-time behavior for some applications through their computation on NVIDIA multi-GPU platforms.

  16. Pattern classification of brain activation during emotional processing in subclinical depression: psychosis proneness as potential confounding factor

    Directory of Open Access Journals (Sweden)

    Gemma Modinos

    2013-02-01

    Full Text Available We used Support Vector Machine (SVM to perform multivariate pattern classification based on brain activation during emotional processing in healthy participants with subclinical depressive symptoms. Six-hundred undergraduate students completed the Beck Depression Inventory II (BDI-II. Two groups were subsequently formed: (i subclinical (mild mood disturbance (n = 17 and (ii no mood disturbance (n = 17. Participants also completed a self-report questionnaire on subclinical psychotic symptoms, the Community Assessment of Psychic Experiences Questionnaire (CAPE positive subscale. The functional magnetic resonance imaging (fMRI paradigm entailed passive viewing of negative emotional and neutral scenes. The pattern of brain activity during emotional processing allowed correct group classification with an overall accuracy of 77% (p = 0.002, within a network of regions including the amygdala, insula, anterior cingulate cortex and medial prefrontal cortex. However, further analysis suggested that the classification accuracy could also be explained by subclinical psychotic symptom scores (correlation with SVM weights r = 0.459, p = 0.006. Psychosis proneness may thus be a confounding factor for neuroimaging studies in subclinical depression.

  17. Pattern classification of brain activation during emotional processing in subclinical depression: psychosis proneness as potential confounding factor.

    Science.gov (United States)

    Modinos, Gemma; Mechelli, Andrea; Pettersson-Yeo, William; Allen, Paul; McGuire, Philip; Aleman, Andre

    2013-01-01

    We used Support Vector Machine (SVM) to perform multivariate pattern classification based on brain activation during emotional processing in healthy participants with subclinical depressive symptoms. Six-hundred undergraduate students completed the Beck Depression Inventory II (BDI-II). Two groups were subsequently formed: (i) subclinical (mild) mood disturbance (n = 17) and (ii) no mood disturbance (n = 17). Participants also completed a self-report questionnaire on subclinical psychotic symptoms, the Community Assessment of Psychic Experiences Questionnaire (CAPE) positive subscale. The functional magnetic resonance imaging (fMRI) paradigm entailed passive viewing of negative emotional and neutral scenes. The pattern of brain activity during emotional processing allowed correct group classification with an overall accuracy of 77% (p = 0.002), within a network of regions including the amygdala, insula, anterior cingulate cortex and medial prefrontal cortex. However, further analysis suggested that the classification accuracy could also be explained by subclinical psychotic symptom scores (correlation with SVM weights r = 0.459, p = 0.006). Psychosis proneness may thus be a confounding factor for neuroimaging studies in subclinical depression.

  18. pDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine.

    Science.gov (United States)

    Zhang, Shanxin; Zhou, Zhiping; Chen, Xinmeng; Hu, Yong; Yang, Lindong

    2017-08-07

    DNase I hypersensitive sites (DHSs) are accessible chromatin regions hypersensitive to cleavages by DNase I endonucleases. DHSs are indicative of cis-regulatory DNA elements (CREs), all of which play important roles in global gene expression regulation. It is helpful for discovering CREs by recognition of DHSs in genome. To accelerate the investigation, it is an important complement to develop cost-effective computational methods to identify DHSs. However, there is a lack of tools used for identifying DHSs in plant genome. Here we presented pDHS-SVM, a computational predictor to identify plant DHSs. To integrate the global sequence-order information and local DNA properties, reverse complement kmer and dinucleotide-based auto covariance of DNA sequences were applied to construct the feature space. In this work, fifteen physical-chemical properties of dinucleotides were used and Support Vector Machine (SVM) was employed. To further improve the performance of the predictor and extract an optimized subset of nucleotide physical-chemical properties positive for the DHSs, a heuristic nucleotide physical-chemical property selection algorithm was introduced. With the optimized subset of properties, experimental results of Arabidopsis thaliana and rice (Oryza sativa) showed that pDHS-SVM could achieve accuracies up to 87.00%, and 85.79%, respectively. The results indicated the effectiveness of proposed method for predicting DHSs. Furthermore, pDHS-SVM could provide a helpful complement for predicting CREs in plant genome. Our implementation of the novel proposed method pDHS-SVM is freely available as source code, at https://github.com/shanxinzhang/pDHS-SVM. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Stacked Denoising Autoencoders Applied to Star/Galaxy Classification

    Science.gov (United States)

    Qin, Hao-ran; Lin, Ji-ming; Wang, Jun-yi

    2017-04-01

    In recent years, the deep learning algorithm, with the characteristics of strong adaptability, high accuracy, and structural complexity, has become more and more popular, but it has not yet been used in astronomy. In order to solve the problem that the star/galaxy classification accuracy is high for the bright source set, but low for the faint source set of the Sloan Digital Sky Survey (SDSS) data, we introduced the new deep learning algorithm, namely the SDA (stacked denoising autoencoder) neural network and the dropout fine-tuning technique, which can greatly improve the robustness and antinoise performance. We randomly selected respectively the bright source sets and faint source sets from the SDSS DR12 and DR7 data with spectroscopic measurements, and made preprocessing on them. Then, we randomly selected respectively the training sets and testing sets without replacement from the bright source sets and faint source sets. At last, using these training sets we made the training to obtain the SDA models of the bright sources and faint sources in the SDSS DR7 and DR12, respectively. We compared the test result of the SDA model on the DR12 testing set with the test results of the Library for Support Vector Machines (LibSVM), J48 decision tree, Logistic Model Tree (LMT), Support Vector Machine (SVM), Logistic Regression, and Decision Stump algorithm, and compared the test result of the SDA model on the DR7 testing set with the test results of six kinds of decision trees. The experiments show that the SDA has a better classification accuracy than other machine learning algorithms for the faint source sets of DR7 and DR12. Especially, when the completeness function is used as the evaluation index, compared with the decision tree algorithms, the correctness rate of SDA has improved about 15% for the faint source set of SDSS-DR7.

  20. Prediction of protein-protein interactions between viruses and human by an SVM model

    Directory of Open Access Journals (Sweden)

    Cui Guangyu

    2012-05-01

    Full Text Available Abstract Background Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for the interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV and hepatitis C virus (HCV, our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1 it enables a prediction model to achieve a better performance than other representations, (2 it generates feature vectors of fixed length regardless of the sequence length, and (3 the same representation is applicable to different types of proteins.

  1. A new classification scheme of plastic wastes based upon recycling labels

    Energy Technology Data Exchange (ETDEWEB)

    Özkan, Kemal, E-mail: kozkan@ogu.edu.tr [Computer Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Ergin, Semih, E-mail: sergin@ogu.edu.tr [Electrical Electronics Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Işık, Şahin, E-mail: sahini@ogu.edu.tr [Computer Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Işıklı, İdil, E-mail: idil.isikli@bilecik.edu.tr [Electrical Electronics Engineering Dept., Bilecik University, 11210 Bilecik (Turkey)

    2015-01-15

    experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP.

  2. A new classification scheme of plastic wastes based upon recycling labels

    International Nuclear Information System (INIS)

    Özkan, Kemal; Ergin, Semih; Işık, Şahin; Işıklı, İdil

    2015-01-01

    experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP

  3. Functional assignment to JEV proteins using SVM.

    Science.gov (United States)

    Sahoo, Ganesh Chandra; Dikhit, Manas Ranjan; Das, Pradeep

    2008-01-01

    Identification of different protein functions facilitates a mechanistic understanding of Japanese encephalitis virus (JEV) infection and opens novel means for drug development. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to Japanese encephalitis virus protein. Our study from SVMProt and available JE virus sequences suggests that structural and nonstructural proteins of JEV genome possibly belong to diverse protein functions, are expected to occur in the life cycle of JE virus. Protein functions common to both structural and non-structural proteins are iron-binding, metal-binding, lipid-binding, copper-binding, transmembrane, outer membrane, channels/Pores - Pore-forming toxins (proteins and peptides) group of proteins. Non-structural proteins perform functions like actin binding, zinc-binding, calcium-binding, hydrolases, Carbon-Oxygen Lyases, P-type ATPase, proteins belonging to major facilitator family (MFS), secreting main terminal branch (MTB) family, phosphotransfer-driven group translocators and ATP-binding cassette (ABC) family group of proteins. Whereas structural proteins besides belonging to same structural group of proteins (capsid, structural, envelope), they also perform functions like nuclear receptor, antibiotic resistance, RNA-binding, DNA-binding, magnesium-binding, isomerase (intra-molecular), oxidoreductase and participate in type II (general) secretory pathway (IISP).

  4. Elucidation of Metallic Plume and Spatter Characteristics Based on SVM During High-Power Disk Laser Welding

    International Nuclear Information System (INIS)

    Gao Xiangdong; Liu Guiqian

    2015-01-01

    During deep penetration laser welding, there exist plume (weak plasma) and spatters, which are the results of weld material ejection due to strong laser heating. The characteristics of plume and spatters are related to welding stability and quality. Characteristics of metallic plume and spatters were investigated during high-power disk laser bead-on-plate welding of Type 304 austenitic stainless steel plates at a continuous wave laser power of 10 kW. An ultraviolet and visible sensitive high-speed camera was used to capture the metallic plume and spatter images. Plume area, laser beam path through the plume, swing angle, distance between laser beam focus and plume image centroid, abscissa of plume centroid and spatter numbers are defined as eigenvalues, and the weld bead width was used as a characteristic parameter that reflected welding stability. Welding status was distinguished by SVM (support vector machine) after data normalization and characteristic analysis. Also, PCA (principal components analysis) feature extraction was used to reduce the dimensions of feature space, and PSO (particle swarm optimization) was used to optimize the parameters of SVM. Finally a classification model based on SVM was established to estimate the weld bead width and welding stability. Experimental results show that the established algorithm based on SVM could effectively distinguish the variation of weld bead width, thus providing an experimental example of monitoring high-power disk laser welding quality. (plasma technology)

  5. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications that the competitors. Moreover, the runtime requirements for producing the thematic map was orders of magnitude lower than the respective for the competitors.

  6. Fusion of LiDAR and aerial imagery for the estimation of downed tree volume using Support Vector Machines classification and region based object fitting

    Science.gov (United States)

    Selvarajan, Sowmya

    The study classifies 3D small footprint full waveform digitized LiDAR fused with aerial imagery to downed trees using Support Vector Machines (SVM) algorithm. Using small footprint waveform LiDAR, airborne LiDAR systems can provide better canopy penetration and very high spatial resolution. The small footprint waveform scanner system Riegl LMS-Q680 is addition with an UltraCamX aerial camera are used to measure and map downed trees in a forest. The various data preprocessing steps helped in the identification of ground points from the dense LiDAR dataset and segment the LiDAR data to help reduce the complexity of the algorithm. The haze filtering process helped to differentiate the spectral signatures of the various classes within the aerial image. Such processes, helped to better select the features from both sensor data. The six features: LiDAR height, LiDAR intensity, LiDAR echo, and three image intensities are utilized. To do so, LiDAR derived, aerial image derived and fused LiDAR-aerial image derived features are used to organize the data for the SVM hypothesis formulation. Several variations of the SVM algorithm with different kernels and soft margin parameter C are experimented. The algorithm is implemented to classify downed trees over a pine trees zone. The LiDAR derived features provided an overall accuracy of 98% of downed trees but with no classification error of 86%. The image derived features provided an overall accuracy of 65% and fusion derived features resulted in an overall accuracy of 88%. The results are observed to be stable and robust. The SVM accuracies were accompanied by high false alarm rates, with the LiDAR classification producing 58.45%, image classification producing 95.74% and finally the fused classification producing 93% false alarm rates The Canny edge correction filter helped control the LiDAR false alarm to 35.99%, image false alarm to 48.56% and fused false alarm to 37.69% The implemented classifiers provided a powerful tool for

  7. Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules

    Science.gov (United States)

    Winters-Hilt, Stephen; Vercoutere, Wenonah; DeGuzman, Veronica S.; Deamer, David; Akeson, Mark; Haussler, David

    2003-01-01

    We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data files). It was then performed in solution to assay real mixtures of DNA hairpins. Hidden Markov Models (HMMs) were used with Expectation/Maximization for denoising and for associating a feature vector with the ionic current blockade of the DNA molecule. Support Vector Machines (SVMs) were used as discriminators, and were the focus of off-line training. A multiclass SVM architecture was designed to place less discriminatory load on weaker discriminators, and novel SVM kernels were used to boost discrimination strength. The tuning on HMMs and SVMs enabled biophysical analysis of the captured molecule states and state transitions; structure revealed in the biophysical analysis was used for better feature selection. PMID:12547778

  8. Classification of Partial Discharge Measured under Different Levels of Noise Contamination.

    Directory of Open Access Journals (Sweden)

    Wong Jee Keen Raymond

    Full Text Available Cable joint insulation breakdown may cause a huge loss to power companies. Therefore, it is vital to diagnose the insulation quality to detect early signs of insulation failure. It is well known that there is a correlation between Partial discharge (PD and the insulation quality. Although many works have been done on PD pattern recognition, it is usually performed in a noise free environment. Also, works on PD pattern recognition in actual cable joint are less likely to be found in literature. Therefore, in this work, classifications of actual cable joint defect types from partial discharge data contaminated by noise were performed. Five cross-linked polyethylene (XLPE cable joints with artificially created defects were prepared based on the defects commonly encountered on site. Three different types of input feature were extracted from the PD pattern under artificially created noisy environment. These include statistical features, fractal features and principal component analysis (PCA features. These input features were used to train the classifiers to classify each PD defect types. Classifications were performed using three different artificial intelligence classifiers, which include Artificial Neural Networks (ANN, Adaptive Neuro-Fuzzy Inference System (ANFIS and Support Vector Machine (SVM. It was found that the classification accuracy decreases with higher noise level but PCA features used in SVM and ANN showed the strongest tolerance against noise contamination.

  9. Machine learning in infrared object classification - an all-sky selection of YSO candidates

    Science.gov (United States)

    Marton, Gabor; Zahorecz, Sarolta; Toth, L. Viktor; Magnus McGehee, Peregrine; Kun, Maria

    2015-08-01

    Object classification is a fundamental and challenging problem in the era of big data. I will discuss up-to-date methods and their application to classify infrared point sources.We analysed the ALLWISE catalogue, the most recent public source catalogue of the Wide-field Infrared Survey Explorer (WISE) to compile a reliable list of Young Stellar Object (YSO) candidates. We tested and compared classical and up-to-date statistical methods as well, to discriminate source types like extragalactic objects, evolved stars, main sequence stars, objects related to the interstellar medium and YSO candidates by using their mid-IR WISE properties and associated near-IR 2MASS data.In the particular classification problem the Support Vector Machines (SVM), a class of supervised learning algorithm turned out to be the best tool. As a result we classify Class I and II YSOs with >90% accuracy while the fraction of contaminating extragalactic objects remains well below 1%, based on the number of known objects listed in the SIMBAD and VizieR databases. We compare our results to other classification schemes from the literature and show that the SVM outperforms methods that apply linear cuts on the colour-colour and colour-magnitude space. Our homogenous YSO candidate catalog can serve as an excellent pathfinder for future detailed observations of individual objects and a starting point of statistical studies that aim to add pieces to the big picture of star formation theory.

  10. Drug-related webpages classification based on multi-modal local decision fusion

    Science.gov (United States)

    Hu, Ruiguang; Su, Xiaojing; Liu, Yanxin

    2018-03-01

    In this paper, multi-modal local decision fusion is used for drug-related webpages classification. First, meaningful text are extracted through HTML parsing, and effective images are chosen by the FOCARSS algorithm. Second, six SVM classifiers are trained for six kinds of drug-taking instruments, which are represented by PHOG. One SVM classifier is trained for the cannabis, which is represented by the mid-feature of BOW model. For each instance in a webpage, seven SVMs give seven labels for its image, and other seven labels are given by searching the names of drug-taking instruments and cannabis in its related text. Concatenating seven labels of image and seven labels of text, the representation of those instances in webpages are generated. Last, Multi-Instance Learning is used to classify those drugrelated webpages. Experimental results demonstrate that the classification accuracy of multi-instance learning with multi-modal local decision fusion is much higher than those of single-modal classification.

  11. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Full Text Available Abstract Background We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process. Results With E-RFE, we speed up the recursive feature elimination (RFE with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.

  12. Stroke localization and classification using microwave tomography with k-means clustering and support vector machine.

    Science.gov (United States)

    Guo, Lei; Abbosh, Amin

    2018-05-01

    For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.

  13. An Object-Based Classification of Mangroves Using a Hybrid Decision Tree—Support Vector Machine Approach

    Directory of Open Access Journals (Sweden)

    Benjamin W. Heumann

    2011-11-01

    Full Text Available Mangroves provide valuable ecosystem goods and services such as carbon sequestration, habitat for terrestrial and marine fauna, and coastal hazard mitigation. The use of satellite remote sensing to map mangroves has become widespread as it can provide accurate, efficient, and repeatable assessments. Traditional remote sensing approaches have failed to accurately map fringe mangroves and true mangrove species due to relatively coarse spatial resolution and/or spectral confusion with landward vegetation. This study demonstrates the use of the new Worldview-2 sensor, Object-based image analysis (OBIA, and support vector machine (SVM classification to overcome both of these limitations. An exploratory spectral separability showed that individual mangrove species could not be spectrally separated, but a distinction between true and associate mangrove species could be made. An OBIA classification was used that combined a decision-tree classification with the machine-learning SVM classification. Results showed an overall accuracy greater than 94% (kappa = 0.863 for classifying true mangroves species and other dense coastal vegetation at the object level. There remain serious challenges to accurately mapping fringe mangroves using remote sensing data due to spectral similarity of mangrove and associate species, lack of clear zonation between species, and mixed pixel effects, especially when vegetation is sparse or degraded.

  14. A SVM bases AI design for interactive gaming

    OpenAIRE

    Jiang, Yang; Jiang, Jianmin; Palmer, Ian

    2008-01-01

    Interactive gaming requires automatic processing on large volume of random data produced by players on spot, such as shooting, football kicking, boxing etc. In this paper, we describe an artificial intelligence approach in processing such random data for interactive gaming by using a one-class support vector machine (OC-SVM). In comparison with existing techniques, our OC-SVM based interactive gaming design has the features of: (i): high speed processing, providing instant response to the pla...

  15. Parameters Optimization and Application to Glutamate Fermentation Model Using SVM

    OpenAIRE

    Zhang, Xiangsheng; Pan, Feng

    2015-01-01

    Aimed at the parameters optimization in support vector machine (SVM) for glutamate fermentation modelling, a new method is developed. It optimizes the SVM parameters via an improved particle swarm optimization (IPSO) algorithm which has better global searching ability. The algorithm includes detecting and handling the local convergence and exhibits strong ability to avoid being trapped in local minima. The material step of the method was shown. Simulation experiments demonstrate the effective...

  16. Parameters Optimization and Application to Glutamate Fermentation Model Using SVM

    Directory of Open Access Journals (Sweden)

    Xiangsheng Zhang

    2015-01-01

    Full Text Available Aimed at the parameters optimization in support vector machine (SVM for glutamate fermentation modelling, a new method is developed. It optimizes the SVM parameters via an improved particle swarm optimization (IPSO algorithm which has better global searching ability. The algorithm includes detecting and handling the local convergence and exhibits strong ability to avoid being trapped in local minima. The material step of the method was shown. Simulation experiments demonstrate the effectiveness of the proposed algorithm.

  17. PCA criterion for SVM (MLP) classifier for flavivirus biomarker from salivary SERS spectra at febrile stage.

    Science.gov (United States)

    Radzol, A R M; Lee, Khuan Y; Mansor, W; Omar, I S

    2016-08-01

    Non-structural protein (NS1) has been conceded as one of the biomarkers for flavivirus that causes diseases with life threatening consequences. NS1 is an antigen that allows detection of the illness at febrile stage, mostly from blood samples currently. Our work here intends to define an optimum model for PCA-SVM with MLP kernel for classification of flavivirus biomarker, NS1 molecule, from SERS spectra of saliva, which to the best of our knowledge has never been explored. Since performance of the model depends on the PCA criterion and MLP parameters, both are examined in tandem. Input vector to classifier determined by each PCA criterion is subjected to brute force tuning of MLP parameters for entirety. Its performance is also compared to our previous works where a Linear and RBF kernel are used. It is found that the best PCA-SVM (MLP) model can be defined by 5 PCs from Cattel's Scree test for PCA, together with P1 and P2 values of 0.1 and -0.2 respectively, with a classification performance of [96.9%, 93.8%, 100.0%].

  18. Classification across gene expression microarray studies

    Directory of Open Access Journals (Sweden)

    Kuner Ruprecht

    2009-12-01

    Full Text Available Abstract Background The increasing number of gene expression microarray studies represents an important resource in biomedical research. As a result, gene expression based diagnosis has entered clinical practice for patient stratification in breast cancer. However, the integration and combined analysis of microarray studies remains still a challenge. We assessed the potential benefit of data integration on the classification accuracy and systematically evaluated the generalization performance of selected methods on four breast cancer studies comprising almost 1000 independent samples. To this end, we introduced an evaluation framework which aims to establish good statistical practice and a graphical way to monitor differences. The classification goal was to correctly predict estrogen receptor status (negative/positive and histological grade (low/high of each tumor sample in an independent study which was not used for the training. For the classification we chose support vector machines (SVM, predictive analysis of microarrays (PAM, random forest (RF and k-top scoring pairs (kTSP. Guided by considerations relevant for classification across studies we developed a generalization of kTSP which we evaluated in addition. Our derived version (DV aims to improve the robustness of the intrinsic invariance of kTSP with respect to technologies and preprocessing. Results For each individual study the generalization error was benchmarked via complete cross-validation and was found to be similar for all classification methods. The misclassification rates were substantially higher in classification across studies, when each single study was used as an independent test set while all remaining studies were combined for the training of the classifier. However, with increasing number of independent microarray studies used in the training, the overall classification performance improved. DV performed better than the average and showed slightly less variance. In

  19. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of

  20. Data Driven Constraints for the SVM

    DEFF Research Database (Denmark)

    Darkner, Sune; Clemmensen, Line Katrine Harder

    2012-01-01

    We propose a generalized data driven constraint for support vector machines exemplified by classification of paired observations in general and specifically on the human ear canal. This is particularly interesting in dynamic cases such as tissue movement or pathologies developing over time. Assum...

  1. Active relearning for robust supervised classification of pulmonary emphysema

    Science.gov (United States)

    Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

    2012-03-01

    Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing Emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded 15% increase in classification accuracy and 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.

  2. Three-Class Mammogram Classification Based on Descriptive CNN Features

    Directory of Open Access Journals (Sweden)

    M. Mohsin Jadoon

    2017-01-01

    Full Text Available In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases. In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW and convolutional neural network-curvelet transform (CNN-CT. An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE. In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT, while in the second method discrete curvelet transform (DCT is used. In both methods, dense scale invariant feature (DSIFT for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN. Softmax layer and support vector machine (SVM layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.

  3. Applying machine-learning techniques to Twitter data for automatic hazard-event classification.

    Science.gov (United States)

    Filgueira, R.; Bee, E. J.; Diaz-Doce, D.; Poole, J., Sr.; Singh, A.

    2017-12-01

    The constant flow of information offered by tweets provides valuable information about all sorts of events at a high temporal and spatial resolution. Over the past year we have been analyzing in real-time geological hazards/phenomenon, such as earthquakes, volcanic eruptions, landslides, floods or the aurora, as part of the GeoSocial project, by geo-locating tweets filtered by keywords in a web-map. However, not all the filtered tweets are related with hazard/phenomenon events. This work explores two classification techniques for automatic hazard-event categorization based on tweets about the "Aurora". First, tweets were filtered using aurora-related keywords, removing stop words and selecting the ones written in English. For classifying the remaining between "aurora-event" or "no-aurora-event" categories, we compared two state-of-art techniques: Support Vector Machine (SVM) and Deep Convolutional Neural Networks (CNN) algorithms. Both approaches belong to the family of supervised learning algorithms, which make predictions based on labelled training dataset. Therefore, we created a training dataset by tagging 1200 tweets between both categories. The general form of SVM is used to separate two classes by a function (kernel). We compared the performance of four different kernels (Linear Regression, Logistic Regression, Multinomial Naïve Bayesian and Stochastic Gradient Descent) provided by Scikit-Learn library using our training dataset to build the SVM classifier. The results shown that the Logistic Regression (LR) gets the best accuracy (87%). So, we selected the SVM-LR classifier to categorise a large collection of tweets using the "dispel4py" framework.Later, we developed a CNN classifier, where the first layer embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors. Results from the convolutional layer are max-pooled into a long feature vector, which is classified using a softmax layer. The CNN's accuracy

  4. Fuzzy support vector machine for microarray imbalanced data classification

    Science.gov (United States)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarrays are data containing gene expression with small sample sizes and high number of features. Furthermore, imbalanced classes is a common problem in microarray data. This occurs when a dataset is dominated by a class which have significantly more instances than the other minority classes. Therefore, it is needed a classification method that solve the problem of high dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinear, high dimensional, over learning and local minimum issues. SVM has been widely applied to DNA microarray data classification and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data will be a problem because SVM treats all samples in the same importance thus the results is bias for minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method apply a fuzzy membership to each input point and reformulate the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy membership so FSVM can pay more attention to the samples with larger fuzzy membership. Given DNA microarray data is a high dimensional data with a very large number of features, it is necessary to do feature selection first using Fast Correlation based Filter (FCBF). In this study will be analyzed by SVM, FSVM and both methods by applying FCBF and get the classification performance of them. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.

  5. A Hybrid ICA-SVM Approach for Determining the Quality Variables at Fault in a Multivariate Process

    Directory of Open Access Journals (Sweden)

    Yuehjen E. Shao

    2012-01-01

    Full Text Available The monitoring of a multivariate process with the use of multivariate statistical process control (MSPC charts has received considerable attention. However, in practice, the use of MSPC chart typically encounters a difficulty. This difficult involves which quality variable or which set of the quality variables is responsible for the generation of the signal. This study proposes a hybrid scheme which is composed of independent component analysis (ICA and support vector machine (SVM to determine the fault quality variables when a step-change disturbance existed in a multivariate process. The proposed hybrid ICA-SVM scheme initially applies ICA to the Hotelling T2 MSPC chart to generate independent components (ICs. The hidden information of the fault quality variables can be identified in these ICs. The ICs are then served as the input variables of the classifier SVM for performing the classification process. The performance of various process designs is investigated and compared with the typical classification method. Using the proposed approach, the fault quality variables for a multivariate process can be accurately and reliably determined.

  6. CLASSIFICATION OF CROPLANDS THROUGH FUSION OF OPTICAL AND SAR TIME SERIES DATA

    Directory of Open Access Journals (Sweden)

    S. Park

    2016-06-01

    Full Text Available Many satellite sensors including Landsat series have been extensively used for land cover classification. Studies have been conducted to mitigate classification problems associated with the use of single data (e.g., such as cloud contamination through multi-sensor data fusion and the use of time series data. This study investigated two areas with different environment and climate conditions: one in South Korea and the other in US. Cropland classification was conducted by using multi-temporal Landsat 5, Radarsat-1 and digital elevation models (DEM based on two machine learning approaches (i.e., random forest and support vector machines. Seven classification scenarios were examined and evaluated through accuracy assessment. Results show that SVM produced the best performance (overall accuracy of 93.87% when using all temporal and spectral data as input variables. Normalized Difference Water Index (NDWI, SAR backscattering, and Normalized Difference Vegetation Index (NDVI were identified as more contributing variables than the others for cropland classification.

  7. Classification of agricultural fields using time series of dual polarimetry TerraSAR-X images

    Directory of Open Access Journals (Sweden)

    S. Mirzaee

    2014-10-01

    Full Text Available Due to its special imaging characteristics, Synthetic Aperture Radar (SAR has become an important source of information for a variety of remote sensing applications dealing with environmental changes. SAR images contain information about both phase and intensity in different polarization modes, making them sensitive to geometrical structure and physical properties of the targets such as dielectric and plant water content. In this study we investigate multi temporal changes occurring to different crop types due to phenological changes using high-resolution TerraSAR-X imagers. The dataset includes 17 dual-polarimetry TSX data acquired from June 2012 to August 2013 in Lorestan province, Iran. Several features are extracted from polarized data and classified using support vector machine (SVM classifier. Training samples and different features employed in classification are also assessed in the study. Results show a satisfactory accuracy for classification which is about 0.91 in kappa coefficient.

  8. Automatic optical detection and classification of marine animals around MHK converters using machine vision

    Energy Technology Data Exchange (ETDEWEB)

    Brunton, Steven [Univ. of Washington, Seattle, WA (United States)

    2018-01-15

    Optical systems provide valuable information for evaluating interactions and associations between organisms and MHK energy converters and for capturing potentially rare encounters between marine organisms and MHK device. The deluge of optical data from cabled monitoring packages makes expert review time-consuming and expensive. We propose algorithms and a processing framework to automatically extract events of interest from underwater video. The open-source software framework consists of background subtraction, filtering, feature extraction and hierarchical classification algorithms. This principle classification pipeline was validated on real-world data collected with an experimental underwater monitoring package. An event detection rate of 100% was achieved using robust principal components analysis (RPCA), Fourier feature extraction and a support vector machine (SVM) binary classifier. The detected events were then further classified into more complex classes – algae | invertebrate | vertebrate, one species | multiple species of fish, and interest rank. Greater than 80% accuracy was achieved using a combination of machine learning techniques.

  9. Global discriminative learning for higher-accuracy computational gene prediction.

    Directory of Open Access Journals (Sweden)

    Axel Bernal

    2007-03-01

    Full Text Available Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.

  10. Analysis of Different Classification Techniques for Two-Class Functional Near-Infrared Spectroscopy-Based Brain-Computer Interface

    Directory of Open Access Journals (Sweden)

    Noman Naseer

    2016-01-01

    Full Text Available We analyse and compare the classification accuracies of six different classifiers for a two-class mental task (mental arithmetic and rest using functional near-infrared spectroscopy (fNIRS signals. The signals of the mental arithmetic and rest tasks from the prefrontal cortex region of the brain for seven healthy subjects were acquired using a multichannel continuous-wave imaging system. After removal of the physiological noises, six features were extracted from the oxygenated hemoglobin (HbO signals. Two- and three-dimensional combinations of those features were used for classification of mental tasks. In the classification, six different modalities, linear discriminant analysis (LDA, quadratic discriminant analysis (QDA, k-nearest neighbour (kNN, the Naïve Bayes approach, support vector machine (SVM, and artificial neural networks (ANN, were utilized. With these classifiers, the average classification accuracies among the seven subjects for the 2- and 3-dimensional combinations of features were 71.6, 90.0, 69.7, 89.8, 89.5, and 91.4% and 79.6, 95.2, 64.5, 94.8, 95.2, and 96.3%, respectively. ANN showed the maximum classification accuracies: 91.4 and 96.3%. In order to validate the results, a statistical significance test was performed, which confirmed that the p values were statistically significant relative to all of the other classifiers (p < 0.005 using HbO signals.

  11. Crown-level tree species classification from AISA hyperspectral imagery using an innovative pixel-weighting approach

    Science.gov (United States)

    Liu, Haijian; Wu, Changshan

    2018-06-01

    Crown-level tree species classification is a challenging task due to the spectral similarity among different tree species. Shadow, underlying objects, and other materials within a crown may decrease the purity of extracted crown spectra and further reduce classification accuracy. To address this problem, an innovative pixel-weighting approach was developed for tree species classification at the crown level. The method utilized high density discrete LiDAR data for individual tree delineation and Airborne Imaging Spectrometer for Applications (AISA) hyperspectral imagery for pure crown-scale spectra extraction. Specifically, three steps were included: 1) individual tree identification using LiDAR data, 2) pixel-weighted representative crown spectra calculation using hyperspectral imagery, with which pixel-based illuminated-leaf fractions estimated using a linear spectral mixture analysis (LSMA) were employed as weighted factors, and 3) representative spectra based tree species classification was performed through applying a support vector machine (SVM) approach. Analysis of results suggests that the developed pixel-weighting approach (OA = 82.12%, Kc = 0.74) performed better than treetop-based (OA = 70.86%, Kc = 0.58) and pixel-majority methods (OA = 72.26, Kc = 0.62) in terms of classification accuracy. McNemar tests indicated the differences in accuracy between pixel-weighting and treetop-based approaches as well as that between pixel-weighting and pixel-majority approaches were statistically significant.

  12. Vehicle classification in WAMI imagery using deep network

    Science.gov (United States)

    Yi, Meng; Yang, Fan; Blasch, Erik; Sheaff, Carolyn; Liu, Kui; Chen, Genshe; Ling, Haibin

    2016-05-01

    Humans have always had a keen interest in understanding activities and the surrounding environment for mobility, communication, and survival. Thanks to recent progress in photography and breakthroughs in aviation, we are now able to capture tens of megapixels of ground imagery, namely Wide Area Motion Imagery (WAMI), at multiple frames per second from unmanned aerial vehicles (UAVs). WAMI serves as a great source for many applications, including security, urban planning and route planning. These applications require fast and accurate image understanding which is time consuming for humans, due to the large data volume and city-scale area coverage. Therefore, automatic processing and understanding of WAMI imagery has been gaining attention in both industry and the research community. This paper focuses on an essential step in WAMI imagery analysis, namely vehicle classification. That is, deciding whether a certain image patch contains a vehicle or not. We collect a set of positive and negative sample image patches, for training and testing the detector. Positive samples are 64 × 64 image patches centered on annotated vehicles. We generate two sets of negative images. The first set is generated from positive images with some location shift. The second set of negative patches is generated from randomly sampled patches. We also discard those patches if a vehicle accidentally locates at the center. Both positive and negative samples are randomly divided into 9000 training images and 3000 testing images. We propose to train a deep convolution network for classifying these patches. The classifier is based on a pre-trained AlexNet Model in the Caffe library, with an adapted loss function for vehicle classification. The performance of our classifier is compared to several traditional image classifier methods using Support Vector Machine (SVM) and Histogram of Oriented Gradient (HOG) features. While the SVM+HOG method achieves an accuracy of 91.2%, the accuracy of our deep

  13. Support vector machine and principal component analysis for microarray data classification

    Science.gov (United States)

    Astuti, Widi; Adiwijaya

    2018-03-01

    Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.

  14. An improved PSO-SVM model for online recognition defects in eddy current testing

    Science.gov (United States)

    Liu, Baoling; Hou, Dibo; Huang, Pingjie; Liu, Banteng; Tang, Huayi; Zhang, Wubo; Chen, Peihua; Zhang, Guangxin

    2013-12-01

    Accurate and rapid recognition of defects is essential for structural integrity and health monitoring of in-service device using eddy current (EC) non-destructive testing. This paper introduces a novel model-free method that includes three main modules: a signal pre-processing module, a classifier module and an optimisation module. In the signal pre-processing module, a kind of two-stage differential structure is proposed to suppress the lift-off fluctuation that could contaminate the EC signal. In the classifier module, multi-class support vector machine (SVM) based on one-against-one strategy is utilised for its good accuracy. In the optimisation module, the optimal parameters of classifier are obtained by an improved particle swarm optimisation (IPSO) algorithm. The proposed IPSO technique can improve convergence performance of the primary PSO through the following strategies: nonlinear processing of inertia weight, introductions of the black hole and simulated annealing model with extremum disturbance. The good generalisation ability of the IPSO-SVM model has been validated through adding additional specimen into the testing set. Experiments show that the proposed algorithm can achieve higher recognition accuracy and efficiency than other well-known classifiers and the superiorities are more obvious with less training set, which contributes to online application.

  15. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with uni-gram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Individual classification of children with epilepsy using support vector machine with multiple indices of diffusion tensor imaging

    Directory of Open Access Journals (Sweden)

    Ishmael Amarreh

    2014-01-01

    Conclusion: DTI-based SVM classification appears promising for distinguishing children with active epilepsy from either those with remitted epilepsy or controls, and the question that arises is whether it will prove useful as a prognostic index of seizure remission. While SVM can correctly identify children with active epilepsy from other groups' diagnosis, further research is needed to determine the efficacy of SVM as a prognostic tool in longitudinal clinical studies.

  17. Scale Issues Related to the Accuracy Assessment of Land Use/Land Cover Maps Produced Using Multi-Resolution Data: Comments on “The Improvement of Land Cover Classification by Thermal Remote Sensing”. Remote Sens. 2015, 7(7, 8368–8390

    Directory of Open Access Journals (Sweden)

    Brian A. Johnson

    2015-10-01

    Full Text Available Much remote sensing (RS research focuses on fusing, i.e., combining, multi-resolution/multi-sensor imagery for land use/land cover (LULC classification. In relation to this topic, Sun and Schulz [1] recently found that a combination of visible-to-near infrared (VNIR; 30 m spatial resolution and thermal infrared (TIR; 100–120 m spatial resolution Landsat data led to more accurate LULC classification. They also found that using multi-temporal TIR data alone for classification resulted in comparable (and in some cases higher classification accuracies to the use of multi-temporal VNIR data, which contrasts with the findings of other recent research [2]. This discrepancy, and the generally very high LULC accuracies achieved by Sun and Schulz (up to 99.2% overall accuracy for a combined VNIR/TIR classification result, can likely be explained by their use of an accuracy assessment procedure which does not take into account the multi-resolution nature of the data. Sun and Schulz used 10-fold cross-validation for accuracy assessment, which is not necessarily inappropriate for RS accuracy assessment in general. However, here it is shown that the typical pixel-based cross-validation approach results in non-independent training and validation data sets when the lower spatial resolution TIR images are used for classification, which causes classification accuracy to be overestimated.

  18. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications.

    Science.gov (United States)

    Ye, Fei; Lou, Xin Yuan; Sun, Lin Fu

    2017-01-01

    This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.

  19. A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks.

    Science.gov (United States)

    Mei, Suyu; Zhu, Hao

    2015-01-26

    Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.

  20. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications

    Science.gov (United States)

    Lou, Xin Yuan; Sun, Lin Fu

    2017-01-01

    This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm’s performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem. PMID:28369096

  1. Classification of team sport activities using a single wearable tracking device.

    Science.gov (United States)

    Wundersitz, Daniel W T; Josman, Casey; Gupta, Ritu; Netto, Kevin J; Gastin, Paul B; Robertson, Sam

    2015-11-26

    Wearable tracking devices incorporating accelerometers and gyroscopes are increasingly being used for activity analysis in sports. However, minimal research exists relating to their ability to classify common activities. The purpose of this study was to determine whether data obtained from a single wearable tracking device can be used to classify team sport-related activities. Seventy-six non-elite sporting participants were tested during a simulated team sport circuit (involving stationary, walking, jogging, running, changing direction, counter-movement jumping, jumping for distance and tackling activities) in a laboratory setting. A MinimaxX S4 wearable tracking device was worn below the neck, in-line and dorsal to the first to fifth thoracic vertebrae of the spine, with tri-axial accelerometer and gyroscope data collected at 100Hz. Multiple time domain, frequency domain and custom features were extracted from each sensor using 0.5, 1.0, and 1.5s movement capture durations. Features were further screened using a combination of ANOVA and Lasso methods. Relevant features were used to classify the eight activities performed using the Random Forest (RF), Support Vector Machine (SVM) and Logistic Model Tree (LMT) algorithms. The LMT (79-92% classification accuracy) outperformed RF (32-43%) and SVM algorithms (27-40%), obtaining strongest performance using the full model (accelerometer and gyroscope inputs). Processing time can be reduced through feature selection methods (range 1.5-30.2%), however a trade-off exists between classification accuracy and processing time. Movement capture duration also had little impact on classification accuracy or processing time. In sporting scenarios where wearable tracking devices are employed, it is both possible and feasible to accurately classify team sport-related activities. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Cancer Classification Based on Support Vector Machine Optimized by Particle Swarm Optimization and Artificial Bee Colony.

    Science.gov (United States)

    Gao, Lingyun; Ye, Mingquan; Wu, Changrong

    2017-11-29

    Intelligent optimization algorithms have advantages in dealing with complex nonlinear problems accompanied by good flexibility and adaptability. In this paper, the FCBF (Fast Correlation-Based Feature selection) method is used to filter irrelevant and redundant features in order to improve the quality of cancer classification. Then, we perform classification based on SVM (Support Vector Machine) optimized by PSO (Particle Swarm Optimization) combined with ABC (Artificial Bee Colony) approaches, which is represented as PA-SVM. The proposed PA-SVM method is applied to nine cancer datasets, including five datasets of outcome prediction and a protein dataset of ovarian cancer. By comparison with other classification methods, the results demonstrate the effectiveness and the robustness of the proposed PA-SVM method in handling various types of data for cancer classification.

  3. The Classification of Tongue Colors with Standardized Acquisition and ICC Profile Correction in Traditional Chinese Medicine.

    Science.gov (United States)

    Qi, Zhen; Tu, Li-Ping; Chen, Jing-Bo; Hu, Xiao-Juan; Xu, Jia-Tuo; Zhang, Zhi-Feng

    2016-01-01

    Background and Goal . The application of digital image processing techniques and machine learning methods in tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied nowadays. However, it is difficult for the outcomes to generalize because of lack of color reproducibility and image standardization. Our study aims at the exploration of tongue colors classification with a standardized tongue image acquisition process and color correction. Methods . Three traditional Chinese medical experts are chosen to identify the selected tongue pictures taken by the TDA-1 tongue imaging device in TIFF format through ICC profile correction. Then we compare the mean value of L * a * b * of different tongue colors and evaluate the effect of the tongue color classification by machine learning methods. Results . The L * a * b * values of the five tongue colors are statistically different. Random forest method has a better performance than SVM in classification. SMOTE algorithm can increase classification accuracy by solving the imbalance of the varied color samples. Conclusions . At the premise of standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible.

  4. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN, naive Bayes, and support vector machine (SVM. Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT and moving window technique (MWT is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.

  5. An object-oriented classification method of high resolution imagery based on improved AdaTree

    International Nuclear Information System (INIS)

    Xiaohe, Zhang; Liang, Zhai; Jixian, Zhang; Huiyong, Sang

    2014-01-01

    With the popularity of the application using high spatial resolution remote sensing image, more and more studies paid attention to object-oriented classification on image segmentation as well as automatic classification after image segmentation. This paper proposed a fast method of object-oriented automatic classification. First, edge-based or FNEA-based segmentation was used to identify image objects and the values of most suitable attributes of image objects for classification were calculated. Then a certain number of samples from the image objects were selected as training data for improved AdaTree algorithm to get classification rules. Finally, the image objects could be classified easily using these rules. In the AdaTree, we mainly modified the final hypothesis to get classification rules. In the experiment with WorldView2 image, the result of the method based on AdaTree showed obvious accuracy and efficient improvement compared with the method based on SVM with the kappa coefficient achieving 0.9242

  6. Land use and land cover classification for rural residential areas in China using soft-probability cascading of multifeatures

    Science.gov (United States)

    Zhang, Bin; Liu, Yueyan; Zhang, Zuyu; Shen, Yonglin

    2017-10-01

    A multifeature soft-probability cascading scheme to solve the problem of land use and land cover (LULC) classification using high-spatial-resolution images to map rural residential areas in China is proposed. The proposed method is used to build midlevel LULC features. Local features are frequently considered as low-level feature descriptors in a midlevel feature learning method. However, spectral and textural features, which are very effective low-level features, are neglected. The acquisition of the dictionary of sparse coding is unsupervised, and this phenomenon reduces the discriminative power of the midlevel feature. Thus, we propose to learn supervised features based on sparse coding, a support vector machine (SVM) classifier, and a conditional random field (CRF) model to utilize the different effective low-level features and improve the discriminability of midlevel feature descriptors. First, three kinds of typical low-level features, namely, dense scale-invariant feature transform, gray-level co-occurrence matrix, and spectral features, are extracted separately. Second, combined with sparse coding and the SVM classifier, the probabilities of the different LULC classes are inferred to build supervised feature descriptors. Finally, the CRF model, which consists of two parts: unary potential and pairwise potential, is employed to construct an LULC classification map. Experimental results show that the proposed classification scheme can achieve impressive performance when the total accuracy reached about 87%.

  7. An Improved Cloud Classification Algorithm for China's FY-2C Multi-Channel Images Using Artificial Neural Network.

    Science.gov (United States)

    Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang

    2009-01-01

    The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China's first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3-11.3 μm; IR2, 11.5-12.5 μm and WV 6.3-7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN network identified from this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products.

  8. An Improved Cloud Classification Algorithm for China’s FY-2C Multi-Channel Images Using Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Chun-Xiang Shi

    2009-07-01

    Full Text Available The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China’s first operational geostationary meteorological satellite FengYun-2C (FY-2C data. First, the capabilities of six widely-used Artificial Neural Network (ANN methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA and a Support Vector Machine (SVM, using 2864 cloud s