WorldWideScience

Sample records for machine svm classification

  1. Optimization of Support Vector Machine (SVM) for Object Classification

    Science.gov (United States)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single-kernel SVM known as SVMlight, and a modified version known as an SVM with K-Means Clustering, were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  2. Image Reconstruction Using Pixel Wise Support Vector Machine SVM Classification.

    Directory of Open Access Journals (Sweden)

    Mohammad Mahmudul Alam Mia

    2015-02-01

    Image reconstruction using the support vector machine (SVM) has been one of the major parts of image processing. The accuracy of a supervised image classification is a function of the training data used in its generation. In this paper we study the support vector machine for classification and reconstruct an image using it. First, the values of randomly chosen pixels are used as input to the SVM classifier. Then the SVM classifier is trained using those pixel values. Finally, the image is reconstructed after cross-validation with the trained SVM classifier. The Matlab results show that training with the support vector machine produces better results with great computational efficiency: only a few minutes of runtime are necessary for training. The support vector machine has high classification accuracy and much faster convergence. The overall classification accuracy is 99.5. Our experiments show that classification accuracy depends mostly on the choice of kernel function, and that a good estimate of the kernel parameters is critical for a given image.

  3. TV-SVM: Total Variation Support Vector Machine for Semi-Supervised Data Classification

    OpenAIRE

    Bresson, Xavier; Zhang, Ruiliang

    2012-01-01

    We introduce semi-supervised data classification algorithms based on total variation (TV), Reproducing Kernel Hilbert Space (RKHS), support vector machine (SVM), Cheeger cut, labeled and unlabeled data points. We design binary and multi-class semi-supervised classification algorithms. We compare the TV-based classification algorithms with the related Laplacian-based algorithms, and show that TV classification performs significantly better when the number of labeled data points is small.

  4. Image Reconstruction Using Multi Layer Perceptron MLP And Support Vector Machine SVM Classifier And Study Of Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Shovasis Kumar Biswas

    2015-02-01

    The support vector machine (SVM) and the back-propagation neural network (BPNN) have been applied successfully in many areas, for example rule extraction, classification and evaluation. In this paper we study the back-propagation algorithm for training the multilayer artificial neural network and a support vector machine for data classification and image reconstruction. A model based on an SVM with a Gaussian RBF kernel is used here for data classification. The back-propagation neural network is viewed as one of the most straightforward and most general methods for supervised training of multilayered neural networks. We compare an SVM with a BPNN on the tasks of data classification and image reconstruction, and compare the multi-class classification performance of the two learning methods. From this comparison we conclude that the classification accuracy of the support vector machine is better, and that the algorithm is much faster than the MLP with the back-propagation algorithm.

  5. Efficient HIK SVM learning for image classification.

    Science.gov (United States)

    Wu, Jianxin

    2012-10-01

    Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.
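
    As a point of reference for the record above, the histogram intersection kernel is usually defined as the sum of the bin-wise minima of two histograms. The following is a minimal sketch of that kernel and its Gram matrix only; it does not implement the ICD solver described in the abstract, and the function names and sample histograms are hypothetical.

```python
def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of the bin-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))


def gram_matrix(histograms):
    """Pairwise kernel values, the form a kernel SVM solver consumes."""
    return [[histogram_intersection(h1, h2) for h2 in histograms]
            for h1 in histograms]


if __name__ == "__main__":
    # Three made-up 3-bin histograms (illustrative data only).
    hists = [[3, 0, 1], [1, 2, 1], [0, 4, 0]]
    for row in gram_matrix(hists):
        print(row)
```

    A Gram matrix of this kind can be handed to any SVM solver that accepts a precomputed kernel.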

  6. GenSVM: a generalized multiclass support vector machine

    NARCIS (Netherlands)

    G.J.J. van den Burg (Gerrit); P.J.F. Groenen (Patrick)

    2016-01-01

    Traditional extensions of the binary support vector machine (SVM) to multiclass problems are either heuristics or require solving a large dual optimization problem. Here, a generalized multiclass SVM is proposed called GenSVM. In this method classification boundaries for a K-class proble

  7. Seizure prediction using polynomial SVM classification.

    Science.gov (United States)

    Zisheng Zhang; Parhi, Keshab K

    2015-08-01

    This paper presents a novel patient-specific algorithm for prediction of seizures in epileptic patients with low hardware complexity and low power consumption. In the proposed approach, we first compute the spectrogram of the input fragmented EEG signals from a few electrodes. Each fragmented data clip is ten minutes in duration. Band powers, relative spectral powers and ratios of spectral powers are extracted as features. The features are then subjected to electrode selection and feature selection using a classification and regression tree. The baseline experiment uses all features from the selected electrodes, and these features are then subjected to a radial basis function kernel support vector machine (RBF-SVM) classifier. The proposed method further selects a small number of features from the selected electrodes and trains a polynomial support vector machine (SVM) classifier of degree 2 on these features. Prediction performances are compared between the baseline experiment and the proposed method. The algorithm is tested using intra-cranial EEG (iEEG) from the American Epilepsy Society Seizure Prediction Challenge database. The baseline experiment, using a large number of features and an RBF-SVM, achieves a sensitivity of 100% and an average area under curve (AUC) of 0.9985, while the proposed algorithm, using only a small number of features and a polynomial SVM of degree 2, achieves a sensitivity of 100.0% and an average AUC of 0.9795. For both experiments, only 10% of the available training data are used for training.
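
    The two metrics reported in the record above can be computed as follows. This is a minimal sketch using the standard definitions (sensitivity as the true-positive rate, AUC as the probability that a positive case outscores a negative one); the function names, the 0/1 label convention and the sample values are assumptions for illustration, not the authors' evaluation code.

```python
def sensitivity(y_true, y_pred):
    """Fraction of positive cases (label 1) that are predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]           # made-up ground truth
    scores = [0.9, 0.4, 0.5, 0.1]   # made-up classifier scores
    print(sensitivity(labels, [1, 0, 0, 0]), auc(labels, scores))
```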

  8. Online LS-SVM for function estimation and classification

    Institute of Scientific and Technical Information of China (English)

    Jianghua Liu; Jia-pin Chen; Shan Jiang; Junshi Cheng

    2003-01-01

    An online algorithm for training LS-SVM (Least Squares Support Vector Machines) was proposed for function estimation and classification. Online LS-SVM means that the LS-SVM can be trained in an incremental way, and can be pruned in a decremental way to obtain a sparse approximation. When an SV (Support Vector) is added or removed, the online algorithm avoids computing a large-scale matrix inverse, which reduces the computation cost. The online algorithm is especially useful for realistic function estimation problems such as system identification. Experiments on benchmark function estimation and classification problems show the validity of this online algorithm.
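
    For context, a batch LS-SVM is trained by solving a single linear system, which is the cost the online algorithm above is designed to avoid repeating. The sketch below shows only that batch solve in its function-estimation form (with labels of -1 and +1 as targets); it is not the incremental or decremental update from the paper, and the RBF kernel, the parameter values gamma and sigma, and all function names are assumptions.

```python
import math


def rbf(u, v, sigma=1.0):
    """Gaussian RBF kernel between two points."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization on the diagonal
        A.append(row)
    z = solve(A, [0.0] + [float(t) for t in y])
    return z[1:], z[0]  # alpha, b


def lssvm_predict(X, alpha, b, x, sigma=1.0):
    """Sign of the decision function at x."""
    f = sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X)) + b
    return 1 if f >= 0 else -1


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (3, 3), (3, 4)]  # made-up training points
    y = [-1, -1, 1, 1]
    alpha, b = lssvm_fit(X, y)
    print(lssvm_predict(X, alpha, b, (0, 0.5)), lssvm_predict(X, alpha, b, (3, 3.5)))
```

    Every training point receives a nonzero coefficient in this formulation, which is why pruning is needed to obtain a sparse model.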

  9. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Science.gov (United States)

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition. Among the classifiers, support vector machine (SVM) might be the best classifier. However, SVM is a classifier for two classes. When it is used for multi-classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computation consumption of SVM. Experiments of NC-SVM classification for OPCCR have been done. The results of the experiments have shown that the NC-SVM we proposed can effectively reduce the computation time in OPCCR.

  10. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Directory of Open Access Journals (Sweden)

    Jie Zhang

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition. Among the classifiers, support vector machine (SVM) might be the best classifier. However, SVM is a classifier for two classes. When it is used for multi-classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computation consumption of SVM. Experiments of NC-SVM classification for OPCCR have been done. The results of the experiments have shown that the NC-SVM we proposed can effectively reduce the computation time in OPCCR.

  11. Big Data Classification Using the SVM Classifiers with the Modified Particle Swarm Optimization and the SVM Ensembles

    Directory of Open Access Journals (Sweden)

    Liliya Demidova

    2016-05-01

    The development of support vector machine (SVM) classifiers using a modified particle swarm optimization (PSO) algorithm, and of their ensembles, is considered. Solving this problem would allow high-precision data classification, especially Big Data classification, with acceptable time expenditure. The modified PSO algorithm simultaneously searches for the type of kernel function, the parameters of the kernel function and the value of the regularization parameter for the SVM classifier. The idea of particles' «regeneration» serves as the basis for the modified PSO algorithm. In the implementation of this algorithm, some particles change the type of their kernel function to the one that corresponds to the particle with the best classification accuracy. The proposed PSO algorithm reduces the time needed to develop the SVM classifiers, which is very important for the Big Data classification problem. In most cases such an SVM classifier provides high-quality data classification. In exceptional cases, SVM ensembles based on the decorrelation maximization algorithm, with different decision-making strategies for data classification and the majority vote rule, can be used. A two-level SVM classifier is also proposed. It works as a group of SVM classifiers at the first level and as an SVM classifier based on the modified PSO algorithm at the second level. The results of experimental studies confirm the efficiency of the proposed approaches for Big Data classification.
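
    The majority vote rule mentioned in the record above can be sketched in a few lines. This covers only the voting step, under the assumption that each ensemble member has already produced one label per sample; the decorrelation maximization step and the PSO search are not shown, and the function name and data layout are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine labels from several classifiers.

    predictions[k][i] is the label classifier k assigns to sample i.
    On a tie, the label that was seen first among the tied ones wins.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must label every sample")
    voted = []
    for i in range(n_samples):
        counts = Counter(p[i] for p in predictions)
        voted.append(counts.most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    # Three made-up classifiers labelling three samples.
    print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 1, 1]]))
```

    Using an odd number of members avoids ties in the two-class case.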

  12. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    The existing material classification is proposed to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers the material quality character, quality cost, and quality influence. The Analytic Hierarchy Process helps with feature selection and classification decisions. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for the various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and the BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  13. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  14. Hybrid model based on Genetic Algorithms and SVM applied to variable selection within fruit juice classification.

    Science.gov (United States)

    Fernandez-Lozano, C; Canto, C; Gestal, M; Andrade-Garda, J M; Rabuñal, J R; Dorado, J; Pazos, A

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using the SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  15. Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.

    Science.gov (United States)

    Subasi, Abdulhamit

    2013-06-01

    Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridizes particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments classified EMG signals as normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders.

  16. New KF-PP-SVM classification method for EEG in brain-computer interfaces.

    Science.gov (United States)

    Yang, Banghua; Han, Zhijun; Zan, Peng; Wang, Qian

    2014-01-01

    Classification methods are a crucial direction in the current study of brain-computer interfaces (BCIs). To improve the classification accuracy for electroencephalogram (EEG) signals, a novel KF-PP-SVM (kernel fisher, posterior probability, and support vector machine) classification method is developed. Its detailed process entails the use of common spatial patterns to obtain features, based on which the within-class scatter is calculated. Then the scatter is added into the kernel function of a radial basis function to construct a new kernel function. This new kernel is integrated into the SVM to obtain a new classification model. Finally, the output of SVM is calculated based on posterior probability and the final recognition result is obtained. To evaluate the effectiveness of the proposed KF-PP-SVM method, EEG data collected from laboratory are processed with four different classification schemes (KF-PP-SVM, KF-SVM, PP-SVM, and SVM). The results showed that the overall average improvements arising from the use of the KF-PP-SVM scheme as opposed to KF-SVM, PP-SVM and SVM schemes are 2.49%, 5.83% and 6.49%, respectively.

  17. A NEW SVM BASED EMOTIONAL CLASSIFICATION OF IMAGE

    Institute of Scientific and Technical Information of China (English)

    Wang Weining; Yu Yinglin; Zhang Jianchao

    2005-01-01

    How high-level emotional representation of art paintings can be inferred from perceptual level features suited for the particular classes (dynamic vs. static classification) is presented. The key points are feature selection and classification. According to the strong relationship between notable lines of image and human sensations, a novel feature vector WLDLV (Weighted Line Direction-Length Vector) is proposed, which includes both orientation and length information of lines in an image. Classification is performed by SVM (Support Vector Machine) and images can be classified into dynamic and static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

  18. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process and some frameworks are based on neural networks. All of these models take rule-based decisions to generate security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering event features. The support vector machine is a superior classifier of data. Here we use the SVM to detect the closed items of the rule-based technique. Our proposed method is simulated on the KDD1999 DARPA data set and obtains better empirical evaluation results than the rule-based technique and the neural network model.

  20. sw-SVM: sensor weighting support vector machines for EEG-based brain-computer interfaces.

    Science.gov (United States)

    Jrad, N; Congedo, M; Phlypo, R; Rousseau, S; Flamary, R; Yger, F; Rakotomamonjy, A

    2011-10-01

    In many machine learning applications, like brain-computer interfaces (BCI), high-dimensional sensor array data are available. Sensor measurements are often highly correlated and signal-to-noise ratio is not homogeneously spread across sensors. Thus, collected data are highly variable and discrimination tasks are challenging. In this work, we focus on sensor weighting as an efficient tool to improve the classification procedure. We present an approach integrating sensor weighting in the classification framework. Sensor weights are considered as hyper-parameters to be learned by a support vector machine (SVM). The resulting sensor weighting SVM (sw-SVM) is designed to satisfy a margin criterion, that is, the generalization error. Experimental studies on two data sets are presented, a P300 data set and an error-related potential (ErrP) data set. For the P300 data set (BCI competition III), for which a large number of trials is available, the sw-SVM proves to perform equivalently with respect to the ensemble SVM strategy that won the competition. For the ErrP data set, for which a small number of trials is available, the sw-SVM shows superior performance compared to three state-of-the-art approaches. Results suggest that the sw-SVM promises to be useful in event-related potentials classification, even with a small number of training trials.

  1. Density Based Support Vector Machines for Classification

    Directory of Open Access Journals (Sweden)

    Zahra Nazari

    2015-04-01

    Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for binary classification) of training points. However, the training points sometimes include less meaningful samples, called outliers, which are corrupted by noise or lie on the wrong side. These outliers affect the margin and the classification performance, and the machine would do better to discard them. SVM, as a popular and widely used classification algorithm, is very sensitive to these outliers and lacks the ability to discard them. Many research results demonstrate this sensitivity, which is a weak point of SVM. Different approaches have been proposed to reduce the effect of outliers, but no method is suitable for all types of data sets. In this paper, a new method, Density Based SVM (DBSVM), is introduced. Population density is the basic concept used in this method, for both linear and non-linear SVM, to detect outliers. Experiments on artificial data sets, real high-dimensional benchmark data sets of liver disorder and heart disease, and data sets of acoustic signals from new and fatigued banknotes demonstrate the efficiency of this method on noisy data classification and the better generalization it can provide compared to the standard SVM.

  2. Hyperspectral remote sensing image classification based on combined SVM and LDA

    Science.gov (United States)

    Zhang, Chunsen; Zheng, Yiwei

    2014-11-01

    This paper presents a novel method for hyperspectral image classification based on the minimum noise fraction (MNF) and an approach combining support vector machine (SVM) and linear discriminant analysis (LDA). A new SVM/LDA algorithm is used for the classification. First, we use the MNF method to reduce the dimension and extract features of the image, and then use the SVM/LDA algorithm to transform the extracted features. Next, we train on the result of the transformation and optimize the parameters through cross-validation and grid search to obtain an optimal hyperspectral image classifier. Finally, we use this classifier to complete the classification. In order to verify the proposed method, the AVIRIS Indian Pines image was used. The experimental results show that the proposed method can resolve the contradiction between the small number of samples and the high dimension, and improves classification accuracy compared to the classical SVM method.

  3. [Application of SVM and wavelet analysis in EEG classification].

    Science.gov (United States)

    Zhao, Jianlin; Zhou, Weidong; Liu, Kai; Cai, Dongmei

    2011-04-01

    We employed two methods of support vector machines (SVM) combined with two kinds of wavelet analysis to classify EEG signals, on the basis of the different profiles, energy, and frequency characteristics of the EEG during seizures. One method was to classify these signals using waveform characteristics of the EEG signal. The other was to classify these signals based on the fluctuation index and variation coefficient of the EEG signal. We compared the classification accuracies of these two methods with the intermittent EEG and epileptic EEG. The results of the experiments showed that both methods for distinguishing epileptic EEG and interictal EEG can achieve an effective performance. It was also confirmed that the latter, the method based on the fluctuation index and variation coefficient, gives better classification.

  4. Incremental learning with SVM for multimodal classification of prostatic adenocarcinoma.

    Science.gov (United States)

    García Molina, José Fernando; Zheng, Lei; Sertdemir, Metin; Dinter, Dietmar J; Schönberg, Stefan; Rädle, Matthias

    2014-01-01

    Robust detection of prostatic cancer is a challenge due to the multitude of variants and their representation in MR images. We propose a pattern recognition system with an incremental learning ensemble algorithm using support vector machines (SVM) tackling this problem employing multimodal MR images and a texture-based information strategy. The proposed system integrates anatomic, texture, and functional features. The data set was preprocessed using B-Spline interpolation, bias field correction and intensity standardization. First- and second-order angular independent statistical approaches and rotation invariant local phase quantization (RI-LPQ) were utilized to quantify texture information. An incremental learning ensemble SVM was implemented to suit working conditions in medical applications and to improve effectiveness and robustness of the system. The probability estimation of cancer structures was calculated using SVM and the corresponding optimization was carried out with a heuristic method together with a 3-fold cross-validation methodology. We achieved an average sensitivity of 0.844 ± 0.068 and a specificity of 0.780 ± 0.038, which yielded superior or similar performance to current state of the art using a total database of only 41 slices from twelve patients with histological confirmed information, including cancerous, unhealthy non-cancerous and healthy prostate tissue. Our results show the feasibility of an ensemble SVM being able to learn additional information from new data while preserving previously acquired knowledge and preventing unlearning. The use of texture descriptors provides more salient discriminative patterns than the functional information used. Furthermore, the system improves selection of information, efficiency and robustness of the classification. The generated probability map enables radiologists to have a lower variability in diagnosis, decrease false negative rates and reduce the time to recognize and delineate structures in

  5. Incremental learning with SVM for multimodal classification of prostatic adenocarcinoma.

    Directory of Open Access Journals (Sweden)

    José Fernando García Molina

    Robust detection of prostatic cancer is a challenge due to the multitude of variants and their representation in MR images. We propose a pattern recognition system with an incremental learning ensemble algorithm using support vector machines (SVM) tackling this problem employing multimodal MR images and a texture-based information strategy. The proposed system integrates anatomic, texture, and functional features. The data set was preprocessed using B-Spline interpolation, bias field correction and intensity standardization. First- and second-order angular independent statistical approaches and rotation invariant local phase quantization (RI-LPQ) were utilized to quantify texture information. An incremental learning ensemble SVM was implemented to suit working conditions in medical applications and to improve effectiveness and robustness of the system. The probability estimation of cancer structures was calculated using SVM and the corresponding optimization was carried out with a heuristic method together with a 3-fold cross-validation methodology. We achieved an average sensitivity of 0.844 ± 0.068 and a specificity of 0.780 ± 0.038, which yielded superior or similar performance to current state of the art using a total database of only 41 slices from twelve patients with histological confirmed information, including cancerous, unhealthy non-cancerous and healthy prostate tissue. Our results show the feasibility of an ensemble SVM being able to learn additional information from new data while preserving previously acquired knowledge and preventing unlearning. The use of texture descriptors provides more salient discriminative patterns than the functional information used. Furthermore, the system improves selection of information, efficiency and robustness of the classification. The generated probability map enables radiologists to have a lower variability in diagnosis, decrease false negative rates and reduce the time to recognize and

  6. [LLE-SVM classification of apple mealiness based on hyperspectral scattering image].

    Science.gov (United States)

    Zhao, Gui-lin; Zhu, Qi-bing; Huang, Min

    2010-10-01

    Apple mealiness degree is an important factor for its internal quality. Hyperspectral scattering, as a promising technique, was investigated for noninvasive measurement of apple mealiness. In the present paper, locally linear embedding (LLE) coupled with a support vector machine (SVM) was proposed to achieve classification because of the large amount of image data. LLE is a nonlinear dimension reduction method, which reveals the structure of the global nonlinearity through local linear joints. This method can effectively compute the embedding of high-dimensional input data in a low-dimensional manifold. The dimension-reduced hyperspectral data were classified by SVM. Comparing the LLE-SVM classification method with the traditional SVM classification, the results indicated that the training accuracy obtained with the LLE-SVM was higher than that with SVM alone, and that the testing accuracy of the classifier changed little before and after dimensionality reduction, with a range of fluctuation of less than 5%. It is expected that the LLE-SVM method would provide an effective classification method for nondestructive detection of apple mealiness using the hyperspectral scattering image technique.

  7. [Hyperspectral remote sensing image classification based on SVM optimized by clonal selection].

    Science.gov (United States)

    Liu, Qing-Jie; Jing, Lin-Hai; Wang, Meng-Fei; Lin, Qi-Zhong

    2013-03-01

    Model selection for a support vector machine (SVM), which involves selecting the kernel and margin parameter values, is usually time-consuming and greatly affects both the training efficiency of the SVM model and the final classification accuracy of an SVM hyperspectral remote sensing image classifier. First, based on combinatorial optimization theory and the cross-validation method, an artificial immune clonal selection algorithm is introduced for the optimal selection of the SVM kernel parameter a and margin parameter C (CSSVM) to improve the training efficiency of the SVM model. Then an experiment classifying AVIRIS data from the Indian Pines site in the USA was performed to test the novel CSSVM, with a traditional SVM classifier using the general grid-search cross-validation method (GSSVM) for comparison. Evaluation indexes including SVM model training time, overall classification accuracy (OA) and the Kappa index of both CSSVM and GSSVM were analyzed quantitatively. The OA of CSSVM on the test samples and on the whole image is 85.1% and 81.58%, respectively, both within 0.08% of the GSSVM values; the Kappa indexes reach 0.8213 and 0.7728, both within 0.001 of the GSSVM values; and the ratio of the model training time of CSSVM to that of GSSVM is between 1/6 and 1/10. Therefore, CSSVM is a fast and accurate algorithm for hyperspectral image classification and is superior to GSSVM.
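
    The two accuracy indexes reported above, overall accuracy (OA) and the Kappa index, can both be derived from a confusion matrix. The following is a minimal sketch of that arithmetic; the function name and the example counts are illustrative assumptions and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (OA, kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Illustrative counts only (hypothetical two-class result).
example = [[45, 5],
           [10, 40]]
oa, kappa = overall_accuracy_and_kappa(example)
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```

    Kappa discounts the agreement that would be expected by chance, which is why it is typically lower than OA for the same classifier.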

  8. Discrimination of Rice Varieties using LS-SVM Classification Algorithms and Hyperspectral Data

    Directory of Open Access Journals (Sweden)

    Jin Xiaming

    2015-03-01

    Full Text Available Fast discrimination of rice varieties plays a key role in the rice processing industry and benefits the management of rice in the supermarket. In order to discriminate rice varieties in a fast and nondestructive way, hyperspectral technology and several classification algorithms were used in this study. The hyperspectral data of 250 rice samples of 5 varieties were obtained using a FieldSpec®3 spectrometer. Multiplicative Scatter Correction (MSC) was used to preprocess the raw spectra. Principal Component Analysis (PCA) was used to reduce the dimension of the raw spectra. To investigate the influence of different linear and non-linear classification algorithms on the discrimination results, K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Least Square Support Vector Machine (LS-SVM) were used to develop the discrimination models respectively. Then the performances of these three multivariate classification methods were compared according to the discrimination accuracy. The number of Principal Components (PCs), the K parameter of KNN and the kernel function of SVM or LS-SVM were optimized by cross-validation in the corresponding models. One hundred and twenty-five rice samples (25 of each variety) were chosen as the calibration set and the remaining 125 rice samples formed the prediction set. The experimental results showed that the optimal number of PCs was 8 and the cross-validation accuracies of KNN (K = 2), SVM and LS-SVM were 94.4, 96.8 and 100%, respectively, while the prediction accuracies of KNN (K = 2), SVM and LS-SVM were 89.6, 93.6 and 100%, respectively. The results indicated that LS-SVM performed the best in the discrimination of rice varieties.

  9. [Classification technique for hyperspectral image based on subspace of bands feature extraction and LS-SVM].

    Science.gov (United States)

    Gao, Heng-zhen; Wan, Jian-wei; Zhu, Zhen-zhen; Wang, Li-bao; Nian, Yong-jian

    2011-05-01

    The present paper proposes a novel hyperspectral image classification algorithm based on LS-SVM (least squares support vector machine). The LS-SVM uses features extracted from subspaces of bands (SOBs). The maximum noise fraction (MNF) method is adopted as the feature extraction method. The spectral correlations of the hyperspectral image are used to divide the feature space into several SOBs. MNF is then used to extract the characteristic features of each SOB, and the extracted features are combined into the feature vector for classification. In this way strong correlation between bands is avoided and spectral redundancy is reduced. The LS-SVM classifier, which replaces the inequality constraints of SVM with equality constraints, is adopted, so the computational cost is reduced and the learning performance is improved. The proposed method optimizes spectral information through feature extraction, reduces spectral noise and improves classifier performance. Experimental results show the superiority of the proposed algorithm.

  10. Support Vector Machine for mechanical faults classification

    Institute of Scientific and Technical Information of China (English)

    JIANG Zhi-qiang; FU Han-guang; LI Ling-jun

    2005-01-01

    Support Vector Machine (SVM) is a machine learning algorithm based on Statistical Learning Theory (SLT) that can achieve good classification results with only a few learning samples. SVM represents a new approach to pattern classification and has been shown to be particularly successful in many fields such as image identification and face recognition. It also provides a new method for developing intelligent fault diagnosis. This paper presents an SVM-based approach for fault diagnosis of rolling bearings. Experiments with bearing vibration signals were conducted. The vibration signals acquired from the bearings were used directly in the calculation, without a feature extraction preprocessing step. Compared with the Artificial Neural Network (ANN) based method, the SVM-based method has desirable advantages. A multi-fault SVM classifier built from binary classifiers is also constructed for gear faults in this paper. Other experiments with gear fault samples showed that the multi-fault SVM classifier has good classification ability and high efficiency in mechanical systems. It is suitable for online diagnosis of mechanical systems.

  11. Support vector machine imbalanced data classification based on weighted clustering centroid

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    Classification of imbalanced data has become a hot research topic in machine learning. Traditional classification algorithms assume that different classes have a balanced distribution or equal misclassification cost, which makes it hard to obtain ideal classification results. A support vector machine (SVM) classification method based on weighted clustering centroids was proposed in this paper. First, unsupervised clustering was applied to the positive and negative samples separately to extract the centroid of each cluster, which best represents the compactness of the samples in that cluster. Next, all clustering centroids formed a new balanced training set. In order to minimize the information loss during clustering, each clustering centroid was associated with a weight factor defined as proportional to the number of samples it represents. Finally, all clustering centroids and weight factors participated in the training of the improved SVM model. Experimental results show that the proposed method makes the samples selected for model training more representative and improves classification performance more than other sampling techniques for dealing with imbalanced data.
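
    The centroid-and-weight step described above can be sketched as follows. This is a hedged illustration only: it assumes the cluster assignments within one class are already available, and it assumes each weight is the share of that class's samples falling in the cluster. The function name and the toy points are hypothetical; the weighted SVM training itself is not shown.

```python
def weighted_centroids(clusters):
    """Summarize the clusters of ONE class as (centroid, weight) pairs.

    clusters: list of clusters, each a non-empty list of feature vectors.
    weight:   cluster size divided by the class size (an assumption made
              here; the abstract only states that the weight is
              proportional to a sample count).
    """
    class_size = sum(len(cluster) for cluster in clusters)
    summary = []
    for cluster in clusters:
        dim = len(cluster[0])
        centroid = [sum(point[d] for point in cluster) / len(cluster)
                    for d in range(dim)]
        summary.append((centroid, len(cluster) / class_size))
    return summary


# Toy majority class split into two clusters (illustrative values).
majority_clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0)]]
summary = weighted_centroids(majority_clusters)
```

    Running the same routine on the minority class with an equal number of clusters would give a balanced set of weighted centroids to pass to a cost-sensitive SVM.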

  12. CyNetSVM: A Cytoscape App for Cancer Biomarker Identification Using Network Constrained Support Vector Machines

    OpenAIRE

    Shi, Xu; Banerjee, Sharmi; Chen, Li; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua

    2017-01-01

    One of the important tasks in cancer research is to identify biomarkers and build classification models for clinical outcome prediction. In this paper, we develop a CyNetSVM software package, implemented in Java and integrated with Cytoscape as an app, to identify network biomarkers using network-constrained support vector machines (NetSVM). The Cytoscape app of NetSVM is specifically designed to improve the usability of NetSVM with the following enhancements: (1) user-friendly graphical user...

  13. Sensitivity of Support Vector Machine Classification to Various Training Features

    Directory of Open Access Journals (Sweden)

    Fuling Bian

    2013-07-01

    Full Text Available Remote sensing image classification is one of the most important techniques in image interpretation, which can be used for environmental monitoring, evaluation and prediction. Many algorithms have been developed for image classification in the literature. The support vector machine (SVM) is a supervised classification method that has been widely used recently. The classification accuracy produced by SVM may vary depending on the choice of training features. In this paper, SVM was used for land cover classification using Quickbird images. Spectral and textural features were extracted for the classification and the results were analyzed thoroughly. Results showed that using more features in SVM did not necessarily give better results. Different features are suitable for extracting different types of land cover. This study verifies the effectiveness and robustness of SVM in the classification of high spatial resolution remote sensing images.

  14. Multi-view L2-SVM and its multi-view core vector machine.

    Science.gov (United States)

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets.

  15. Classification using least squares support vector machine for reliability analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-wei GUO; Guang-chen BAI

    2009-01-01

    In order to improve the efficiency of the support vector machine (SVM) for classification when dealing with a large number of samples, the least squares support vector machine (LSSVM) for classification is introduced into reliability analysis. To reduce the computational cost, the solution of the SVM is transformed from a quadratic programming problem into a set of linear equations. The numerical results indicate that the reliability method based on the LSSVM for classification has higher accuracy and requires less computational cost than the SVM method.
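
    To illustrate how an LSSVM classifier reduces training to a set of linear equations, the sketch below builds and solves the standard LS-SVM system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1], with Omega_ij = y_i y_j K(x_i, x_j). It is a minimal pure-Python illustration under assumptions: a linear kernel, a small dense solver, and hypothetical function names and toy data, none of which come from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_kernel(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def train_lssvm(points, labels, gamma=10.0, kernel=linear_kernel):
    """Return (bias, alphas) by solving the LS-SVM linear system."""
    n = len(points)
    system = [[0.0] + [float(y) for y in labels]]
    for i in range(n):
        row = [float(labels[i])]
        for j in range(n):
            omega = labels[i] * labels[j] * kernel(points[i], points[j])
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        system.append(row)
    solution = solve_linear_system(system, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def predict_lssvm(x, points, labels, bias, alphas, kernel=linear_kernel):
    score = bias + sum(a * y * kernel(x, p)
                       for a, y, p in zip(alphas, labels, points))
    return 1 if score >= 0 else -1


# Toy, linearly separable data (illustrative only).
train_x = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
train_y = [-1, -1, 1, 1]
bias, alphas = train_lssvm(train_x, train_y)
```

    Unlike the standard SVM, every training point generally receives a non-zero multiplier here, so the solution is not sparse; that is the usual trade-off for replacing the quadratic program with a linear solve.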

  16. Semi-supervised SVM for individual tree crown species classification

    Science.gov (United States)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  17. Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

    Science.gov (United States)

    Zhang, Qigui; Deng, Kai

    2016-12-01

    As portable digital cameras and camera phones become more and more popular, it is equally pressing to meet people's need to photograph, identify and store handwritten characters at any time. In this paper, a genetic algorithm (GA) and a support vector machine (SVM) are used for handwriting identification. Compared with the parameter-optimized method, this technique overcomes two defects: first, it is easy to become trapped in a local optimum; second, searching for the best parameters over a larger range affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.

  18. Analysis of dengue infection based on Raman spectroscopy and support vector machine (SVM).

    Science.gov (United States)

    Khan, Saranjam; Ullah, Rahat; Khan, Asifullah; Wahab, Noorul; Bilal, Muhammad; Ahmed, Mushtaq

    2016-06-01

    The current study presents the use of Raman spectroscopy combined with support vector machine (SVM) for the classification of dengue suspected human blood sera. Raman spectra for 84 clinically dengue suspected patients acquired from Holy Family Hospital, Rawalpindi, Pakistan, have been used in this study. The spectral differences between dengue positive and normal sera have been exploited by using effective machine learning techniques. In this regard, SVM models built on the basis of three different kernel functions, namely the Gaussian radial basis function (RBF), the polynomial function and the linear function, have been employed to classify the human blood sera based on features obtained from the Raman spectra. The classification models have been evaluated with the 10-fold cross validation method. In the present study, the best performance has been achieved for the polynomial kernel of order 1. A diagnostic accuracy of about 85% with a precision of 90%, sensitivity of 73% and specificity of 93% has been achieved under these conditions.
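
    The four figures quoted above (accuracy, precision, sensitivity and specificity) all follow from the counts of true and false positives and negatives. The following minimal sketch shows the definitions; the counts used are hypothetical and do not reproduce the study's confusion matrix.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),      # positive predictive value
        "sensitivity": tp / (tp + fn),    # true positive rate (recall)
        "specificity": tn / (tn + fp),    # true negative rate
    }


# Hypothetical counts chosen only to illustrate the formulas.
metrics = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```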

  19. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

    Science.gov (United States)

    Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

    2013-03-01

    Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context, amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in an overall accuracy of ∼80% with 10-fold cross validation using the 40 top-ranked molecular descriptors selected out of a total of 314 descriptors. Moreover, the SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptor space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding overfitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors.

  20. Wavelet-SVM classifier based on texture features for land cover classification

    Science.gov (United States)

    Zhang, Ning; Wu, Bingfang; Zhu, Jianjun; Zhou, Yuemin; Zhu, Liang

    2008-12-01

    Texture features are recognized as a special cue in images, representing the spatial relations of gray-level pixels. The application of texture analysis to image classification is now widespread. Combined with wavelet multi-resolution analysis or support vector machine statistical learning theory, texture analysis can further improve classification quality. In this paper, we focus on the land cover of the Three Gorges reservoir using SPOT-5 remote sensing data, and a new classification method, a wavelet-SVM classifier based on texture features, is employed. Compared with the traditional maximum likelihood classifier and an SVM classifier that use only spectral features, this method produces more accurate classification results. A best texture group is selected from several texture features according to the actual land cover of the Three Gorges reservoir. The image is decomposed at different levels, which is one of the main advantages of wavelets, and the texture features are computed in every sub-image. To eliminate redundancy, the texture features are then concentrated in the first principal components using principal component analysis. Finally, with the first principal components as input, the classification result is obtained using SVM at every decomposition scale. A problem that cannot be overlooked is how to select the best SVM parameters, so an iterative rule based on classification accuracy is introduced: the higher the accuracy, the more suitable the parameters.

  1. Reducing Support Vector Machine Classification Error by Implementing Kalman Filter

    Directory of Open Access Journals (Sweden)

    Muhsin Hassan

    2013-08-01

    Full Text Available The aim of this paper is to demonstrate the capability of the Kalman Filter to reduce Support Vector Machine classification errors in classifying pipeline corrosion depth. In pipeline defect classification, it is important to increase the accuracy of the SVM classification in order to avoid misclassification, which can lead to greater problems in monitoring pipeline defects and predicting pipeline leakage. In this paper, it is found that noisy data can greatly affect the performance of SVM. Hence, a Kalman Filter + SVM hybrid technique has been proposed as a solution to reduce SVM classification errors. Additive White Gaussian Noise has been added to the datasets in several stages to study the effect of noise on SVM classification accuracy. Three techniques have been studied in this experiment, namely SVM, a hybrid of Discrete Wavelet Transform + SVM and a hybrid of Kalman Filter + SVM. Experimental results have been compared to find the most promising technique among them. MATLAB simulations show that the Kalman Filter and Support Vector Machine combination in a single system produced higher accuracy than the other two techniques.
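
    As an illustration of the denoising stage in such a hybrid, the sketch below applies a scalar Kalman filter to a noisy measurement sequence before it would be passed to a classifier. It assumes a constant-state model with hand-picked noise variances; the function name, parameter values and data are hypothetical and are not the configuration used in the paper.

```python
def kalman_filter_1d(measurements, process_var=1e-4, measurement_var=0.1,
                     initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a (nearly) constant underlying value."""
    estimate, variance = initial_estimate, initial_var
    filtered = []
    for z in measurements:
        # Predict: the state is assumed constant, only uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Noisy readings scattered around a true value of 1.0 (illustrative).
noisy = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95]
filtered = kalman_filter_1d(noisy)
```

    The filtered sequence, rather than the raw one, would then be used to build the feature vectors for SVM training and prediction.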

  2. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao

    2016-04-01

    Full Text Available With the enrichment of perception methods, the modern transportation system has many physical objects whose states are influenced by many information factors, so it is a typical Cyber-Physical System (CPS). Thus, traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research shows that accurately classifying multi-sourced traffic information during information fusion can achieve better parameter forecasting performance. To solve the problem of accurate traffic information classification, a multi-classification method using an improved SVM in information fusion for traffic parameter forecasting is proposed; it analyses the characteristics of multi-sourced traffic information and uses a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion. An experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method yields more accurate and practical outcomes.

  3. Hybrid Support Vector Machines-Based Multi-fault Classification

    Institute of Scientific and Technical Information of China (English)

    GAO Guo-hua; ZHANG Yong-zhong; ZHU Yu; DUAN Guang-huang

    2007-01-01

    Support Vector Machines (SVM) is a new general machine-learning tool based on the structural risk minimization principle. This characteristic is very significant for fault diagnostics when the number of fault samples is limited. Considering that SVM theory was originally designed for two-class classification, a hybrid SVM scheme is proposed for multi-fault classification of rotating machinery in our paper. Two SVM strategies, 1-v-1 (one versus one) and 1-v-r (one versus rest), are adopted at different classification levels. At the parallel classification level, using the 1-v-1 strategy, the fault features extracted by various signal analysis methods are fed into multiple parallel SVMs and the local classification results are obtained. At the serial classification level, these local results are fused by one serial SVM based on the 1-v-r strategy. The hybrid SVM scheme introduced in our paper not only generalizes the performance of single binary SVMs but also improves the precision and reliability of the fault classification results. Actual test results show the suitability of this new method.
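
    The two multiclass strategies named above can be sketched in a few lines. In this hedged illustration the trained binary SVMs are replaced by simple nearest-prototype stand-ins on a single feature, and the class names and prototype values are hypothetical; it shows only the 1-v-1 voting rule and the 1-v-r maximum-score rule, not the paper's two-level fusion scheme.

```python
from collections import Counter
from itertools import combinations

# Hypothetical class prototypes on one feature axis (illustration only).
CENTERS = {"normal": 0.0, "imbalance": 1.0, "misalignment": 2.0}
CLASSES = list(CENTERS)


def make_pair_classifier(a, b):
    """Stand-in for a trained binary SVM separating class a from class b."""
    def classify(x):
        return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b
    return classify


def make_rest_scorer(c):
    """Stand-in for the decision value of a class-c-versus-rest SVM."""
    return lambda x: -abs(x - CENTERS[c])


PAIRWISE = {(a, b): make_pair_classifier(a, b)
            for a, b in combinations(CLASSES, 2)}
SCORERS = {c: make_rest_scorer(c) for c in CLASSES}


def one_vs_one_predict(x):
    """1-v-1: every pairwise classifier votes; the most-voted class wins."""
    votes = Counter(classifier(x) for classifier in PAIRWISE.values())
    return max(CLASSES, key=lambda c: votes[c])


def one_vs_rest_predict(x):
    """1-v-r: the class whose classifier gives the largest score wins."""
    return max(CLASSES, key=lambda c: SCORERS[c](x))
```

    For k classes the 1-v-1 strategy needs k(k-1)/2 binary classifiers while 1-v-r needs only k, which is one practical reason for mixing the two at different levels.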

  4. Machine Learning Algorithms in Web Page Classification

    Directory of Open Access Journals (Sweden)

    W.A.AWAD

    2012-11-01

    Full Text Available In this paper we use machine learning algorithms such as SVM, KNN and GIS to perform a behavior comparison on the web page classification problem. From the experiment we see that SVM with a small number of negative documents to build the centroids has the smallest storage requirement and the lowest online test computation cost. But almost all GIS variants with different numbers of nearest neighbors have an even higher storage requirement and online test computation cost than KNN. This suggests that future work should try to reduce the storage requirement and online test cost of GIS.

  5. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification.

    Science.gov (United States)

    She, Qingshan; Ma, Yuliang; Meng, Ming; Luo, Zhizeng

    2015-01-01

    Motor imagery electroencephalography is widely used in the brain-computer interface systems. Due to inherent characteristics of electroencephalography signals, accurate and real-time multiclass classification is always challenging. In order to solve this problem, a multiclass posterior probability solution for twin SVM is proposed by the ranking continuous output and pairwise coupling in this paper. First, two-class posterior probability model is constructed to approximate the posterior probability by the ranking continuous output techniques and Platt's estimating method. Secondly, a solution of multiclass probabilistic outputs for twin SVM is provided by combining every pair of class probabilities according to the method of pairwise coupling. Finally, the proposed method is compared with multiclass SVM and twin SVM via voting, and multiclass posterior probability SVM using different coupling approaches. The efficacy on the classification accuracy and time complexity of the proposed method has been demonstrated by both the UCI benchmark datasets and real world EEG data from BCI Competition IV Dataset 2a, respectively.
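
    The two ingredients mentioned above, Platt's sigmoid estimate for a binary classifier and the coupling of pairwise probabilities into multiclass posteriors, can be illustrated as follows. This is a sketch under assumptions: the sigmoid parameters are taken as already fitted, and the coupling uses a simple closed-form rule, p_i proportional to 1 / (sum over j != i of 1/r_ij - (k - 2)), rather than the iterative pairwise coupling procedure the paper may have used. All names and numbers are illustrative.

```python
import math


def platt_probability(decision_value, a, b):
    """Platt's sigmoid: P(y = 1 | f) = 1 / (1 + exp(a * f + b)).

    a and b are assumed to have been fitted on held-out decision values.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def couple_pairwise(r):
    """Combine pairwise probabilities into multiclass posteriors.

    r[i][j] estimates P(class i | class i or j, x), with r[j][i] = 1 - r[i][j].
    Diagonal entries are ignored.
    """
    k = len(r)
    raw = []
    for i in range(k):
        denom = sum(1.0 / r[i][j] for j in range(k) if j != i) - (k - 2)
        raw.append(1.0 / denom)
    total = sum(raw)
    return [p / total for p in raw]


# Consistency check with pairwise values built from known posteriors.
true_posteriors = [0.5, 0.3, 0.2]
pairwise = [[0.5 if i == j else pi / (pi + pj)
             for j, pj in enumerate(true_posteriors)]
            for i, pi in enumerate(true_posteriors)]
coupled = couple_pairwise(pairwise)
```

    When the pairwise estimates are mutually consistent, as in this constructed example, the rule recovers the original posteriors; with real classifier outputs it gives an approximation that is then normalized to sum to one.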

  6. Incremental Training for SVM-Based Classification with Keyword Adjusting

    Institute of Scientific and Technical Information of China (English)

    SUN Jin-wen; YANG Jian-wu; LU Bin; XIAO Jian-guo

    2004-01-01

    This paper analyzed the theory of incremental learning for SVM (support vector machine) and pointed out a shortcoming of current research on SVM incremental learning: only support vector optimization is considered. Given the significance of keywords in training, a new incremental training method with keyword adjusting was proposed, which eliminates the difference between incremental learning and batch learning through keyword adjusting. The experimental results show that the improved method outperforms the method without keyword adjusting and achieves the same precision as the batch method.

  7. Classification of skin cancer images using local binary pattern and SVM classifier

    Science.gov (United States)

    Adjed, Faouzi; Faye, Ibrahima; Ababsa, Fakhreddine; Gardezi, Syed Jamal; Dass, Sarat Chandra

    2016-11-01

    In this paper, a classification method for melanoma and non-melanoma skin cancer images has been presented using the local binary patterns (LBP). The LBP computes the local texture information from the skin cancer images, which is later used to compute some statistical features that have capability to discriminate the melanoma and non-melanoma skin tissues. Support vector machine (SVM) is applied on the feature matrix for classification into two skin image classes (malignant and benign). The method achieves good classification accuracy of 76.1% with sensitivity of 75.6% and specificity of 76.7%.
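
    For readers unfamiliar with local binary patterns, the sketch below computes the basic 8-neighbor LBP code and a 256-bin histogram for a grayscale image stored as a list of rows. The neighbor ordering (clockwise from the top-left, first neighbor as the least significant bit) and the greater-or-equal threshold are conventions assumed here; the paper may use a different LBP variant, and the further statistical features it derives are not shown.

```python
# Neighbor offsets, clockwise starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic 3x3 LBP code of the pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Tiny illustrative patch with a single interior pixel (value 6).
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

    The histogram, or statistics computed from it, would form one row of the feature matrix passed to the SVM.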

  8. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-01-01

    Full Text Available Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.
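
    A bare-bones particle swarm optimizer for tuning two SVM parameters might look like the sketch below. It is an illustration under stated assumptions: the search is over (log2 C, log2 gamma), the objective is a smooth stand-in for a cross-validation error surface rather than a real SVM evaluation, and the inertia and acceleration constants are common textbook choices, not values from the paper. Feature selection, which the paper also performs with PSO, is not covered.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iterations=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]          # personal bests
    best_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                    + c2 * r2 * (global_pos[d] - positions[i][d]))
                lo, hi = bounds[d]
                # Move and clamp to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d]
                                              + velocities[i][d]))
            value = objective(positions[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < global_val:
                    global_pos, global_val = positions[i][:], value
    return global_pos, global_val


def stand_in_cv_error(params):
    """Hypothetical error surface with its minimum at log2 C = 3, log2 gamma = -2."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(stand_in_cv_error,
                                       bounds=[(-5.0, 15.0), (-15.0, 3.0)])
```

    In practice the stand-in objective would be replaced by a function that trains an SVM with the candidate parameters and returns its cross-validated error, which makes each evaluation far more expensive than in this toy setting.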

  9. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems.

    Science.gov (United States)

    Cho, Ming-Yuan; Hoang, Thi Thom

    2017-01-01

    Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  10. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    Full Text Available At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists and lacks objective diagnostic methods, which leads to a higher rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and a mutation particle swarm optimization algorithm is proposed. To address the problem that the particle swarm optimization (PSO) algorithm is easily trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) to balance local search and global exploration ability, so that the parameters of the classification model are optimal. We compared the depression classification accuracy of different PSO mutation algorithms and found that the support vector machine (SVM) classifier based on the feedback mutation PSO algorithm achieves the highest accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for depression recognition in clinical diagnosis.

  11. SEMI-SUPERVISED RADIO TRANSMITTER CLASSIFICATION BASED ON ELASTIC SPARSITY REGULARIZED SVM

    Institute of Scientific and Technical Information of China (English)

    Hu Guyu; Gong Yong; Chen Yande; Pan Zhisong; Deng Zhantao

    2012-01-01

    Non-collaborative radio transmitter recognition is a significant but challenging issue, since it is hard or costly to obtain labeled training data samples. In order to make effective use of the unlabeled samples, which can be obtained much more easily, a novel semi-supervised classification method named Elastic Sparsity Regularized Support Vector Machine (ESRSVM) is proposed for radio transmitter classification. ESRSVM first constructs an elastic-net graph over data samples to capture the robust and natural discriminating information and then incorporates the information into the manifold learning framework through an elastic sparsity regularization term. Experimental results on 10 GMSK modulated Automatic Identification System radios and 15 FM walkie-talkie radios show that ESRSVM achieves clearly better performance than KNN and SVM, which use only labeled samples for classification, and also outperforms the semi-supervised classifier LapSVM based on manifold regularization.

  12. A hybrid particle swarm optimization-SVM classification for automatic cardiac auscultation

    Directory of Open Access Journals (Sweden)

    Prasertsak Charoen

    2017-04-01

    Full Text Available Cardiac auscultation is a method in which a doctor listens to heart sounds, using a stethoscope, to examine the condition of the heart. Automatic cardiac auscultation with machine learning is a promising technique for classifying heart conditions without the need for doctors or specialist expertise. In this paper, we develop a classification model based on support vector machine (SVM) and particle swarm optimization (PSO) for an automatic cardiac auscultation system. The model consists of two parts: a heart sound signal processing part and a proposed PSO for weighted SVM (WSVM) classifier part. In this method, the PSO takes into account the degree of importance of each feature extracted from wavelet packet (WP) decomposition. Then, by using principal component analysis (PCA), the features can be selected. The PSO technique is used to assign diverse weights to different features for the WSVM classifier. Experimental results show that both continuous and binary PSO-WSVM models achieve better classification accuracy on the heart sound samples, by reducing system false negatives (FNs), compared to traditional SVM and genetic algorithm (GA) based SVM.

  13. SVM CLASSIFICATION: ITS CONTENTS AND CHALLENGES

    Institute of Scientific and Technical Information of China (English)

    岳士弘; 李平; 郝沛毅

    2003-01-01

    SVMs (support vector machines) have become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. In particular, they exhibit good generalization performance on many real problems, and the approach is well motivated theoretically. There are relatively few free parameters to adjust, and the architecture of the learning machine does not need to be found by experimentation. This paper surveys the key contents of this subject, focusing on the most well-known models based on kernel substitution, namely SVM, as well as the currently active fields and development trends.

  14. The Application of Support Vector Machine (SVM) Using Cielab Color Model, Color Intensity and Color Constancy as Features for Ortho Image Classification of Benthic Habitats in Hinatuan, Surigao del Sur, Philippines

    Science.gov (United States)

    Cubillas, J. E.; Japitana, M.

    2016-06-01

    This study demonstrates the application of CIELAB, Color intensity, and One Dimensional Scalar Constancy as features for image recognition and classifying benthic habitats in an image with the coastal areas of Hinatuan, Surigao Del Sur, Philippines as the study area. The study area is composed of four datasets, namely: (a) Blk66L005, (b) Blk66L021, (c) Blk66L024, and (d) Blk66L0114. SVM optimization was performed in Matlab® software with the help of Parallel Computing Toolbox to hasten the SVM computing speed. The image used for collecting samples for SVM procedure was Blk66L0114 in which a total of 134,516 sample objects of mangrove, possible coral existence with rocks, sand, sea, fish pens and sea grasses were collected and processed. The collected samples were then used as training sets for the supervised learning algorithm and for the creation of class definitions. The learned hyper-planes separating one class from another in the multi-dimensional feature space can be thought of as a super feature which will then be used in developing the C (classifier) rule set in eCognition® software. The classification results of the sampling site yielded an accuracy of 98.85% which confirms the reliability of remote sensing techniques and analysis employed to orthophotos like the CIELAB, Color Intensity and One dimensional scalar constancy and the use of SVM classification algorithm in classifying benthic habitats.

  15. THE APPLICATION OF SUPPORT VECTOR MACHINE (SVM) USING CIELAB COLOR MODEL, COLOR INTENSITY AND COLOR CONSTANCY AS FEATURES FOR ORTHO IMAGE CLASSIFICATION OF BENTHIC HABITATS IN HINATUAN, SURIGAO DEL SUR, PHILIPPINES

    Directory of Open Access Journals (Sweden)

    J. E. Cubillas

    2016-06-01

    Full Text Available This study demonstrates the application of CIELAB, Color intensity, and One Dimensional Scalar Constancy as features for image recognition and classifying benthic habitats in an image with the coastal areas of Hinatuan, Surigao Del Sur, Philippines as the study area. The study area is composed of four datasets, namely: (a) Blk66L005, (b) Blk66L021, (c) Blk66L024, and (d) Blk66L0114. SVM optimization was performed in Matlab® software with the help of Parallel Computing Toolbox to hasten the SVM computing speed. The image used for collecting samples for SVM procedure was Blk66L0114 in which a total of 134,516 sample objects of mangrove, possible coral existence with rocks, sand, sea, fish pens and sea grasses were collected and processed. The collected samples were then used as training sets for the supervised learning algorithm and for the creation of class definitions. The learned hyper-planes separating one class from another in the multi-dimensional feature space can be thought of as a super feature which will then be used in developing the C (classifier) rule set in eCognition® software. The classification results of the sampling site yielded an accuracy of 98.85% which confirms the reliability of remote sensing techniques and analysis employed to orthophotos like the CIELAB, Color Intensity and One dimensional scalar constancy and the use of SVM classification algorithm in classifying benthic habitats.

  16. Incremental Learning with SVM for Multimodal Classification of Prostatic Adenocarcinoma

    OpenAIRE

    José Fernando García Molina; Lei Zheng; Metin Sertdemir; Dietmar J Dinter; Stefan Schönberg; Matthias Rädle

    2014-01-01

    Robust detection of prostatic cancer is a challenge due to the multitude of variants and their representation in MR images. We propose a pattern recognition system with an incremental learning ensemble algorithm using support vector machines (SVM) tackling this problem employing multimodal MR images and a texture-based information strategy. The proposed system integrates anatomic, texture, and functional features. The data set was preprocessed using B-Spline interpolation, bias field correcti...

  17. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. Text pre-processing plays a major role in a text classifier, and efficient pre-processing techniques improve classifier performance. In this paper, we implement the ECAS stemmer, Efficient Instance Selection, and a Pre-computed Kernel Support Vector Machine for text classification using recent research articles. The ECAS stemmer finds root words, Efficient Instance Selection reduces the dimensionality of the text data, and the Pre-computed Kernel Support Vector Machine classifies the selected instances. Experiments were performed on 750 research articles in three classes: engineering articles, medical articles and educational articles. The EIS-SVM classifier provides better performance in real-time classification of research articles.

  18. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    Science.gov (United States)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.

  19. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. Supervised learning model: Support vector machine (SVM) and Sequential Minimal Optimization algorithm (SMO) for the training of SVM classifier were implemented. The SVM classifier was included in a client-server application which enables to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of the breast cancer. The sensitivity and specificity of SVM classifier were calculated based on the thermographic images from studies. Furthermore, the heuristic method for SVM's parameters tuning was proposed.

  20. Network Traffic Classification Using SVM Decision Tree

    Institute of Scientific and Technical Information of China (English)

    邱婧; 夏靖波; 柏骏

    2012-01-01

    In order to solve the problems of unrecognized regions and long training time that arise when the Support Vector Machine (SVM) method is used for network traffic classification, an SVM decision tree is applied to network traffic classification, exploiting its advantages in multi-class classification. Authoritative flow data sets were tested. The experimental results show that the SVM decision tree method has a shorter training time and better classification performance than the ordinary "one-versus-one" and "one-versus-rest" SVM methods in network traffic classification, and its classification accuracy can reach 98.8%.

  1. MULTI-LABEL CLASSIFICATION OF PRODUCT REVIEWS USING STRUCTURED SVM

    Directory of Open Access Journals (Sweden)

    Jincy B. Chrystal

    2015-05-01

    Full Text Available Most text classification problems are associated with multiple class labels, and hence automatic text classification is one of the most challenging and prominent research areas. Text classification is the problem of categorizing text documents into different classes. In the multi-label classification scenario, each document may have more than one label. The real challenge in multi-label classification is labelling a large number of text documents with a subset of class categories. The feature extraction and classification of such text documents require an efficient machine learning algorithm which performs automatic text classification. This paper describes the multi-label classification of product review documents using Structured Support Vector Machine.

  2. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    Science.gov (United States)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
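
    The binary tree structure described in the abstract above can be sketched as follows. This is a minimal illustration of the routing only, not the authors' implementation: each internal node holds a linear decision function sign(w·x + b), which is the form a trained linear SVM takes, but the weights below are hand-written stand-ins rather than trained values, and the feature names and sport labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    """External node: a final class label."""
    label: str


@dataclass
class Node:
    """Internal node: one binary linear decision function (stand-in for a trained SVM)."""
    weights: Sequence[float]
    bias: float
    negative: Union["Node", Leaf]  # branch taken when the decision value is < 0
    positive: Union["Node", Leaf]  # branch taken when the decision value is >= 0

    def decide(self, x: Sequence[float]) -> float:
        # Decision value w.x + b of a linear classifier.
        return sum(w * v for w, v in zip(self.weights, x)) + self.bias


def classify(tree: Union[Node, Leaf], x: Sequence[float]) -> str:
    """Route a feature vector from the root down to a leaf and return its label."""
    node = tree
    while isinstance(node, Node):
        node = node.positive if node.decide(x) >= 0 else node.negative
    return node.label


# Hypothetical two-feature example: x = [motion_intensity, green_ratio].
# The root separates low-motion from high-motion videos; the second node
# separates high-motion videos by the amount of green in the frame.
tree = Node(
    weights=[1.0, 0.0], bias=-0.5,
    negative=Leaf("snooker"),
    positive=Node(
        weights=[0.0, 1.0], bias=-0.5,
        negative=Leaf("basketball"),
        positive=Leaf("football"),
    ),
)

print(classify(tree, [0.8, 0.9]))
```

    A tree with k leaves needs only k - 1 binary classifiers, and a sample is evaluated only by the nodes on its root-to-leaf path, which is one way to read the "low computing cost" advantage claimed in the abstract.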

  3. CyNetSVM: A Cytoscape App for Cancer Biomarker Identification Using Network Constrained Support Vector Machines.

    Science.gov (United States)

    Shi, Xu; Banerjee, Sharmi; Chen, Li; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua

    2017-01-01

    One of the important tasks in cancer research is to identify biomarkers and build classification models for clinical outcome prediction. In this paper, we develop a CyNetSVM software package, implemented in Java and integrated with Cytoscape as an app, to identify network biomarkers using network-constrained support vector machines (NetSVM). The Cytoscape app of NetSVM is specifically designed to improve the usability of NetSVM with the following enhancements: (1) user-friendly graphical user interface (GUI), (2) computationally efficient core program and (3) convenient network visualization capability. The CyNetSVM app has been used to analyze breast cancer data to identify network genes associated with breast cancer recurrence. The biological function of these network genes is enriched in signaling pathways associated with breast cancer progression, showing the effectiveness of CyNetSVM for cancer biomarker identification. The CyNetSVM package is available at Cytoscape App Store and http://sourceforge.net/projects/netsvmjava; a sample data set is also provided at sourceforge.net.

  4. Advances of Support Vector Machines (SVM)

    Institute of Scientific and Technical Information of China (English)

    顾亚祥; 丁世飞

    2011-01-01

    Support vector machines (SVM), which are based on statistical learning theory, have attracted wide attention for their excellent learning ability. However, when dealing with large-scale quadratic programming (QP) problems, traditional SVMs suffer from long training times and low efficiency. This paper surveys recent progress in SVM training algorithms, analyzes and compares the main algorithms, and points out their advantages and disadvantages, focusing on two recent developments: the Fuzzy Support Vector Machine and the Granular Support Vector Machine. The two main applications of SVM, classification and regression, are then discussed. Finally, future research directions for SVM are outlined.

  5. Poster abstract: A machine learning approach for vehicle classification using passive infrared and ultrasonic sensors

    KAUST Repository

    Warriach, Ehsan Ullah

    2013-01-01

    This article describes the implementation of four different machine learning techniques for vehicle classification in a dual ultrasonic/passive infrared traffic flow sensor. Using k-NN, Naive Bayes, SVM and KNN-SVM algorithms, we show that KNN-SVM significantly outperforms other algorithms in terms of classification accuracy. We also show that some of these algorithms could run in real time on the prototype system. Copyright © 2013 ACM.

  6. Binary classification of ¹⁸F-flutemetamol PET using machine learning

    DEFF Research Database (Denmark)

    Vandenberghe, Rik; Nelissen, Natalie; Salmon, Eric

    2013-01-01

    (18)F-flutemetamol is a positron emission tomography (PET) tracer for in vivo amyloid imaging. The ability to classify amyloid scans in a binary manner as 'normal' versus 'Alzheimer-like', is of high clinical relevance. We evaluated whether a supervised machine learning technique, support vector...... machines (SVM), can replicate the assignments made by visual readers blind to the clinical diagnosis, which image components have highest diagnostic value according to SVM and how (18)F-flutemetamol-based classification using SVM relates to structural MRI-based classification using SVM within the same...

  7. Comparison of Advanced Pixel Based (ANN and SVM) and Object-Oriented Classification Approaches Using Landsat-7 ETM+ Data

    OpenAIRE

    Prasun Kumar Gupta; Gaurav Kalidas Pakhale

    2010-01-01

    In this study, the pixel-based and object-oriented image classification approaches were used for identifying different land use types in Karnal district. Imagery from Landsat-7 ETM with 6 spectral bands was used to perform the image classification. Ground truth data were collected from the available maps, personal knowledge and communication with the local people. To prepare the land use map, different approaches, Artificial Neural Network (ANN) and Support Vector Machine (SVM), were used. F...

  8. Classification of 5-HT1A receptor agonists and antagonists using GA-SVM method

    Institute of Scientific and Technical Information of China (English)

    Xue-lian ZHU; Hai-yan CAI; Zhi-jian XU; Yong WANG; He-yao WANG; Ao ZHANG; Wei-liang ZHU

    2011-01-01

    Aim: To construct a reliable computational model for the classification of agonists and antagonists of 5-HT1A receptor. Methods: Support vector machine (SVM), a well-known machine learning method, was employed to build a prediction model, and genetic algorithm (GA) was used to select the most relevant descriptors and to optimize two important parameters, C and r of the SVM model. The overall dataset used in this study comprised 284 ligands of the 5-HT1A receptor with diverse structures reported in the literature. Results: A SVM model was successfully developed that could be used to predict the probability of a ligand being an agonist or antagonist of the 5-HT1A receptor. The predictive accuracy for training and test sets was 0.942 and 0.865, respectively. For compounds with probability estimate higher than 0.7, the predictive accuracy of the model for training and test sets was 0.954 and 0.927, respectively. To further validate our model, the receiver operating characteristic (ROC) curve was plotted, and the Area-Under-the-ROC-Curve (AUC) value was calculated to be 0.883 for training set and 0.906 for test set. Conclusion: A reliable SVM model was successfully developed that could effectively distinguish agonists and antagonists among the ligands of the 5-HT1A receptor. To our knowledge, this is the first effort for the classification of 5-HT1A receptor agonists and antagonists based on a diverse dataset. This method may be used to classify the ligands of other members of the GPCR family.
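
    The AUC reported in the abstract above summarizes how well the classifier's probability estimates rank one class above the other. As a minimal sketch, assuming binary labels (1 for the positive class, 0 for the negative class) and one score per compound, the AUC can be computed as the fraction of positive/negative pairs that are ranked correctly, with ties counted as half. The function name and the toy data are illustrative and do not come from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0   # positive scored above negative
            elif p == n:
                correctly_ranked += 0.5   # ties count as half
    return correctly_ranked / (len(positives) * len(negatives))


# Toy data: two negatives and two positives with their predicted probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

    The pairwise form is quadratic in the number of samples, which is fine for a dataset of a few hundred ligands; a rank-based formulation would be preferable for large sets.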

  9. Evaluation of Effectiveness of Wavelet Based Denoising Schemes Using ANN and SVM for Bearing Condition Classification

    Directory of Open Access Journals (Sweden)

    Vijay G. S.

    2012-01-01

    Full Text Available The wavelet based denoising has proven its ability to denoise the bearing vibration signals by improving the signal-to-noise ratio (SNR) and reducing the root-mean-square error (RMSE). In this paper seven wavelet based denoising schemes have been evaluated based on the performance of the Artificial Neural Network (ANN) and the Support Vector Machine (SVM), for the bearing condition classification. The work consists of two parts, the first part in which a synthetic signal simulating the defective bearing vibration signal with Gaussian noise was subjected to these denoising schemes. The best scheme based on the SNR and the RMSE was identified. In the second part, the vibration signals collected from a customized Rolling Element Bearing (REB) test rig for four bearing conditions were subjected to these denoising schemes. Several time and frequency domain features were extracted from the denoised signals, out of which a few sensitive features were selected using the Fisher’s Criterion (FC). Extracted features were used to train and test the ANN and the SVM. The best denoising scheme identified, based on the classification performances of the ANN and the SVM, was found to be the same as the one obtained using the synthetic signal.
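
    The two figures of merit named in the abstract above, SNR and RMSE, can be computed for a synthetic signal whose clean version is known. The sketch below uses the common definitions (RMSE as the root of the mean squared residual, SNR in decibels as the ratio of clean-signal power to residual power); the paper's exact definitions are not given in the abstract, so these are an assumption, and the sample values are made up.

```python
import math
from typing import Sequence


def rmse(clean: Sequence[float], denoised: Sequence[float]) -> float:
    """Root-mean-square error between the clean signal and the denoised estimate."""
    residual_energy = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return math.sqrt(residual_energy / len(clean))


def snr_db(clean: Sequence[float], denoised: Sequence[float]) -> float:
    """SNR in dB: clean-signal power over residual power.

    Raises ZeroDivisionError if the estimate matches the clean signal exactly.
    """
    signal_energy = sum(c * c for c in clean)
    residual_energy = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return 10.0 * math.log10(signal_energy / residual_energy)


# Made-up example: a square-like signal and an estimate that is off by 0.1 everywhere.
clean_signal = [1.0, -1.0, 1.0, -1.0]
denoised_signal = [1.1, -0.9, 1.1, -0.9]
print(rmse(clean_signal, denoised_signal), snr_db(clean_signal, denoised_signal))
```

    A better denoising scheme gives a lower RMSE and a higher SNR on the same synthetic signal, which is the comparison the first part of the study relies on.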

  10. A Stock Market Prediction Method Based on Support Vector Machines (SVM) and Independent Component Analysis (ICA)

    Directory of Open Access Journals (Sweden)

    Hakob GRIGORYAN

    2016-08-01

    Full Text Available The research presented in this work focuses on the financial time series prediction problem. An integrated prediction model based on support vector machines (SVM) with independent component analysis (ICA), called SVM-ICA, is proposed for stock market prediction. The presented approach first uses the ICA technique to extract important features from the research data, and then applies the SVM technique to perform time series prediction. The results obtained from the SVM-ICA technique are compared with the results of an SVM-based model without any pre-processing step. In order to show the effectiveness of the proposed methodology, two different research data sets are used as illustrative examples. In the experiments, the root mean square error (RMSE) measure is used to evaluate the performance of the proposed models. The comparative analysis leads to the conclusion that the proposed SVM-ICA model outperforms the simple SVM-based model in forecasting nonstationary time series.

  11. Semi-supervised Learning for Classification of Polarimetric SAR Images Based on SVM-Wishart

    Directory of Open Access Journals (Sweden)

    Hua Wen-qiang

    2015-02-01

    Full Text Available In this study, we propose a new semi-supervised classification method for Polarimetric SAR (PolSAR) images, aimed at handling the case in which the training set is small. First, considering the scattering characteristics of PolSAR data, this method extracts multiple scattering features using a target decomposition approach. Then, a semi-supervised learning model is established based on a co-training framework and Support Vector Machine (SVM). Both labeled and unlabeled data are utilized in this model to obtain high classification accuracy. Third, a recovery scheme based on the Wishart classifier is proposed to improve the classification performance. The experiments conducted in this study show that the proposed method performs more effectively than other traditional methods when the training set is small.

  12. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a multiclass classification problem. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: a very large number of features (genes) and only a small number of samples. The application of machine learning to a microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, improved to handle multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are passed to separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.

  13. Classification of Convective and Stratiform Cells in Meteorological Radar Images Using SVM Based on a Textural Analysis

    Institute of Scientific and Technical Information of China (English)

    Abdenasser Djafri; Boualem Haddad

    2014-01-01

    This contribution deals with the discrimination between stratiform and convective cells in meteorological radar images. This study is based on a textural analysis of the latter and their classification using a support vector machine (SVM). First, we apply different textural parameters such as energy, entropy, inertia, and local homogeneity. Through this experience, we identify the different textural features of both the stratiform and convective cells. Then, we use an SVM to find the best discriminating parameter between the two types of clouds. The main goal of this work is to better apply the Palmer and Marshall Z-R relations specific to each type of precipitation.
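
    The four textural parameters named in the abstract above (energy, entropy, inertia, local homogeneity) are commonly computed from a gray-level co-occurrence matrix. The abstract does not say how the authors computed them, so the sketch below is an assumption: it uses the standard co-occurrence definitions on a small quantized image with a single pixel offset, and the function names and toy images are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple

Pair = Tuple[int, int]


def cooccurrence(image: Sequence[Sequence[int]], dx: int = 1, dy: int = 0) -> Dict[Pair, float]:
    """Normalized co-occurrence probabilities of gray-level pairs at offset (dx, dy)."""
    counts: Counter = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p: Dict[Pair, float]) -> Dict[str, float]:
    """Energy, entropy, inertia and local homogeneity of a co-occurrence distribution."""
    return {
        "energy": sum(v * v for v in p.values()),
        "entropy": -sum(v * math.log(v) for v in p.values()),
        "inertia": sum((i - j) ** 2 * v for (i, j), v in p.items()),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
    }


# Toy images with two gray levels: a flat patch and a checkerboard patch.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
print(texture_features(cooccurrence(flat)))
print(texture_features(cooccurrence(checker)))
```

    A smooth region concentrates the co-occurrence mass on the diagonal (high energy and homogeneity, low inertia), while a rapidly varying region spreads it out. Feature vectors of this kind are what an SVM would then separate into stratiform and convective classes.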

  14. Improved Sparse Multi-Class SVM and Its Application for Gene Selection in Cancer Classification.

    Science.gov (United States)

    Huang, Lingkang; Zhang, Hao Helen; Zeng, Zhao-Bang; Bushel, Pierre R

    2013-01-01

    Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables versus the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high dimensional low sample size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure utilizes all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties in learning to enforce solution sparsity. The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improved the method by introducing soft-thresholding type penalties to incorporate variable selection into multi-class classification for high dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among different variable selection schemes. High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and defining targets of therapeutic intervention. The source MATLAB code is available from http://math.arizona.edu/~hzhang/software.html.

  15. SVM-Maj: a majorization approach to linear support vector machines with different hinge errors

    NARCIS (Netherlands)

    P.J.F. Groenen (Patrick); G.I. Nalbantov (Georgi); J.C. Bioch (Cor)

    2007-01-01

    Support vector machines (SVM) are becoming increasingly popular for the prediction of a binary dependent variable. SVMs perform very well with respect to competing techniques. Often, the solution of an SVM is obtained by switching to the dual. In this paper, we stick to the primal suppor

  16. Human Behavior Classification Using Multi-Class Relevance Vector Machine

    Directory of Open Access Journals (Sweden)

    Yogameena, B.

    2010-01-01

    Full Text Available Problem statement: In computer vision and robotics, one of the typical tasks is to identify specific objects in an image and to determine each object’s position and orientation relative to a coordinate system. This study presented a Multi-class Relevance Vector Machine (RVM) classification algorithm which classifies different human poses from a single stationary camera for video surveillance applications. Approach: First the foreground blobs and their edges are obtained. Then the relevance vector machine classification scheme classified the normal and abnormal behavior. Results: The performance of the proposed method was compared with Support Vector Machine (SVM) and multi-class support vector machine. Experimental results showed the effectiveness of the method. Conclusion: It is evident that RVM has good accuracy and lower computational cost than SVM.

  17. Tree Crown Delineation on VHR Aerial Imagery with SVM Classification Technique Optimized by Taguchi Method: a Case Study in Zagros Woodlands

    Science.gov (United States)

    Erfanifard, Y.; Behnia, N.; Moosavi, V.

    2013-09-01

    The Support Vector Machine (SVM) is a theoretically superior machine learning methodology with great results in classification of remotely sensed datasets. Determination of optimal parameters applied in SVM is still vague to some scientists. In this research, it is suggested to use the Taguchi method to optimize these parameters. The objective of this study was to detect tree crowns on very high resolution (VHR) aerial imagery in Zagros woodlands by SVM optimized by Taguchi method. A 30 ha plot of Persian oak (Quercus persica) coppice trees was selected in Zagros woodlands, Iran. The VHR aerial imagery of the plot with 0.06 m spatial resolution was obtained from National Geographic Organization (NGO), Iran, to extract the crowns of Persian oak trees in this study. The SVM parameters were optimized by Taguchi method and thereafter, the imagery was classified by the SVM with optimal parameters. The results showed that the Taguchi method is a very useful approach to optimize the combination of parameters of SVM. It was also concluded that the SVM method could detect the tree crowns with a KHAT coefficient of 0.961 which showed a great agreement with the observed samples and overall accuracy of 97.7% that showed the accuracy of the final map. Finally, the authors suggest applying this method to optimize the parameters of classification techniques like SVM.

  18. Settlement Prediction of Road Soft Foundation Using a Support Vector Machine (SVM) Based on Measured Data

    Directory of Open Access Journals (Sweden)

    Yu Huiling

    2016-01-01

    Full Text Available The support vector machine (SVM) is a relatively new artificial intelligence technique which is increasingly being applied to geotechnical problems and is yielding encouraging results. SVM is a machine learning method based on statistical learning theory. A case study based on a road foundation engineering project shows that the forecast results are in good agreement with the measured data. The SVM model is also compared with a BP artificial neural network model and the traditional hyperbola method. The prediction results indicate that the SVM model has better prediction ability than the BP neural network model and the hyperbola method. Therefore, settlement prediction based on the SVM model can reflect the actual settlement process more accurately. The results indicate that the method is effective and feasible, and that the nonlinear mapping between foundation settlement and its influencing factors can be expressed well. It provides a new method for predicting foundation settlement.

  19. Image Classification Using PSO-SVM and an RGB-D Sensor

    Directory of Open Access Journals (Sweden)

    Carlos López-Franco

    2014-01-01

    Full Text Available Image classification is a process that depends on the descriptor used to represent an object. To create such descriptors we use object models with rich information about the distribution of points. The object model stage is improved with an optimization process that spreads the points that make up the mesh. In this paper, particle swarm optimization (PSO) is used to improve the model generation, while for the classification problem a support vector machine (SVM) is used. In order to measure the performance of the proposed method, a group of objects from a public RGB-D object data set has been used. Experimental results show that our approach improves the distribution of the model in the feature space, which reduces the number of support vectors obtained in the training process.

  20. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    Science.gov (United States)

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  1. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier

    Directory of Open Access Journals (Sweden)

    Qiang Li

    2017-01-01

    Full Text Available Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  2. Feature Selection Based on the SVM Weight Vector for Classification of Dementia.

    Science.gov (United States)

    Bron, Esther E; Smits, Marion; Niessen, Wiro J; Klein, Stefan

    2015-09-01

    Computer-aided diagnosis of dementia using a support vector machine (SVM) can be improved with feature selection. The relevance of individual features can be quantified from the SVM weights as a significance map (p-map). Although these p-maps previously showed clusters of relevant voxels in dementia-related brain regions, they have not yet been used for feature selection. Therefore, we introduce two novel feature selection methods based on p-maps using a direct approach (filter) and an iterative approach (wrapper). To evaluate these p-map feature selection methods, we compared them with methods based on the SVM weight vector directly, t-statistics, and expert knowledge. We used MRI data from the Alzheimer's disease neuroimaging initiative classifying Alzheimer's disease (AD) patients, mild cognitive impairment (MCI) patients who converted to AD (MCIc), MCI patients who did not convert to AD (MCInc), and cognitively normal controls (CN). Features for each voxel were derived from gray matter morphometry. Feature selection based on the SVM weights gave better results than t-statistics and expert knowledge. The p-map methods performed slightly better than those using the weight vector. The wrapper method scored better than the filter method. Recursive feature elimination based on the p-map improved most for AD-CN: the area under the receiver-operating-characteristic curve (AUC) significantly increased from 90.3% without feature selection to 92.0% when selecting 1.5%-3% of the features. This feature selection method also improved the other classifications: AD-MCI 0.1% improvement in AUC (not significant), MCI-CN 0.7%, and MCIc-MCInc 0.1% (not significant). Although the performance improvement due to feature selection was limited, the methods based on the p-map generally had the best performance, and were therefore better in estimating the relevance of individual features.
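    The abstract above compares p-map selection against ranking features by the SVM weight vector directly. The sketch below illustrates only that simpler baseline, not the authors' p-map method: a linear SVM is fitted by plain subgradient descent on the hinge loss, and features are ordered by absolute weight. The trainer, its hyperparameters (`lam`, `lr`, `epochs`), and the toy data are assumptions made for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularisation step: shrink every weight slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move towards classifying xi correctly.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def rank_features(w):
    """Return feature indices ordered from largest to smallest |weight|."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Toy data (made up): only feature 0 separates the two classes.
X = [[2.0, 0.1, 0.5], [1.5, -0.2, 0.4], [-2.0, 0.1, 0.6], [-1.5, -0.1, 0.5]]
y = [1, 1, -1, -1]

w, b = train_linear_svm(X, y)
ranking = rank_features(w)
print("features ordered by |w|:", ranking)
```

    A filter approach would keep the top-ranked fraction of `ranking` once; a wrapper approach such as recursive feature elimination would retrain and re-rank after each removal.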

  3. Integrated Features by Administering the Support Vector Machine (SVM) of Translational Initiations Sites in Alternative Polymorphic Contex

    Directory of Open Access Journals (Sweden)

    Nurul Arneida Husin

    2012-04-01

    Full Text Available Many algorithms and methods have been proposed for classification problems in bioinformatics. In this study, a discriminative approach, in particular support vector machines (SVM), is employed to recognize the studied TIS patterns. The discriminative approach learns discriminant functions from samples that have been labelled as positive or negative. After learning, the discriminant functions are employed to decide whether a new sample is true or false. In this study, SVM is employed to recognize the patterns of the studied translational initiation sites in alternative weak contexts. The method has been optimized with the best parameters selected: c=100, E=10^-6 and ex=2 for the nonlinear kernel function. Results show that with the top 5 features and a nonlinear kernel, the best prediction accuracy achieved is 95.8%. The J48 algorithm is applied for comparison with SVM using the top 15 features, and the results show a good prediction accuracy of 95.8%. This indicates that the top 5 features selected by the IGR method and used by the SVM are sufficient for the prediction of TIS in weak contexts.

  4. A new type SVM-projected SVM

    Institute of Scientific and Technical Information of China (English)

    ZHU Yongsheng; ZHANG Youyun

    2004-01-01

    Support vector machine (SVM), developed by Vapnik et al., is a new and promising technique for classification and regression and has been proved to be competitive with the best available learning machines in many applications. However, the classification speed of SVM is substantially slower than that of other techniques with similar generalization ability. A new type of SVM named projected SVM (PSVM), which is a combination of the feature vector selection (FVS) method and linear SVM (LSVM), is proposed in the present paper. In PSVM, the FVS method is first used to select a relevant subset (feature vectors, FVs) from the training data, then both the training data and the test data are projected into the subspace constructed by the FVs, and finally LSVM is applied to classify the projected data. The time required by PSVM to calculate the class of new samples is proportional to the number of FVs. In most cases, the number of FVs is smaller than that of support vectors (SVs), and therefore PSVM runs faster than SVM. Compared with other techniques for speeding up SVM, PSVM is proved to possess not only speeding-up ability but also de-noising ability for highly noisy data, and is found to be of potential use in mechanical fault pattern recognition.

  5. Sparse extreme learning machine for classification.

    Science.gov (United States)

    Bai, Zuo; Huang, Guang-Bin; Wang, Danwei; Wang, Han; Westover, M Brandon

    2014-10-01

    Extreme learning machine (ELM) was initially proposed for single-hidden-layer feedforward neural networks (SLFNs). In the hidden layer (feature mapping), nodes are randomly generated independently of training data. Furthermore, a unified ELM was proposed, providing a single framework to simplify and unify different learning methods, such as SLFNs, least square support vector machines, proximal support vector machines, and so on. However, the solution of unified ELM is dense, and thus, usually plenty of storage space and testing time are required for large-scale applications. In this paper, a sparse ELM is proposed as an alternative solution for classification, reducing storage space and testing time. In addition, unified ELM obtains the solution by matrix inversion, whose computational complexity is between quadratic and cubic with respect to the training size. It still requires plenty of training time for large-scale problems, even though it is much faster than many other traditional methods. In this paper, an efficient training algorithm is specifically developed for sparse ELM. The quadratic programming problem involved in sparse ELM is divided into a series of smallest possible sub-problems, each of which is solved analytically. Compared with SVM, sparse ELM obtains better generalization performance with much faster training speed. Compared with unified ELM, sparse ELM achieves similar generalization performance for binary classification applications, and when dealing with large-scale binary classification problems, sparse ELM realizes even faster training speed than unified ELM.

  6. SVM and ANN Based Classification of Plant Diseases Using Feature Reduction Technique

    Directory of Open Access Journals (Sweden)

    Jagadeesh D.Pujari

    2016-06-01

    Full Text Available Computers have been used for mechanization and automation in different applications of agriculture/horticulture. Critical decisions on agricultural yield and plant protection are supported by the development of expert systems (decision support systems) using computer vision techniques. One of the areas considered in the present work is the processing of images of plant diseases affecting agriculture/horticulture crops. The first symptoms of plant disease have to be correctly detected, identified, and quantified in the initial stages. Color and texture features have been used to work with the sample images of plant diseases. Algorithms for extraction of color and texture features have been developed, which are in turn used to train support vector machine (SVM) and artificial neural network (ANN) classifiers. The study presents a reduced feature set based approach for recognition and classification of images of plant diseases. The results reveal that the SVM classifier is more suitable for identification and classification of plant diseases affecting agriculture/horticulture crops.

  7. Stellar Spectral Classification with Minimum Within-Class and Maximum Between-Class Scatter Support Vector Machine

    Indian Academy of Sciences (India)

    Liu Zhong-Bao

    2016-06-01

    Support Vector Machine (SVM) is one of the important stellar spectral classification methods, and it is widely used in practice. But its classification efficiencies cannot be greatly improved because it does not take the class distribution into consideration. In view of this, a modified SVM named Minimum within-class and Maximum between-class scatter Support Vector Machine (MMSVM) is constructed to deal with the above problem. MMSVM merges the advantages of Fisher’s Discriminant Analysis (FDA) and SVM, and the comparative experiments on the Sloan Digital Sky Survey (SDSS) show that MMSVM performs better than SVM.

  8. Stellar Spectral Classification with Minimum Within-Class and Maximum Between-Class Scatter Support Vector Machine

    Science.gov (United States)

    Zhong-Bao, Liu

    2016-06-01

    Support Vector Machine (SVM) is one of the important stellar spectral classification methods, and it is widely used in practice. But its classification efficiencies cannot be greatly improved because it does not take the class distribution into consideration. In view of this, a modified SVM named Minimum within-class and Maximum between-class scatter Support Vector Machine (MMSVM) is constructed to deal with the above problem. MMSVM merges the advantages of Fisher's Discriminant Analysis (FDA) and SVM, and the comparative experiments on the Sloan Digital Sky Survey (SDSS) show that MMSVM performs better than SVM.

  9. SVM Classification of Urban High-Resolution Imagery Using Composite Kernels and Contour Information

    Directory of Open Access Journals (Sweden)

    Aissam Bekkari

    2013-08-01

    Full Text Available The classification of remote sensing images has advanced greatly, given the availability of images at different resolutions as well as an abundance of very efficient classification algorithms. A number of works have shown promising results by fusing spatial and spectral information using Support Vector Machines (SVM), a group of supervised classification algorithms that have recently been used in the remote sensing field. However, the addition of contour information to both spectral and spatial information is still less explored. For this purpose we propose a methodology exploiting the properties of Mercer’s kernels to construct a family of composite kernels that easily combine multi-spectral features and Haralick texture features as data sources. The composite kernel that gives the best results is then used to introduce contour information into the classification process. The proposed approach was tested on common scenes of urban imagery. The three different kernels tested allow a significant improvement in classification performance and the flexibility to balance the spatial and spectral information in the classifier. The experimental results indicate a global accuracy of 93.52%. The addition of contour information, described by Fourier descriptors, the Hough transform and Zernike moments, increases the global accuracy by 1.61%, which is very promising.
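    One common way to build a composite kernel of the kind described above is a weighted sum of a spectral kernel and a texture kernel; a convex combination of Mercer kernels is itself a Mercer kernel. The sketch below is a minimal illustration under that assumption. The abstract does not state which three composite kernels were tested, and the RBF base kernels, the weight `mu`, and the example pixel values are hypothetical.

```python
import math


def rbf_kernel(u, v, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def composite_kernel(x, z, mu=0.5, gamma_spectral=1.0, gamma_texture=1.0):
    """Weighted-sum kernel; x and z are (spectral, texture) feature pairs.

    mu in [0, 1] balances spectral against spatial (texture) information.
    """
    k_spectral = rbf_kernel(x[0], z[0], gamma_spectral)
    k_texture = rbf_kernel(x[1], z[1], gamma_texture)
    return mu * k_spectral + (1.0 - mu) * k_texture


# Two made-up pixels: three spectral bands plus two texture features each.
pixel_a = ([0.20, 0.40, 0.10], [0.7, 0.3])
pixel_b = ([0.25, 0.35, 0.15], [0.1, 0.9])

k_ab = composite_kernel(pixel_a, pixel_b, mu=0.6)
k_ba = composite_kernel(pixel_b, pixel_a, mu=0.6)
k_aa = composite_kernel(pixel_a, pixel_a, mu=0.6)
print(k_ab, k_aa)
```

    Setting `mu` close to 1 makes the classifier rely mostly on spectral similarity, while values close to 0 favour texture, which is one way to obtain the balance between spatial and spectral information mentioned in the abstract.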

  10. OFW-ITS-LSSVM: Weighted Classification by LS-SVM for Diabetes diagnosis

    Directory of Open Access Journals (Sweden)

    Fawzi Elias Bekri

    2012-03-01

    Full Text Available With today's fast-developing technology, every field is benefiting from machines in place of human involvement, and many changes and advances are becoming possible. This technology is likewise gaining importance in bioinformatics, especially for analysing data. Diabetes is one of the deadliest diseases prevalent today. In this paper we introduce LS-SVM classification to identify which blood datasets indicate a chance of developing diabetes. Further, from a patient’s details we can predict whether he or she has a chance of developing diabetes and, if so, suggest measures to cure or stop it. In this method, an optimal Tabu search model is suggested to reduce the chances of getting it in the future.

  11. Comparison of Advanced Pixel Based (ANN and SVM) and Object-Oriented Classification Approaches Using Landsat-7 ETM+ Data

    Directory of Open Access Journals (Sweden)

    Prasun Kumar Gupta

    2010-08-01

    Full Text Available In this study, pixel-based and object-oriented image classification approaches were used for identifying different land use types in Karnal district. Imagery from Landsat-7 ETM+ with 6 spectral bands was used to perform the image classification. Ground truth data were collected from the available maps, personal knowledge and communication with the local people. To prepare the land use map, different pixel-based approaches were used: Artificial Neural Network (ANN) and Support Vector Machine (SVM). For performing object-oriented classification, eCognition software was used. During the object-oriented classification, in the first step several different sets of parameters were used for image segmentation, and in the second step a nearest neighbor classifier was used for classification. The outcome of the classification shows that the object-oriented approach gave more accurate results (including higher producer’s and user’s accuracy for most of the land cover classes) than those achieved by the pixel-based classification algorithms. It is also observed that ANN performed better than the SVM classification approach.

  12. Classification of full-polarization ALOS-PALSAR imagery using SVM in arid area of Dunhuang

    Institute of Scientific and Technical Information of China (English)

    JunZhan Wang; JianJun Qu; WeiMin Zhang; KeCun Zhang

    2016-01-01

    Classification is an important process in the interpretation of synthetic aperture radar (SAR) imagery. As an advanced instrument for remote sensing, the polarimetric SAR has been applied widely in many fields. The main aim of this paper is to explore the ability of full-polarization SAR data in classification. The study area is a part of Dunhuang, Gansu Province, China. An L-band full-polarization image of Dunhuang which includes quad-polarization modes was acquired by the ALOS-PALSAR (Advanced Land Observing Satellite–the Phased Array type L-band Synthetic Aperture Radar). Firstly, new characteristic information was extracted by the difference operation, ratio operation, and principal component transform based on the full-polarization (HH, HV or VH, VV) mode SAR data. Then the single-, dual-, and full-polarization SAR data and the new SAR characteristic information were used to analyze the classification accuracy quantitatively based on Support Vector Machines (SVM). The results show that the overall classification accuracy of single-polarization SAR data is poor; the highest is 56.53%, for VV polarization. The overall classification accuracy of dual-polarization SAR is much better than that of single-polarization; the highest is 74.77%, for HV & VV polarization data. The overall classification accuracy of full-polarization SAR is 80.21%; adding the difference characteristic information, the ratio characteristic information and the first principal component (PC1) increased the overall accuracy by 3.09%, 3.38%, and 4.14%, respectively. When the full-polarization SAR data were combined with all of the characteristic information, the overall classification accuracy reached 91.01%. Combining the full-polarization SAR data with the band math characteristic information or PC1 can greatly improve classification accuracy.
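    The difference and ratio operations mentioned above can be sketched per pixel as follows. This is a minimal illustration only: the abstract does not say which channel pairs were differenced or divided, so the pairs used here, the function name `band_math_features`, and the backscatter values are assumptions. The principal component transform is omitted.

```python
def band_math_features(hh, hv, vv):
    """Return difference and ratio features for one pixel's backscatter values."""

    def ratio(a, b):
        # Guard against division by zero (e.g. pixels with no return signal).
        return a / b if b != 0 else 0.0

    return {
        "HH-HV": hh - hv,
        "HH-VV": hh - vv,
        "HV-VV": hv - vv,
        "HH/HV": ratio(hh, hv),
        "HH/VV": ratio(hh, vv),
        "HV/VV": ratio(hv, vv),
    }


# Made-up backscatter values for a single pixel.
pixel = {"HH": 0.5, "HV": 0.125, "VV": 0.25}
features = band_math_features(pixel["HH"], pixel["HV"], pixel["VV"])

# Original channels plus derived features form the classifier's input vector.
feature_vector = [pixel["HH"], pixel["HV"], pixel["VV"]] + list(features.values())
print(feature_vector)
```

    Stacking the derived features onto the original channels, as in `feature_vector`, is one way to give an SVM the extra characteristic information that the abstract reports as improving overall accuracy.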

  13. Laguerre Kernels –Based SVM for Image Classification

    Directory of Open Access Journals (Sweden)

    Ashraf Afifi

    2014-01-01

    Full Text Available Support vector machines (SVMs) have been promising methods for classification and regression analysis because of their solid mathematical foundations, which convey several salient properties that other methods hardly provide. However, the performance of SVMs is very sensitive to how the kernel function is selected; the challenge is to choose the kernel function for accurate data classification. In this paper, we introduce a set of new kernel functions derived from the generalized Laguerre polynomials. The proposed kernels could improve the classification accuracy of SVMs for both linear and nonlinear data sets. The proposed kernel functions satisfy Mercer’s condition and orthogonality properties, which are important and useful in some applications when the support vector number is needed, as in feature selection. The performance of the generalized Laguerre kernels is evaluated in comparison with the existing kernels. It was found that the choice of the kernel function and the values of the parameters for that kernel are critical for a given amount of data. The proposed kernels give good classification accuracy in nearly all the data sets, especially those of high dimensions.
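    The abstract does not give the form of the proposed kernels, so the sketch below covers only their building block: evaluating the generalized Laguerre polynomial L_n^(alpha)(x) with the standard three-term recurrence (k+1) L_{k+1} = (2k+1+alpha-x) L_k - (k+alpha) L_{k-1}, starting from L_0 = 1 and L_1 = 1 + alpha - x. The function name is an assumption, and no claim is made about how the authors combine these polynomials into a kernel.

```python
def generalized_laguerre(n, alpha, x):
    """Evaluate the generalized Laguerre polynomial L_n^(alpha)(x)."""
    if n == 0:
        return 1.0
    previous, current = 1.0, 1.0 + alpha - x
    for k in range(1, n):
        # Three-term recurrence linking degrees k-1, k and k+1.
        previous, current = (
            current,
            ((2 * k + 1 + alpha - x) * current - (k + alpha) * previous) / (k + 1),
        )
    return current


# L_2^(0)(x) = (x^2 - 4x + 2) / 2, evaluated at x = 1.
value = generalized_laguerre(2, 0.0, 1.0)
print(value)
```

    With alpha = 0 the function reduces to the ordinary Laguerre polynomials, which gives a simple way to check an implementation against tabulated values.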

  14. A Method of Soil Salinization Information Extraction with SVM Classification Based on ICA and Texture Features

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fei; TASHPOLAT Tiyip; KUNG Hsiang-te; DING Jian-li; MAMAT.Sawut; VERNER Johnson; HAN Gui-hong; GUI Dong-wei

    2011-01-01

    Salt-affected soils classification using remotely sensed images is one of the most common applications in remote sensing, and many algorithms have been developed and applied for this purpose in the literature. This study takes the Delta Oasis of Weigan and Kuqa Rivers as a study area and discusses the prediction of soil salinization from ETM+ Landsat data. It reports the Support Vector Machine (SVM) classification method based on Independent Component Analysis (ICA) and texture features. Meanwhile, the letter introduces the fundamental theory of the SVM algorithm and ICA, and then incorporates ICA and texture features. The classification result is compared with ICA-SVM classification, single data source SVM classification, maximum likelihood classification (MLC) and neural network classification qualitatively and quantitatively. The result shows that this method can effectively solve the problem of low accuracy and fracture classification result in single data source classification. It has high spread ability toward higher array input. The overall accuracy is 98.64%, which increases by 10.2% compared with maximum likelihood classification, even increases by 12.94% compared with neural net classification, and thus acquires good effectiveness. Therefore, the classification method based on SVM and incorporating the ICA and texture features can be adapted to RS image classification and monitoring of soil salinization.

  15. Deriving statistical significance maps for SVM based image classification and group comparisons.

    Science.gov (United States)

    Gaonkar, Bilwaj; Davatzikos, Christos

    2012-01-01

    Population based pattern analysis and classification for quantifying structural and functional differences between diverse groups has been shown to be a powerful tool for the study of a number of diseases, and is quite commonly used especially in neuroimaging. The alternative to these pattern analysis methods, namely mass univariate methods such as voxel based analysis and all related methods, cannot detect multivariate patterns associated with group differences, and are not particularly suitable for developing individual-based diagnostic and prognostic biomarkers. A commonly used pattern analysis tool is the support vector machine (SVM). Unlike univariate statistical frameworks for morphometry, analytical tools for statistical inference are unavailable for the SVM. In this paper, we show that null distributions ordinarily obtained by permutation tests using SVMs can be analytically approximated from the data. The analytical computation takes a small fraction of the time it takes to do an actual permutation test, thereby rendering it possible to quickly create statistical significance maps derived from SVMs. Such maps are critical for understanding imaging patterns of group differences and interpreting which anatomical regions are important in determining the classifier's decision.

  16. Nonlinear programming for classification problems in machine learning

    Science.gov (United States)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In the last years this field has become more and more relevant due to a lot of practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling etc. Classification deals with separation of sets by means of appropriate separation surfaces, which is generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), in the recent years using nonlinear separating surfaces has received some attention. The objective of this work is to recall some of such proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  17. Gabor Wavelet Selection and SVM Classification for Object Recognition

    Institute of Scientific and Technical Information of China (English)

    沈琳琳; 纪震

    2009-01-01

    This paper proposes a Gabor wavelets and support vector machine (SVM)-based framework for object recognition. When discriminative features are extracted at optimized locations using selected Gabor wavelets, classifications are done via SVM. Compared to conventional Gabor feature based object recognition system, the system developed in this paper is both robust and efficient. The proposed framework has been successfully applied to two object recognition applications, i.e., object/non-object classification and face recognition. Experimental results clearly show advantages of the proposed method over other approaches.

  18. Classification of 5-HT(1A) receptor ligands on the basis of their binding affinities by using PSO-Adaboost-SVM.

    Science.gov (United States)

    Cheng, Zhengjun; Zhang, Yuntao; Zhou, Changhong; Zhang, Wenjun; Gao, Shibo

    2009-07-29

    In the present work, the support vector machine (SVM) and Adaboost-SVM have been used to develop a classification model as a potential screening mechanism for a novel series of 5-HT(1A) selective ligands. Each compound is represented by calculated structural descriptors that encode topological features. The particle swarm optimization (PSO) and the stepwise multiple linear regression (Stepwise-MLR) methods have been used to search descriptor space and select the descriptors which are responsible for the inhibitory activity of these compounds. The model containing seven descriptors found by Adaboost-SVM showed better predictive capability than the other models. The total accuracy in prediction for the training and test set is 100.0% and 95.0% for PSO-Adaboost-SVM, 99.1% and 92.5% for PSO-SVM, 99.1% and 82.5% for Stepwise-MLR-Adaboost-SVM, 99.1% and 77.5% for Stepwise-MLR-SVM, respectively. The results indicate that Adaboost-SVM can be used as a useful modeling tool for QSAR studies.

  19. Classification of 5-HT1A Receptor Ligands on the Basis of Their Binding Affinities by Using PSO-Adaboost-SVM

    Directory of Open Access Journals (Sweden)

    Wenjun Zhang

    2009-07-01

    Full Text Available In the present work, the support vector machine (SVM) and Adaboost-SVM have been used to develop a classification model as a potential screening mechanism for a novel series of 5-HT1A selective ligands. Each compound is represented by calculated structural descriptors that encode topological features. The particle swarm optimization (PSO) and the stepwise multiple linear regression (Stepwise-MLR) methods have been used to search descriptor space and select the descriptors which are responsible for the inhibitory activity of these compounds. The model containing seven descriptors found by Adaboost-SVM showed better predictive capability than the other models. The total accuracy in prediction for the training and test set is 100.0% and 95.0% for PSO-Adaboost-SVM, 99.1% and 92.5% for PSO-SVM, 99.1% and 82.5% for Stepwise-MLR-Adaboost-SVM, 99.1% and 77.5% for Stepwise-MLR-SVM, respectively. The results indicate that Adaboost-SVM can be used as a useful modeling tool for QSAR studies.

  20. SVM and SVM Ensembles in Breast Cancer Prediction

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers. PMID:28060807

  1. SVM and SVM Ensembles in Breast Cancer Prediction.

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  2. Lex-SVM: exploring the potential of exon expression profiling for disease classification.

    Science.gov (United States)

    Yuan, Xiongying; Zhao, Yi; Liu, Changning; Bu, Dongbo

    2011-04-01

    Exon expression profiling technologies, including exon arrays and RNA-Seq, measure the abundance of every exon in a gene. Compared with gene expression profiling technologies like 3' array, exon expression profiling technologies could detect alterations in both transcription and alternative splicing, therefore they are expected to be more sensitive in diagnosis. However, exon expression profiling also brings higher dimension, more redundancy, and significant correlation among features. Ignoring the correlation structure among exons of a gene, a popular classification method like L1-SVM selects exons individually from each gene and thus is vulnerable to noise. To overcome this limitation, we present in this paper a new variant of SVM named Lex-SVM to incorporate correlation structure among exons and known splicing patterns to promote classification performance. Specifically, we construct a new norm, ex-norm, including our prior knowledge on exon correlation structure to regularize the coefficients of a linear SVM. Lex-SVM can be solved efficiently using standard linear programming techniques. The advantage of Lex-SVM is that it can select features group-wise, force features in a subgroup to take equal weights and exclude the features that contradict the majority in the subgroup. Experimental results suggest that on exon expression profiles, Lex-SVM is more accurate than existing methods. Lex-SVM also generates a more compact model and selects genes more consistently in cross-validation. Unlike L1-SVM, which selects only one exon in a gene, Lex-SVM assigns equal weights to as many exons in a gene as possible, making it easier to interpret further.

  3. A Support Vector Machine Hydrometeor Classification Algorithm for Dual-Polarization Radar

    Directory of Open Access Journals (Sweden)

    Nicoletta Roberto

    2017-07-01

    Full Text Available An algorithm based on a support vector machine (SVM) is proposed for hydrometeor classification. The training phase is driven by the output of a fuzzy logic hydrometeor classification algorithm, i.e., the most popular approach for hydrometeor classification algorithms used for ground-based weather radar. The performance of SVM is evaluated by resorting to a weather scenario, generated by a weather model; the corresponding radar measurements are obtained by simulation, and the results of SVM classification are compared with those obtained by a fuzzy logic classifier. Results based on the weather model and simulations show a higher accuracy of the SVM classification. Objective comparison of the two classifiers applied to real radar data shows that SVM classification maps are spatially more homogeneous (the textural indices energy and homogeneity increase by 21% and 12%, respectively) and do not present non-classified data. The improvements found by the SVM classifier, even though it is applied pixel-by-pixel, can be attributed to its ability to learn from the entire hyperspace of radar measurements and to the accurate training. The reliability of results and higher computing performance make SVM attractive for some challenging tasks such as its implementation in Decision Support Systems for helping pilots to make optimal decisions about changes in the flight route caused by unexpected adverse weather.

  4. Image Segmentation Using Support Vector Machine (SVM) and Ellipsoid Region Search Strategy (ERSS) Arimoto Entropy Based on Color and Texture Features

    Directory of Open Access Journals (Sweden)

    Lukman Hakim

    2016-02-01

    Full Text Available Abstract (translated from Indonesian) Image segmentation is an important method in digital image processing that aims to divide an image into several homogeneous regions based on certain similarity criteria. One of the main requirements for an image segmentation method is that it produce an optimal boundary image. To meet this requirement, a segmentation method needs a pixel classification that can separate pixels both linearly and non-linearly. In this study, the authors propose an image segmentation method using SVM and ERSS-based Arimoto entropy that is robust to noise and has low complexity, in order to produce an optimal boundary image. First, color features are extracted with local homogeneity and texture features with the Gray Level Co-occurrence Matrix (GLCM), which yields several features. Second, labeling with ERSS-based Arimoto entropy provides the classes used in classification. Third, the extracted features are classified according to these labels with the trained SVM. The experiments show a less than optimal segmentation result, with an accuracy of 69%. Feature reduction is needed to produce a well-segmented image. Keywords: image segmentation, support vector machine, ERSS Arimoto Entropy, feature extraction. Abstract Image segmentation is an important tool in image processing that divides an image into homogeneous regions based on certain similarity criteria, which ideally should be meaningful for a certain purpose. An optimal boundary is one of the main criteria that an image segmentation method should have. An image segmentation method needs a classification method that can partition pixels linearly or non-linearly. We propose a color image segmentation using Support Vector Machine (SVM) classification and ERSS Arimoto entropy thresholding to get an optimal boundary of the segmented image that is noise-free and low complexity

  5. Photometric classification of emission line galaxies with Machine Learning methods

    CERN Document Server

    Cavuoti, Stefano; D'Abrusco, Raffaele; Longo, Giuseppe; Paolillo, Maurizio

    2013-01-01

    In this paper we discuss an application of machine learning based methods to the identification of candidate AGN from optical survey data and to the automatic classification of AGNs in broad classes. We applied four different machine learning algorithms, namely the Multi Layer Perceptron (MLP), trained respectively with the Conjugate Gradient, Scaled Conjugate Gradient and Quasi Newton learning rules, and the Support Vector Machines (SVM), to tackle the problem of the classification of emission line galaxies in different classes, mainly AGNs vs non-AGNs, obtained using optical photometry in place of the diagnostics based on line intensity ratios which are classically used in the literature. Using the same photometric features we discuss also the behavior of the classifiers on finer AGN classification tasks, namely Seyfert I vs Seyfert II and Seyfert vs LINER. Furthermore we describe the algorithms employed, the samples of spectroscopically classified galaxies used to train the algorithms, the procedure follow...

  6. Remotely sensed imagery classification by SVM-based Infinite Ensemble Learning method

    Institute of Scientific and Technical Information of China (English)

    杨娜; 秦志远; 张俊

    2013-01-01

    The support-vector-machine-based Infinite Ensemble Learning method (SVM-based IEL) is a recently developed ensemble learning method in the field of machine learning. In this paper, SVM-based IEL is introduced to the classification of remotely sensed imagery and applied alongside SVM and the classic ensemble learning methods Bagging and AdaBoost, with SVM taken as the base classifier in Bagging and AdaBoost. The experiments showed that the classic ensemble learning methods perform differently from SVM: Bagging enhanced the classification accuracy, whereas AdaBoost decreased it. Furthermore, the experiments suggested that, compared with SVM and the classic (finite) ensemble learning methods, SVM-based IEL significantly increases both classification accuracy and classification efficiency.

  7. Multi-Class Classification Methods of Cost-Conscious LS-SVM for Fault Diagnosis of Blast Furnace

    Institute of Scientific and Technical Information of China (English)

    LIU Li-mei; WANG An-na; SHA Mo; ZHAO Feng-yun

    2011-01-01

    Aiming at the limitations of rapid fault diagnosis of blast furnace, a novel strategy based on cost-conscious least squares support vector machine (LS-SVM) is proposed to solve this problem. Firstly, modified discrete particle swarm optimization is applied to optimize the feature selection and the LS-SVM parameters. Secondly, a cost-conscious formula is presented for the fitness function; it accounts in detail for training time, recognition accuracy and the feature selection. The CLS-SVM algorithm is presented to increase the performance of the LS-SVM classifier. The new method can select the best fault features in much shorter time and has fewer support vectors and better generalization performance in the application of fault diagnosis of the blast furnace. Thirdly, a gradual change binary tree is established for blast furnace fault diagnosis. It is a multi-class classification method based on the center-of-gravity formula distance of clusters. A gradual change classification percentage is used to select samples randomly. The proposed new method raises the speed of diagnosis, optimizes the classification accuracy and has good generalization ability for fault diagnosis of the blast furnace.

  8. Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Science.gov (United States)

    Kim, Sang-Kyun; Chang, Joon-Hyuk

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in real time speech/music classification algorithm in SMV, and then apply the SVM for enhanced speech/music classification. For evaluation of performance, we compare the proposed algorithm and the traditional algorithm of the SMV. The performance of the proposed system is evaluated under the various environments and shows better performance compared to the original method in the SMV.

  9. AREA DETERMINATION OF DIABETIC FOOT ULCER IMAGES USING A CASCADED TWO-STAGE SVM BASED CLASSIFICATION.

    Science.gov (United States)

    Wang, Lei; Pedersen, Peder; Agu, Emmanuel; Strong, Diane; Tulu, Bengisu

    2016-11-23

    It is standard practice for clinicians and nurses to primarily assess patients' wounds via visual examination. This subjective method can be inaccurate in wound assessment and also represents a significant clinical workload. Hence, computer-based systems, especially implemented on mobile devices, can provide automatic, quantitative wound assessment and can thus be valuable for accurately monitoring wound healing status. Out of all wound assessment parameters, the measurement of the wound area is the most suitable for automated analysis. Most of the current wound boundary determination methods only process the image of the wound area along with a small amount of surrounding healthy skin. In this paper, we present a novel approach that uses Support Vector Machine (SVM) to determine the wound boundary on a foot ulcer image captured with an image capture box, which provides controlled lighting, angle and range conditions. The Simple Linear Iterative Clustering (SLIC) method is applied for effective super-pixel segmentation. A cascaded two-stage classifier is trained as follows: in the first stage a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and a set of incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from super-pixels that are used as input for each stage in the classifier training. Specifically, we apply the color and Bag-of-Word (BoW) representation of local Dense SIFT features (DSIFT) as the descriptor for ruling out irrelevant regions (first stage), and apply color and wavelet based features as descriptors for distinguishing healthy tissue from wound regions (second stage). Finally, the detected wound boundary is refined by applying a Conditional Random Field (CRF) image processing technique. 
We have implemented the wound classification on a Nexus

  10. Novel cascade FPGA accelerator for support vector machines classification.

    Science.gov (United States)

    Papadonikolakis, Markos; Bouganis, Christos-Savvas

    2012-07-01

    Support vector machines (SVMs) are a powerful machine learning tool, providing state-of-the-art accuracy to many classification problems. However, SVM classification is a computationally complex task, suffering from linear dependencies on the number of the support vectors and the problem's dimensionality. This paper presents a fully scalable field programmable gate array (FPGA) architecture for the acceleration of SVM classification, which exploits the device heterogeneity and the dynamic range diversities among the dataset attributes. An adaptive and fully-customized processing unit is proposed, which utilizes the available heterogeneous resources of a modern FPGA device in an efficient way with respect to the problem's characteristics. The implementation results demonstrate the efficiency of the heterogeneous architecture, presenting a speed-up factor of 2-3 orders of magnitude, compared to the CPU implementation. The proposed architecture outperforms other proposed FPGA and graphics processing unit approaches by more than seven times. Furthermore, based on the special properties of the heterogeneous architecture, this paper introduces the first FPGA-oriented cascade SVM classifier scheme, which exploits the FPGA reconfigurability and intensifies the custom-arithmetic properties of the heterogeneous architecture. The results show that the proposed cascade scheme is able to increase the heterogeneous classifier throughput even further, without introducing any penalty on the resource utilization.

  11. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    Science.gov (United States)

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates.
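
    The abstract compares classification models with McNemar's test. As a minimal sketch (not the authors' code; the function and argument names are hypothetical), the continuity-corrected statistic can be computed from the paired predictions of two classifiers on the same test samples:

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """McNemar's test (with continuity correction) for two classifiers
    evaluated on the same samples. Returns (statistic, p_value)."""
    # b: A correct and B wrong; c: A wrong and B correct (discordant pairs).
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree in correctness
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

    Only the samples on which exactly one of the two models is correct enter the statistic, so two models with similar accuracy can still differ significantly if their errors fall on different samples.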

  12. Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine

    Indian Academy of Sciences (India)

    Liu Zhong-bao; Song Wen-ai; Zhang Jing; Zhao Wen-juan

    2017-06-01

    Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations. One is it does not take the distribution of the classes into consideration. The other is it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher’s Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.

  13. Support vector machine as an alternative method for lithology classification of crystalline rocks

    Science.gov (United States)

    Deng, Chengxiang; Pan, Heping; Fang, Sinan; Amara Konaté, Ahmed; Qin, Ruidong

    2017-03-01

    With the expansion of machine learning algorithms, automatic lithology classification that uses well logging data is becoming significant in formation evaluation and reservoir characterization. In fact, the complicated composition and structural variations of metamorphic rocks result in more nonlinear features in well logging data and elevate requirements to algorithms. Herein, the application of the support vector machine (SVM) in classifying crystalline rocks from Chinese Continental Scientific Drilling Main Hole (CCSD-MH) data was reported. We found that the SVM performs poorly on the lithology classification of crystalline rocks when training samples are imbalanced. The fact is that training samples are generally limited and imbalanced as cores cannot be obtained balanced and at 100 percent. In this paper, we introduced the synthetic minority over-sampling technique (SMOTE) and Borderline-SMOTE to deal with imbalanced data. After experiments generating different quantities of training samples by SMOTE and Borderline-SMOTE, the most suitable classifier was selected to overcome the disadvantage of the SVM. Then, the popular supervised classifier back-propagation neural networks (BPNN), which has been proved competent for lithology classification of crystalline rocks in previous studies, was compared to evaluate the performance of the SVM. Results show that Borderline-SMOTE can improve the SVM with substantially increased accuracy even for minority classes in a reasonable manner, while the SVM outperforms BPNN in aspects of lithology prediction and CCSD-MH data generalization. We demonstrate the potential of the SVM as an alternative to current methods for lithology identification of crystalline rocks.
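
    The over-sampling step described above can be illustrated with a minimal sketch of basic SMOTE, assuming numeric feature vectors and at least two minority samples. The function name and parameters are hypothetical, and Borderline-SMOTE, which restricts the base samples to those near the class boundary, is not shown.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base within the minority class (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> nb
        synthetic.append([x + gap * (y - x) for x, y in zip(base, nb)])
    return synthetic
```

    The synthetic points would be appended to the minority class before training the SVM, so that the classes are closer to balanced.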

  14. Multi-classification algorithm and its realization based on least square support vector machine algorithm

    Institute of Scientific and Technical Information of China (English)

    Fan Youping; Chen Yunping; Sun Wansheng; Li Yu

    2005-01-01

    As a new type of learning machine developed on the basis of statistical learning theory, the support vector machine (SVM) plays an important role in knowledge discovery and knowledge updating by constructing a non-linear optimal classifier. However, training an SVM requires solving a quadratic programming problem under inequality constraints, which becomes computationally difficult as the number of learning samples grows. Besides, the standard SVM cannot handle multi-class classification directly. To overcome these bottlenecks to the wider use of SVM, a training algorithm is presented in which the least squares SVM (LS-SVM) is adopted and a modifying variable is introduced that changes the inequality constraints into equality constraints, so that the quadratic programming problem is converted into solving a linear system of equations, which simplifies the calculation. With regard to multi-class classification, an LS-SVM applicable to multi-class problems is derived. Finally, the efficiency of the algorithm is checked by using the widely used circle-in-the-square and two-spirals benchmarks to measure the performance of the classifier.
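
    The key point of the abstract, that LS-SVM training reduces to one linear system, can be sketched as follows. This uses the standard binary LS-SVM classifier formulation with an RBF kernel, not the multi-class variant derived in the paper; the function names, the regularization value `gamma` and the XOR toy data are assumptions made for illustration.

```python
import math

def rbf(a, b, sigma=1.0):
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def lssvm_train(X, y, gamma=10.0, kernel=rbf):
    """Build and solve [[0, y^T], [y, Omega + I/gamma]] [b, alpha] = [0, 1]."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            ridge = 1.0 / gamma if i == j else 0.0
            row.append(y[i] * y[j] * kernel(X[i], X[j]) + ridge)
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, kernel=rbf):
    s = b + sum(a * yi * kernel(x, xi) for a, yi, xi in zip(alpha, y, X))
    return 1 if s >= 0 else -1

# Toy XOR problem with labels in {-1, +1}; not linearly separable.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [1, 1, -1, -1]
b, alpha = lssvm_train(X, y)
```

    Because every training point receives a multiplier, the system has n + 1 unknowns for n samples; no quadratic programming solver is involved.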

  15. Hybrid Feature Based War Scene Classification using ANN and SVM: A Comparative Study

    Directory of Open Access Journals (Sweden)

    Shanmugam A

    2011-05-01

    Full Text Available In this paper we propose a hybrid feature extraction method for classifying war scenes from natural scenes. For this purpose two sets of image categories are taken, viz., open country and war tank. Using the hybrid method, features are extracted from the images/scenes. The extracted features are trained and tested with (i) Artificial Neural Networks (ANN) using the feed-forward back-propagation algorithm and (ii) Support Vector Machines (SVM) using radial basis kernel functions with p=5. The results are also compared with commonly used feature extraction methods such as the Haar wavelet, Daubechies (db4) wavelet, Zernike moments, invariant moments, co-occurrence features and statistical moments. The comparative results demonstrate the efficiency of the proposed hybrid feature extraction method (i.e., the combination of GLCM and statistical moments) in war scene classification problems. It can be concluded that the proposed work significantly and directly contributes to scene classification and its new applications. The complete work is experimented in Matlab 7.6.0 using a real-world dataset.

  16. Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines

    Science.gov (United States)

    Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.

  17. LAND COVER CLASSIFICATION FROM FULL-WAVEFORM LIDAR DATA BASED ON SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    M. Zhou

    2016-06-01

    Full Text Available In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.

  18. Oil spill detection from SAR image using SVM based classification

    Directory of Open Access Journals (Sweden)

    A. A. Matkan

    2013-09-01

    Full Text Available In this paper, the potential of fully polarimetric L-band SAR data for detecting sea oil spills is investigated using polarimetric decompositions and texture analysis with an SVM classifier. First, power and magnitude measurements of the HH and VV polarization modes and the Pauli, Freeman and Krogager decompositions are computed and used as input to the SVM classifier. Texture analysis is then used for identification with the SVM method: the texture features Mean, Variance, Contrast and Dissimilarity are extracted from these measurements. Experiments are conducted on fully polarimetric SAR data acquired by the PALSAR sensor of the ALOS satellite on August 25, 2006. An accuracy assessment indicated overall accuracies of 78.92% and 96.46% for the power measurement of the VV polarization and the Krogager decomposition, respectively, in the first step. With texture analysis, the results improved to 96.44% and 96.65% for the mean of the power and magnitude measurements of the HH and VV polarizations and the Krogager decomposition. The results show that the Krogager polarimetric decomposition gives satisfactory results for detecting oil spills on the sea surface, and that texture analysis gives good results.
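
    The texture features named in this abstract (mean, variance, contrast, dissimilarity) are commonly derived from a grey-level co-occurrence matrix (GLCM). The abstract does not state how the authors computed them, so the sketch below is an assumption: it builds a symmetric, normalised GLCM for a single pixel offset on an image already quantised to a few grey levels, with hypothetical function names.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Mean, variance, contrast and dissimilarity of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": sum((i - mean) ** 2 * v for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
    }

# Small 4-level test image, horizontal neighbour offset.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
```

    In a per-pixel classifier these features would typically be computed in a sliding window around each pixel and stacked with the polarimetric measurements as the SVM input vector.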

  19. Object-Based Image Classification of Summer Crops with Machine Learning Methods

    Directory of Open Access Journals (Sweden)

    José M. Peña

    2014-05-01

    Full Text Available The strategic management of agricultural lands involves crop field monitoring each year. Crop discrimination via remote sensing is a complex task, especially if different crops have a similar spectral response and cropping pattern. In such cases, crop identification could be improved by combining object-based image analysis and advanced machine learning methods. In this investigation, we evaluated the C4.5 decision tree, logistic regression (LR), support vector machine (SVM) and multilayer perceptron (MLP) neural network methods, both as single classifiers and combined in a hierarchical classification, for the mapping of nine major summer crops (both woody and herbaceous) from ASTER satellite images captured on two different dates. Each method was built with different combinations of spectral and textural features obtained after the segmentation of the remote images in an object-based framework. As single classifiers, MLP and SVM obtained maximum overall accuracy of 88%, slightly higher than LR (86%) and notably higher than C4.5 (79%). The SVM+SVM classifier (best method) improved these results to 89%. In most cases, the hierarchical classifiers considerably increased the accuracy of the most poorly classified class (minimum sensitivity). The SVM+SVM method offered a significant improvement in classification accuracy for all of the studied crops compared to the conventional decision tree classifier, ranging between 4% for safflower and 29% for corn, which suggests the application of object-based image analysis and advanced machine learning methods in complex crop classification tasks.

  20. Water Quantity Prediction Using Least Squares Support Vector Machines (LS-SVM) Method

    Directory of Open Access Journals (Sweden)

    Nian Zhang

    2014-08-01

    Full Text Available Reliable estimation of stream flows in highly urbanized areas and the associated receiving waters is very important for water resources analysis and design. We used a least squares support vector machine (LS-SVM) based algorithm to forecast the future streamflow discharge. A Gaussian Radial Basis Function (RBF) kernel framework was built on the data set to optimize the tuning parameters and to obtain the moderated output. The training process of LS-SVM was designed to select both kernel parameters and regularization constants. The USGS real-time water data were used as time series input. 50% of the data were used for training, and 50% were used for testing. The experimental results showed that the LS-SVM algorithm is a reliable and efficient method for streamflow prediction, which has an important impact on the water resource management field.

  1. SVM with discriminative dynamic time alignment

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In the past several years, support vector machines (SVM) have achieved huge success in many fields, especially in pattern recognition. But the standard SVM cannot deal with variable-length vectors, which is one severe obstacle for its application to some important areas, such as speech recognition and part-of-speech tagging. The paper proposes a novel SVM with discriminative dynamic time alignment (DDTA-SVM) to solve this problem. When training the DDTA-SVM classifier, different time alignment strategies were adopted, according to the category information of the training samples, to manipulate them in the kernel functions, which contributed to a great improvement in the training speed and generalization capability of the classifier. Since the alignment operator was embedded in the kernel functions, the training algorithms of the standard SVM remained compatible with DDTA-SVM. In order to increase the reliability of the classification, a new classification algorithm was suggested. The preliminary experimental results on a Chinese confusable-syllable speech classification task show that DDTA-SVM obtains faster convergence speed and better classification performance than the dynamic time alignment kernel SVM (DTAK-SVM). Moreover, DDTA-SVM also gives higher classification precision compared to the conventional HMM. This proves that the proposed method is effective, especially for confusable variable-length pattern classification tasks.
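
    The abstract does not give the alignment procedure itself. As background only, the sketch below shows classic dynamic time warping (DTW), the generic way to align two sequences of different lengths; it is not the discriminative alignment (DDTA) proposed in the paper, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: stretch a, stretch b, or advance both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

    For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated element can be absorbed by the alignment, which is the property that lets sequences of different lengths be compared at all.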

  2. Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach

    Science.gov (United States)

    Kotaru, Appala Raju; Joshi, Ramesh C.

    Predicting the function of an uncharacterized protein is a major challenge in the post-genomic era due to the problem's complexity and scale. Knowledge of protein function is a crucial link in the development of new drugs, better crops, and even the development of biochemicals such as biofuels. Recently, numerous high-throughput experimental procedures have been invented to investigate the mechanisms leading to the accomplishment of a protein's function, and the phylogenetic profile is one of them. A phylogenetic profile is a way of representing a protein which encodes the evolutionary history of proteins. In this paper we propose a method for classification of phylogenetic profiles using a supervised machine learning method, support vector machine classification with a radial basis function kernel, for identifying functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear kernel and the polynomial kernel and compared the results with the existing tree kernel. In our study we used proteins of the budding yeast Saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes, and for our study we used the functional annotations that are available in the MIPS database. Our experiments show that the performance of the radial basis kernel is similar to that of the polynomial kernel in some functional classes, that both are better than the linear and tree kernels, and that overall the radial basis kernel outperformed the polynomial, linear and tree kernels. In analyzing these results we show that it will be feasible to make use of an SVM classifier with a radial basis function kernel to predict gene functionality using phylogenetic profiles.
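
    The three vector kernels compared in this abstract can be written out directly. This is a minimal sketch under the assumption that each phylogenetic profile is a binary presence/absence vector across genomes; the parameter defaults (`degree`, `coef0`, `gamma`) are illustrative and not taken from the paper, and the tree kernel is not shown.

```python
import math

def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (linear_kernel(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def gram_matrix(profiles, kernel):
    """Pairwise kernel values, the form in which an SVM consumes the data."""
    return [[kernel(p, q) for q in profiles] for p in profiles]

# Two hypothetical profiles over four genomes (1 = homolog present).
p = [1, 0, 1, 1]
q = [1, 1, 0, 1]
```

    On binary profiles the linear kernel simply counts the genomes in which both proteins are present, while the RBF kernel depends on the number of genomes in which the two profiles differ.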

  3. SVM-based feature extraction and classification of aflatoxin contaminated corn using fluorescence hyperspectral data

    Science.gov (United States)

    Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...

  4. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    The ionosphere is an anisotropic, inhomogeneous, time-varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to identify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used along with solar and geomagnetic indices to model the regional and local variability that differs from global activity. In this work, for the automated classification of the regional disturbances, a classification technique based on a robust machine learning technique that has found widespread use, the Support Vector Machine (SVM), is proposed. SVM is a supervised learning model used for classification, with associated learning algorithms that analyze the data and recognize patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification

  5. GA-SVM Based Lungs Nodule Detection and Classification

    Science.gov (United States)

    Jaffar, M. Arfan; Hussain, Ayyaz; Jabeen, Fauzia; Nazir, M.; Mirza, Anwar M.

    In this paper we propose a method for lung nodule detection from computed tomography (CT) scanned images using Genetic Algorithms (GA) and morphological techniques. First, GA is used for automated segmentation of the lungs. Regions of interest (ROIs) are extracted using 8-directional searches slice by slice, and feature extraction is then performed. Finally, SVM is used to classify ROIs that contain nodules. The proposed system is capable of fully automatic segmentation and nodule detection from CT scan lung images. The technique was tested against 50 datasets of different patients received from Aga Khan Medical University, Pakistan and the Lung Image Database Consortium (LIDC) dataset.

  6. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  7. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine.

    Science.gov (United States)

    Xi, Maolong; Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
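
    The LOOCV scoring of a candidate gene subset described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid rule stands in for the SVM, the binary mask plays the role of one particle position, and the toy data and all function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): pick the class whose mean is closest."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(x, y, mask):
    """Leave-one-out accuracy using only the features where mask[j] == 1."""
    selected = [j for j, bit in enumerate(mask) if bit == 1]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    reduced = [[row[j] for j in selected] for row in x]
    hits = 0
    for i in range(len(reduced)):
        # Hold out sample i, train on the rest, test on the held-out sample.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = list(y[:i]) + list(y[i + 1:])
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == y[i]:
            hits += 1
    return hits / len(reduced)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, -3.0], [0.0, 4.0], [1.0, -4.0], [0.9, 5.0], [1.1, -5.0]]
Y = [0, 0, 0, 1, 1, 1]
```

    A search procedure such as BQPSO would call `loocv_accuracy` as the fitness of each candidate mask; here the mask `[1, 0]` scores higher than `[0, 1]` because only the first feature is informative.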

  8. Classification of Land Utilization and Covering Based on Support Vector Machine: with case of Laoha River catchment

    Institute of Scientific and Technical Information of China (English)

    李硕

    2015-01-01

    The Laoha River catchment is selected as the study area. Using two Landsat 5 TM scenes from 2007 as the data source, land use/cover in the area is classified. Because the land cover types in this area are complex, the images are difficult to separate and prone to misclassification. This study applies the support vector machine (SVM) classification method: a radial basis kernel function is introduced to map the data nonlinearly into a high-dimensional space, extracting nonlinear features, strengthening the separability between different types, reducing misclassification and improving the accuracy of remote sensing image classification. Through experiments, a land use/cover status map of the Laoha River catchment for 2007 was extracted to verify the feasibility of the method.

  9. Classification of underwater still objects based on multi-field features and SVM

    Institute of Scientific and Technical Information of China (English)

    TIAN Jie; XUE Shan-hua; HUANG Hai-ning; ZHANG Chun-hua

    2007-01-01

    A Support Vector Machine is used as a classifier for the automatic detection and recognition of underwater still objects. Discrimination between the objects can be transferred into different projection spaces by the process of multi-field feature extraction. The multi-field feature vector includes time-domain, spectral, time-frequency distribution and bi-spectral features. Underwater target recognition can be considered as a problem of small sample recognition. The SVM algorithm is appropriate for this kind of problem because of its outstanding generalizability. The SVM is contrasted with a Gaussian classifier and a k-nearest classifier in some experiments using real data from lake or sea trials. The experimental results indicate that SVM is better than the other two.

  10. Research on Sentiment Classification of Texts Based on SVM

    Institute of Scientific and Technical Information of China (English)

    陈培文; 傅秀芬

    2014-01-01

    Sentiment polarity classification is the first key problem to solve in sentiment analysis of texts. Based on an analysis of the various factors affecting sentiment classification of texts, a sentiment lexicon is first built, and sentiment features are selected and weighted. A support vector machine (SVM) classifier is then used for sentiment recognition and classification of the texts. Finally, the classification model is run on corpus data sets on both a single-machine platform and the Spark distributed computing platform, and its classification accuracy and time cost are compared. The experimental results verify the effectiveness of the sentiment polarity classification model on both the single-machine platform and the Spark distributed computing platform.

  11. Classification of hydration status using electrocardiogram and machine learning

    Science.gov (United States)

    Kaveh, Anthony; Chung, Wayne

    2013-10-01

    The electrocardiogram (ECG) has been used extensively in clinical practice for decades to non-invasively characterize the health of heart tissue; however, these techniques are limited to time domain features. We propose a machine classification system using support vector machines (SVM) that uses temporal and spectral information to classify health state beyond cardiac arrhythmias. Our method uses single lead ECG to classify volume depletion (or dehydration) without the lengthy and costly blood analysis tests traditionally used for detecting dehydration status. Our method builds on established clinical ECG criteria for identifying electrolyte imbalances and lends to automated, computationally efficient implementation. The method was tested on the MIT-BIH PhysioNet database to validate this purely computational method for expedient disease-state classification. The results show high sensitivity, supporting use as a cost- and time-effective screening tool.

  12. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    Science.gov (United States)

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  14. A support vector machine approach for classification of welding defects from ultrasonic signals

    Science.gov (United States)

    Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming

    2014-07-01

    Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
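
    The wavelet packet energy features described above can be sketched as follows. This is a minimal illustration, assuming a Haar wavelet and a full decomposition tree; the abstract names neither the wavelet nor the depth, and the function names are hypothetical. Nodes are returned in natural tree order, which is not strictly ordered by frequency.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return (approximation, detail) at half length."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Energy of each terminal node of a full Haar wavelet packet tree."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, BOTH branches are split again.
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # One energy value per node: this list is the feature vector.
    return [sum(c * c for c in node) for node in nodes]


# Hypothetical echo samples, used only to exercise the functions.
echo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = wavelet_packet_energies(echo, levels=2)
```

    Because the Haar transform is orthonormal, the node energies sum to the energy of the input signal, which gives a quick sanity check on the feature vector.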

  15. Redundancy-Free, Accurate Analytical Center Machine for Classification

    Institute of Scientific and Technical Information of China (English)

    ZHENG Fanzi; QIU Zhengding; LENG Yonggang; YUE Jianhai

    2005-01-01

    Analytical center machine (ACM) has remarkable generalization performance based on the analytical center of version space and outperforms SVM. From the analysis of the geometry of machine learning and the principle of ACM, it is shown that some training patterns are redundant to the definition of version space. Redundant patterns push the ACM classifier away from the analytical center of the prime version space so that the generalization performance degrades; at the same time, redundant patterns slow down the classifier and reduce the efficiency of storage. Thus, an incremental algorithm is proposed to remove redundant patterns and is embedded into the framework of ACM, yielding a Redundancy-Free, Accurate Analytical Center Machine (RFA-ACM) for classification. Experiments with the Heart, Thyroid and Banana datasets demonstrate the validity of RFA-ACM.

  16. SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

    Science.gov (United States)

    Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong

    2016-01-01

    Knowledge of protein function is important for biological, medical and therapeutic studies, but many proteins are still unknown in function. There is a need for more improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented those similarity-based and other methods in predicting diverse classes of proteins including the distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors protein representation, (3) improved predictive performances due to the use of more enriched training datasets and more variety of protein descriptors, (4) newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) newly added batch submission option for supporting the classification of multiple proteins. Moreover, 2 more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.

  17. Using Generalized Entropies and OC-SVM with Mahalanobis Kernel for Detection and Classification of Anomalies in Network Traffic

    Directory of Open Access Journals (Sweden)

    Jayro Santiago-Paz

    2015-09-01

    Full Text Available Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed, among them the Anomaly-Network Intrusion Detection System (A-NIDS), which monitors network traffic and compares it against an established baseline of a “normal” traffic profile. Then, it is necessary to characterize the “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on Shannon, Rényi and Tsallis entropies of selected features, and the construction of regions from entropy data employing the Mahalanobis distance (MD), and One Class Support Vector Machine (OC-SVM) with different kernels (Radial Basis Function (RBF) and Mahalanobis Kernel (MK)) for “normal” and abnormal traffic. Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while the classification is performed under the assumption that regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. In order to evaluate our approach, two different data sets were used: one set of real traffic obtained from an Academic Local Area Network (LAN), and the other a subset of the 1998 MIT-DARPA set. For these data sets, a True positive rate up to 99.35%, a True negative rate up to 99.83% and a False negative rate at about 0.16% were yielded. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with RBF kernel improve the detection rate in the detection stage, while the novel inclusion of MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that using the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with
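
    The three entropies named in this abstract can be computed from the empirical distribution of a traffic feature, as in the sketch below. It uses the standard textbook definitions with natural logarithms; the choice of feature (destination ports in one time window), the sample values and the function names are assumptions for illustration only.

```python
import math
from collections import Counter


def probabilities(values):
    """Empirical probability of each distinct value in an observation window."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_entropy(p):
    """H = -sum(p_i * ln p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def renyi_entropy(p, q):
    """H_q = ln(sum(p_i ** q)) / (1 - q); tends to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)


def tsallis_entropy(p, q):
    """S_q = (1 - sum(p_i ** q)) / (q - 1); tends to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


# Hypothetical window of destination ports, evenly spread over four values.
window = [80, 443, 22, 53, 80, 443, 22, 53]
p = probabilities(window)
```

    For a uniform distribution over four values, the Shannon and Rényi entropies both equal ln 4, while the Tsallis entropy with q = 2 equals 0.75; a window dominated by a single value drives all three toward zero.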

  18. A COMPARISON STUDY OF DIFFERENT KERNEL FUNCTIONS FOR SVM-BASED CLASSIFICATION OF MULTI-TEMPORAL POLARIMETRY SAR DATA

    Directory of Open Access Journals (Sweden)

    B. Yekkehkhany

    2014-10-01

    Full Text Available In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imageries. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.
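
    The three kernel families compared in this abstract have standard closed forms, sketched below. The parameter names (`gamma`, `coef0`, `degree`) and their default values are illustrative assumptions, not the settings used in the study.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """K(x, y) = <x, y>."""
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree."""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

    The RBF kernel depends only on the distance between two samples, so it equals 1 for identical inputs and decays toward 0 as they move apart, whereas the linear and polynomial kernels grow with the inner product.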

  19. Combined multi-kernel support vector machine and wavelet analysis for hyperspectral remote sensing image classification

    Institute of Scientific and Technical Information of China (English)

    Kun Tan; Peijun Du

    2011-01-01

    Many remote sensing image classifiers are limited in their ability to combine spectral features with spatial features. Multi-kernel classifiers, however, are capable of integrating spectral features with spatial or structural features using multiple kernels and summing them for final outputs. Using a support vector machine (SVM) as classifier, different multi-kernel classifiers are constructed and tested using 64-band Operational Modular Imaging Spectrometer Ⅱ hyperspectral image of Changping Area, Beijing City. Results show that by integrating spectral and wavelet texture information, multi-kernel SVM classifiers can obtain more accurate classification results than sole-kernel SVM classifiers and cross-information SVM kernel classifiers. Moreover, when the multi-kernel SVM classifier is used, the combination of the first four principal components from principal component analysis and wavelet texture provides the highest accuracy (97.06%). Multi-kernel SVM is therefore an effective approach to improve the accuracy of hyperspectral image classification and to expand possibilities for remote sensing image interpretation and application.
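
    Summing a spectral kernel and a texture kernel, as the abstract describes, can be sketched as a weighted sum of two RBF kernels. The use of RBF for both parts, the `weight` parameter, the pixel layout and the sample values are all assumptions; the abstract only states that multiple kernels are summed.

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def summed_kernel(pixel_a, pixel_b, weight=0.5, gamma_spectral=1.0, gamma_texture=1.0):
    """Weighted sum of a spectral kernel and a texture kernel.

    Each pixel is a (spectral_features, texture_features) pair. A sum of valid
    kernels with non-negative weights is itself a valid kernel.
    """
    spectral_a, texture_a = pixel_a
    spectral_b, texture_b = pixel_b
    return (weight * rbf_kernel(spectral_a, spectral_b, gamma_spectral)
            + (1.0 - weight) * rbf_kernel(texture_a, texture_b, gamma_texture))


def gram_matrix(pixels, **kernel_args):
    """Pairwise kernel values, the form a precomputed-kernel SVM solver expects."""
    return [[summed_kernel(p, q, **kernel_args) for q in pixels] for p in pixels]


# Hypothetical pixels: (principal-component features, wavelet texture features).
pixels = [([0.1, 0.2], [1.0]), ([0.1, 0.2], [2.0]), ([0.9, 0.8], [1.0])]
K = gram_matrix(pixels)
```

    The first two pixels share spectral features but differ in texture, so their kernel value drops below 1 only through the texture term; this is how the combined kernel lets texture separate spectrally similar pixels.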

  20. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, convergence speed to maximum accuracy, and maximum bitrate in a brain computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered in order to eliminate high frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients, and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the ν-support vector machine (ν-SVM), are applied for classification. The performance of the proposed algorithm is compared with Fisher linear discriminant analysis (FLDA). Results show that our algorithm is much better at improving classification accuracy and convergence speed to maximum accuracy. For example, for the proposed algorithms BLDA with LPC model and ν-SVM with LPC model, with the 8-electrode configuration for subject S1, the total classification accuracy is improved by 9.4% and 1.7%, respectively. Also, for subject 7, the BLDA and ν-SVM with LPC model algorithms (LPC+BLDA and LPC+ν-SVM) converged to maximum accuracy after the 11th block, whereas the FLDA algorithm did not converge to maximum accuracy (with the same configuration). So, it can be used as a promising tool in designing BCI systems.
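
    Extracting LPC parameters from a filtered signal can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is one common way to fit an LPC model and is an assumption here, since the abstract does not state which estimation method or model order was used; the function names and the test signal are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Return a[1..order] so that x[n] is approximated by sum_k a[k] * x[n - k].

    Solves the autocorrelation normal equations with the Levinson-Durbin recursion.
    """
    r = autocorrelation(x, order)
    if r[0] == 0:
        raise ValueError("signal has zero energy")
    a = [0.0] * (order + 1)  # a[0] is unused; a[k] multiplies x[n - k]
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)  # remaining prediction error energy
    return a[1:]


# Hypothetical test signal: a decaying exponential, which a first-order
# predictor with coefficient 0.9 reproduces exactly.
signal = [0.9 ** n for n in range(200)]
coeffs = lpc_coefficients(signal, order=2)
```

    For this signal the first coefficient comes out close to 0.9 and the second close to 0, showing that the fit recovers the underlying first-order recursion. The resulting coefficient vector is the kind of compact feature set that could then be passed to a classifier.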

  1. Prediction and Classification of Human G-protein Coupled Receptors Based on Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    Yun-Fei Wang; Huan Chen; Yan-Hong Zhou

    2005-01-01

    A computational system for the prediction and classification of human G-protein coupled receptors (GPCRs) has been developed based on the support vector machine (SVM) method and protein sequence information. The feature vectors used to develop the SVM prediction models consist of statistically significant features selected from single amino acid, dipeptide, and tripeptide compositions of protein sequences. Furthermore, the length distribution difference between GPCRs and non-GPCRs has also been exploited to improve the prediction performance. The testing results with annotated human protein sequences demonstrate that this system can get good performance for both prediction and classification of human GPCRs.
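
    The single amino acid and dipeptide compositions mentioned above can be computed directly from a sequence string, as in this sketch. The function names and the example sequence are hypothetical; tripeptide composition (8000 dimensions) and the statistical feature selection step are omitted for brevity.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(sequence):
    """Fraction of each standard residue: a 20-dimensional feature vector."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Fraction of each adjacent residue pair: a 400-dimensional feature vector."""
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Pairs touching a non-standard symbol are skipped rather than guessed.
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not pairs:
        raise ValueError("sequence contains no standard residue pairs")
    counts = Counter(pairs)
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


# Hypothetical three-residue sequence, used only to show the vector layout.
aac = amino_acid_composition("AAC")
dpc = dipeptide_composition("AAC")
```

    Concatenating `aac` and `dpc` gives a fixed-length vector regardless of sequence length, which is what allows sequences of different lengths to be fed to the same SVM.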

  2. Extreme Learning Machine for land cover classification

    OpenAIRE

    Pal, Mahesh

    2008-01-01

    This paper explores the potential of an extreme learning machine based supervised classification algorithm for land cover classification. In comparison to a backpropagation neural network, which requires setting of several user-defined parameters and may produce local minima, the extreme learning machine requires setting of one parameter and produces a unique solution. ETM+ multispectral data set (England) was used to judge the suitability of extreme learning machine for remote sensing classifications...

  3. Classification of the Regional Ionospheric Disturbance Based on Machine Learning Techniques

    Science.gov (United States)

    Terzi, Merve Begum; Arikan, Orhan; Karatay, Secil; Arikan, Feza; Gulyaeva, Tamara

    2016-08-01

    In this study, Total Electron Content (TEC) estimated from GPS receivers is used along with solar and geomagnetic indices to model the regional and local variability that differs from global activity. For the automated classification of regional disturbances, a classification technique based on Support Vector Machine (SVM), a robust machine learning technique that has found widespread use, is proposed. Performance of the developed classification technique is demonstrated for midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. By applying the developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.

  4. Support vector machine for classification of walking conditions using miniature kinematic sensors.

    Science.gov (United States)

    Lau, Hong-Yin; Tong, Kai-Yu; Zhu, Hailong

    2008-06-01

    A portable gait analysis and activity-monitoring system for the evaluation of activities of daily life could facilitate clinical and research studies. This current study developed a small sensor unit comprising an accelerometer and a gyroscope in order to detect shank and foot segment motion and orientation during different walking conditions. The kinematic data obtained in the pre-swing phase were used to classify five walking conditions: stair ascent, stair descent, level ground, upslope and downslope. The kinematic data consisted of anterior-posterior acceleration and angular velocity measured from the shank and foot segments. A machine learning technique known as support vector machine (SVM) was applied to classify the walking conditions. SVM was also compared with other machine learning methods such as artificial neural network (ANN), radial basis function network (RBF) and Bayesian belief network (BBN). The SVM technique was shown to have a higher performance in classification than the other three methods. The results using SVM showed that stair ascent and stair descent could be distinguished from each other and from the other walking conditions with 100% accuracy by using a single sensor unit attached to the shank segment. For classification results in the five walking conditions, performance improved from 78% using the kinematic signals from the shank sensor unit to 84% by adding signals from the foot sensor unit. The SVM technique with the portable kinematic sensor unit could automatically recognize the walking condition for quantitative analysis of the activity pattern.

  5. SVM-based classification of LV wall motion in cardiac MRI with the assessment of STE

    Science.gov (United States)

    Mantilla, Juan; Garreau, Mireille; Bellanger, Jean-Jacques; Paredes, José Luis

    2015-01-01

    In this paper, we propose an automated method to classify normal/abnormal wall motion in Left Ventricle (LV) function in cardiac cine-Magnetic Resonance Imaging (MRI), taking as reference strain information obtained from 2D Speckle Tracking Echocardiography (STE). Without the need for pre-processing and by exploiting all the images acquired during a cardiac cycle, spatio-temporal profiles are extracted from a subset of radial lines from the ventricle centroid to points outside the epicardial border. Classical Support Vector Machines (SVM) are used to classify features extracted from gray levels of the spatio-temporal profile as well as their representations in the Wavelet domain under the assumption that the data may be sparse in that domain. Based on information obtained from radial strain curves in 2D-STE studies, we label all the spatio-temporal profiles that belong to a particular segment as normal if the peak systolic radial strain curve of this segment presents normal kinesis, or abnormal if the peak systolic radial strain curve presents hypokinesis or akinesis. For this study, short-axis cine-MR images are collected from 9 patients with cardiac dyssynchrony, for whom we have the radial strain tracings at the mid-papillary muscle obtained by 2D STE, and from one control group formed by 9 healthy subjects. The best classification performance is obtained with the gray level information of the spatio-temporal profiles using an RBF kernel, with 91.88% accuracy, 92.75% sensitivity and 91.52% specificity.

  6. Support Vector Machine for Discrimination Between Fault and Magnetizing Inrush Current in Power Transformer

    Directory of Open Access Journals (Sweden)

    V. Malathi

    2007-01-01

    Full Text Available This study presents a novel technique based on Support Vector Machine (SVM) for the classification of transient phenomena in power transformers. The SVM is a powerful method for statistical classification of data. The input data to this SVM for training comprise fault current and magnetizing inrush current. The SVM classifier produces significant accuracy for classification of transient phenomena in power transformers.

  7. Clustering technique-based least square support vector machine for EEG signal classification.

    Science.gov (United States)

    Siuly; Li, Yan; Wen, Peng Paul

    2011-12-01

    This paper presents a new approach called clustering technique-based least square support vector machine (CT-LS-SVM) for the classification of EEG signals. Decision making is performed in two stages. In the first stage, clustering technique (CT) has been used to extract representative features of EEG data. In the second stage, least square support vector machine (LS-SVM) is applied to the extracted features to classify two-class EEG signals. To demonstrate the effectiveness of the proposed method, several experiments have been conducted on three publicly available benchmark databases, one for epileptic EEG data, one for mental imagery tasks EEG data and another one for motor imagery EEG data. Our proposed approach achieves an average sensitivity, specificity and classification accuracy of 94.92%, 93.44% and 94.18%, respectively, for the epileptic EEG data; 83.98%, 84.37% and 84.17% respectively, for the motor imagery EEG data; and 64.61%, 58.77% and 61.69%, respectively, for the mental imagery tasks EEG data. The performance of the CT-LS-SVM algorithm is compared in terms of classification accuracy and execution (running) time with our previous study where simple random sampling with a least square support vector machine (SRS-LS-SVM) was employed for EEG signal classification. We also compare the proposed method with other existing methods in the literature for the three databases. The experimental results show that the proposed algorithm can produce a better classification rate than the previous reported methods and takes much less execution time compared to the SRS-LS-SVM technique. The research findings in this paper indicate that the proposed approach is very efficient for classification of two-class EEG signals.
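
    The sensitivity, specificity and accuracy figures reported above follow from the confusion counts of a two-class classifier. A minimal sketch, using the standard definitions and hypothetical labels rather than any data from the study:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical labels: 1 marks the positive class (for example, epileptic).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(truth, predicted)
```

    With balanced classes, as in this toy example, accuracy is the mean of sensitivity and specificity, which matches the pattern of the three figures quoted for each database.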

  8. Support Vector Machine Classification of Drunk Driving Behaviour

    Science.gov (United States)

    Chen, Huiqin; Chen, Lei

    2017-01-01

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN), the root mean square value of the difference of the adjacent R–R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.
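
    Two of the heart rate variability features named above, SDNN and RMSSD, can be computed from a series of R–R intervals as sketched below. The use of the sample standard deviation for SDNN, the millisecond unit and the example intervals are assumptions; the spectral features (LF, HF, LF/HF) need a power spectrum estimate and are not covered here.

```python
import math
import statistics


def sdnn(rr_ms):
    """Standard deviation of R-R intervals (sample standard deviation)."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive differences between adjacent R-R intervals."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-R intervals in milliseconds.
rr = [800, 810, 790, 800]
features = [sdnn(rr), rmssd(rr)]
```

    SDNN reflects overall variability across the recording, while RMSSD is driven only by beat-to-beat changes, so the two can move independently and are typically kept as separate features.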

  9. Support Vector Machine Classification of Drunk Driving Behaviour

    Directory of Open Access Journals (Sweden)

    Huiqin Chen

    2017-01-01

    Full Text Available Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN), the root mean square value of the difference of the adjacent R–R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  10. Support Vector Machine Classification of Drunk Driving Behaviour.

    Science.gov (United States)

    Chen, Huiqin; Chen, Lei

    2017-01-23

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R-R intervals (SDNN), the root mean square value of the difference of the adjacent R-R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.
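
    The two time-domain heart rate variability features named in this abstract, SDNN and RMSSD, have standard definitions that can be sketched in a few lines. The snippet below is only an illustration of those definitions, not the authors' processing pipeline; the R-R interval values are hypothetical, and the use of the sample (n - 1) standard deviation for SDNN is an assumption.

```python
import math
import statistics


def sdnn(rr_ms):
    """Standard deviation of the R-R intervals (sample standard deviation, assumed)."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of the differences between adjacent R-R intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-R intervals in milliseconds (illustrative values only).
rr = [800, 810, 790, 805, 795]
features = {"SDNN": sdnn(rr), "RMSSD": rmssd(rr)}
```

    In a classifier such as the one described, values like these would form part of the feature vector alongside the frequency-domain and driving-performance measures.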

  11. A Hyper-Solution Framework for SVM Classification: Application for Predicting Destabilizations in Chronic Heart Failure Patients.

    Science.gov (United States)

    Candelieri, Antonio; Conforti, Domenico

    2010-07-27

    Support Vector Machines (SVMs) represent a powerful learning paradigm able to provide accurate and reliable decision functions in several application fields. In particular, they are attractive for applications in the medical domain, where a lack of knowledge often exists. The kernel trick, on which SVMs are based, makes it possible to map non-linearly separable data into potentially linearly separable data, according to the kernel function and the values of its internal parameters. In recent years, non-parametric approaches have also been proposed for learning the most appropriate kernel, such as a linear combination of basic kernels. Thus, SVM classifiers may have several parameters to be tuned, and their optimal values are usually difficult to identify a priori. Furthermore, combining different classifiers may reduce the risk of errors on new unseen data. For these reasons, we present a hyper-solution framework for SVM classification, based on meta-heuristics, that searches for the most reliable hyper-classifier (SVM with a basic kernel, SVM with a combination of kernels, or ensemble of SVMs) and for its optimal configuration. We have applied the proposed framework to a critical and quite complex issue in the management of Chronic Heart Failure patients: the early detection of decompensation conditions. In fact, predicting new destabilizations in advance may reduce the burden of heart failure on healthcare systems while improving the quality of life of affected patients. Promising reliability has been obtained on 10-fold cross validation, proving our approach to be efficient and effective for a high-level analysis of clinical data.

  12. Stellar Spectral Classification with Locality Preserving Projections and Support Vector Machine

    Indian Academy of Sciences (India)

    Liu Zhong-bao

    2016-06-01

    With the help of computer tools and algorithms, automatic stellar spectral classification has become an area of current interest. The process of stellar spectral classification mainly includes two steps: dimension reduction and classification. As a popular dimensionality reduction technique, Principal Component Analysis (PCA) is widely used in stellar spectra classification. Another dimensionality reduction technique, Locality Preserving Projections (LPP) has not been widely used in astronomy. The advantage of LPP is that it can preserve the local structure of the data after dimensionality reduction. In view of this, we investigate how to apply LPP+SVM in classifying the stellar spectral subclasses. In the comparative experiment, the performance of LPP is compared with PCA. The stellar spectral classification process is composed of the following steps. Firstly, PCA and LPP are respectively applied to reduce the dimension of spectra data. Then, Support Vector Machine (SVM) is used to classify the 4 subclasses of K-type and 3 subclasses of F-type spectra from Sloan Digital Sky Survey (SDSS). Lastly, the performance of LPP+SVM is compared with that of PCA+SVM in stellar spectral classification, and we found that LPP does better than PCA.

  13. Extreme learning machine-based classification of ADHD using brain structural MRI data.

    Directory of Open Access Journals (Sweden)

    Xiaolong Peng

    BACKGROUND: Effective and accurate diagnosis of attention-deficit/hyperactivity disorder (ADHD) is currently of significant interest. ADHD has been associated with multiple cortical features from structural MRI data. However, most existing learning algorithms for ADHD identification contain obvious defects, such as time-consuming training, parameters selection, etc. The aims of this study were as follows: (1) Propose an ADHD classification model using the extreme learning machine (ELM) algorithm for automatic, efficient and objective clinical ADHD diagnosis. (2) Assess the computational efficiency and the effect of sample size on both ELM and support vector machine (SVM) methods and analyze which brain segments are involved in ADHD. METHODS: High-resolution three-dimensional MR images were acquired from 55 ADHD subjects and 55 healthy controls. Multiple brain measures (cortical thickness, etc.) were calculated using a fully automated procedure in the FreeSurfer software package. In total, 340 cortical features were automatically extracted from 68 brain segments with 5 basic cortical features. F-score and SFS methods were adopted to select the optimal features for ADHD classification. Both ELM and SVM were evaluated for classification accuracy using leave-one-out cross-validation. RESULTS: We achieved ADHD prediction accuracies of 90.18% for ELM using eleven combined features, 84.73% for SVM-Linear and 86.55% for SVM-RBF. Our results show that ELM has better computational efficiency and is more robust as sample size changes than is SVM for ADHD classification. The most pronounced differences between ADHD and healthy subjects were observed in the frontal lobe, temporal lobe, occipital lobe and insular. CONCLUSION: Our ELM-based algorithm for ADHD diagnosis performs considerably better than the traditional SVM algorithm. This result suggests that ELM may be used for the clinical diagnosis of ADHD and the investigation of different brain diseases.

  14. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameter optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, specificity, precision, the ROC curve, and so forth are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.
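
    The abstract does not spell out how its multilevel grid search works, so the following is only a generic coarse-to-fine sketch of the idea: search a wide, sparse grid of (C, gamma) exponents first, then repeatedly search a finer grid around the best point found. The function names, grid widths, and the toy scoring function are all assumptions made for illustration; in practice `score` would return something like the cross-validated accuracy of an SVM trained with the given parameters.

```python
import math


def grid(center, half_width, step):
    """Exponent values from center - half_width to center + half_width."""
    n = int(round(2 * half_width / step))
    return [center - half_width + i * step for i in range(n + 1)]


def multilevel_grid_search(score, levels=3, half_width=8.0, step=4.0):
    """Coarse-to-fine search over log2(C) and log2(gamma); returns (C, gamma)."""
    best = (0.0, 0.0)  # start centred on C = 1, gamma = 1
    for _ in range(levels):
        candidates = [(c, g)
                      for c in grid(best[0], half_width, step)
                      for g in grid(best[1], half_width, step)]
        best = max(candidates, key=lambda p: score(2.0 ** p[0], 2.0 ** p[1]))
        # Zoom in: the next level covers one coarse cell with a finer step.
        half_width, step = step, step / 4.0
    return 2.0 ** best[0], 2.0 ** best[1]


def toy_score(C, gamma):
    """Hypothetical stand-in for cross-validated accuracy, peaking at C=8, gamma=2**-5."""
    return -((math.log2(C) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


best_C, best_gamma = multilevel_grid_search(toy_score)
```

    The appeal of the multilevel scheme is that three levels of 25, 81 and 81 evaluations cover the same resolution as a single fine grid with several thousand points.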

  15. Band selection for hyperspectral image classification using extreme learning machine

    Science.gov (United States)

    Li, Jiaojiao; Kingsdorf, Benjamin; Du, Qian

    2017-05-01

    Extreme learning machine (ELM) is a feedforward neural network with one hidden layer, which is similar to a multilayer perceptron (MLP). To reduce the complexity in the training process of MLP using the traditional backpropagation algorithm, the weights in ELM between input and hidden layers are random variables. The output layer in the ELM is linear, as in a radial basis function neural network (RBFNN), so the output weights can be easily estimated with a least squares solution. It has been demonstrated in our previous work that the computational cost of ELM is much lower than that of the standard support vector machine (SVM), and a kernel version of ELM can offer performance comparable to SVM. In our previous work, we also investigated the impact of the number of hidden neurons on the performance of ELM. Basically, more hidden neurons are needed if the number of training samples and data dimensionality are large, which results in a very large matrix inversion problem. To avoid handling such a large matrix, we propose to conduct band selection to reduce data dimensionality (i.e., the number of input neurons), thereby reducing network complexity. Experimental results show that ELM using selected bands can yield similar or even better classification accuracy than using all the original bands.
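
    The training scheme described here (random input-to-hidden weights, output weights from a least squares solution) can be sketched compactly. The code below is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the weight range, the small ridge term added for numerical stability, and the XOR toy data are all choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(X, W, b):
    """Hidden activations: one row per sample, one column per hidden neuron."""
    return [[sigmoid(sum(w * x for w, x in zip(W[j], row)) + b[j])
             for j in range(len(W))] for row in X]


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_elm(X, y, n_hidden=20, ridge=1e-6, seed=0):
    """Random hidden weights (never trained) plus least-squares output weights."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W = [[rng.uniform(-4, 4) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-4, 4) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T y
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    return W, b, solve(A, rhs)


def predict_elm(model, X):
    W, b, beta = model
    return [sum(hj * bj for hj, bj in zip(h, beta))
            for h in hidden_layer(X, W, b)]


# Toy XOR problem (illustrative data only).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
model = train_elm(X, y)
preds = predict_elm(model, X)
```

    The linear system solved in `train_elm` has one unknown per hidden neuron, which is why the abstract notes that a large hidden layer leads to a large matrix inversion problem.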

  16. IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST SENSITIVE CRITERION FOR C-SVM

    Directory of Open Access Journals (Sweden)

    M’hamed Bilal Abidine

    2013-11-01

    The growing population of elders in society calls for a new approach to caregiving. By inferring what activities the elderly are performing in their houses, it is possible to determine their physical and cognitive capabilities. In this paper we show the potential of important discriminative classifiers, namely the Soft-Support Vector Machines (C-SVM), Conditional Random Fields (CRF) and k-Nearest Neighbors (k-NN), for recognizing activities from sensor patterns in a smart home environment. We also address the class imbalance problem in the activity recognition field, which has been known to hinder the learning performance of classifiers. Cost sensitive learning is attractive under most imbalanced circumstances, but it is difficult to determine the precise misclassification costs in practice. We introduce a new criterion for selecting the suitable cost parameter C of the C-SVM method. Through our evaluation on four real world imbalanced activity datasets, we demonstrate that C-SVM based on our proposed criterion outperforms the state-of-the-art discriminative methods in activity recognition.

  17. EHPred: an SVM-based method for epoxide hydrolases recognition and classification

    Institute of Scientific and Technical Information of China (English)

    JIA Jia; YANG Liang; ZHANG Zi-zhang

    2006-01-01

    A two-layer method based on support vector machines (SVMs) has been developed to distinguish epoxide hydrolases (EHs) from other enzymes and to classify its subfamilies using its primary protein sequences. SVM classifiers were built using three different feature vectors extracted from the primary sequence of EHs: the amino acid composition (AAC), the dipeptide composition (DPC), and the pseudo-amino acid composition (PAAC). Validated by 5-fold cross tests, the first layer SVM classifier can differentiate EHs and non-EHs with an accuracy of 94.2% and has a Matthews correlation coefficient (MCC) of 0.84. Using 2-fold cross validation, PAAC-based second layer SVM can further classify EH subfamilies with an overall accuracy of 90.7% and MCC of 0.87 as compared to AAC (80.0%) and DPC (84.9%). A program called EHPred has also been developed to assist readers to recognize EHs and to classify their subfamilies using primary protein sequences with greater accuracy.
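
    Two of the three feature vectors mentioned in this abstract have simple, widely used definitions: the amino acid composition is the fraction of each of the 20 standard residues, and the dipeptide composition is the fraction of each of the 400 ordered residue pairs. The sketch below illustrates those definitions only; it is not the EHPred code, the example sequence is hypothetical, and PAAC is omitted because it depends on physicochemical parameters the abstract does not give.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aac(seq):
    """Amino acid composition: 20 fractions, one per standard residue."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: 400 fractions over overlapping residue pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = {}
    for p in pairs:
        counts[p] = counts.get(p, 0) + 1
    return [counts.get(d, 0) / len(pairs) for d in DIPEPTIDES]


# Hypothetical toy sequence (real protein sequences are far longer).
example = "ACCA"
aac_vec = aac(example)
dpc_vec = dpc(example)
```

    Both vectors have a fixed length regardless of sequence length, which is what makes them convenient inputs for an SVM.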

  18. Automated Classification and Removal of EEG Artifacts with SVM and Wavelet-ICA.

    Science.gov (United States)

    Sai, Chong Yeh; Mokhtar, Norrima; Arof, Hamzah; Cumming, Paul; Iwahashi, Masahiro

    2017-07-04

    Brain electrical activity recordings by electroencephalography (EEG) are often contaminated with signal artifacts. Procedures for automated removal of EEG artifacts are frequently sought for clinical diagnostics and brain computer interface (BCI) applications. In recent years, a combination of independent component analysis (ICA) and discrete wavelet transform (DWT) has been introduced as standard technique for EEG artifact removal. However, in performing the wavelet-ICA procedure, visual inspection or arbitrary thresholding may be required for identifying artifactual components in the EEG signal. We now propose a novel approach for identifying artifactual components separated by wavelet-ICA using a pre-trained support vector machine (SVM). Our method presents a robust and extendable system that enables fully automated identification and removal of artifacts from EEG signals, without applying any arbitrary thresholding. Using test data contaminated by eye blink artifacts, we show that our method performed better in identifying artifactual components than did existing thresholding methods. Furthermore, wavelet-ICA in conjunction with SVM successfully removed target artifacts, while largely retaining the EEG source signals of interest. We propose a set of features including kurtosis, variance, Shannon's entropy and range of amplitude as training and test data of SVM to identify eye blink artifacts in EEG signals. This combinatorial method is also extendable to accommodate multiple types of artifacts present in multi-channel EEG. We envision future research to explore other descriptive features corresponding to other types of artifactual components.
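
    The four features proposed in this abstract (kurtosis, variance, Shannon's entropy, and range of amplitude) can each be computed from a single component's samples. The following is a rough sketch of one way to do so, not the authors' feature extractor: the population variance, the non-excess (Pearson) kurtosis, and the histogram-based entropy with a fixed bin count are assumptions, and the sample values are hypothetical.

```python
import math


def variance(x):
    """Population variance of the samples."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def kurtosis(x):
    """Fourth central moment divided by squared variance (non-excess form)."""
    m = sum(x) / len(x)
    m4 = sum((v - m) ** 4 for v in x) / len(x)
    return m4 / variance(x) ** 2


def shannon_entropy(x, bins=10):
    """Entropy in bits of an equal-width histogram of the samples."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for v in x:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(x)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def amplitude_range(x):
    return max(x) - min(x)


def feature_vector(x, bins=10):
    """Four-element feature vector for one component."""
    return [kurtosis(x), variance(x), shannon_entropy(x, bins), amplitude_range(x)]


# Hypothetical component samples (illustrative values only).
component = [0.0, 0.0, 1.0, 1.0]
features = feature_vector(component, bins=2)
```

    A vector like this, computed for each separated component, is the kind of input a pre-trained SVM could use to label the component as artifactual or not.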

  19. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Background: Cancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free-text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The numerous features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with

  20. Data on Support Vector Machines (SVM) model to forecast photovoltaic power

    Directory of Open Access Journals (Sweden)

    M. Malvoni

    2016-12-01

    The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled “Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data” (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1, 3, 6, 12, and 24 hours ahead and for different data reduction sizes are provided in Supplementary material.

  1. Data on Support Vector Machines (SVM) model to forecast photovoltaic power.

    Science.gov (United States)

    Malvoni, M; De Giorgi, M G; Congedo, P M

    2016-12-01

    The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled "Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data" (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1, 3, 6, 12, and 24 hours ahead and for different data reduction sizes are provided in Supplementary material.

  2. Support vector machine classification trees based on fuzzy entropy of classification.

    Science.gov (United States)

    de Boves Harrington, Peter

    2017-02-15

    The support vector machine (SVM) is a powerful classifier that has recently been implemented in a classification tree (SVMTreeG). This classifier partitioned the data by finding gaps in the data space. For large and complex datasets, there may be no gaps in the data space confounding this type of classifier. A novel algorithm was devised that uses fuzzy entropy to find optimal partitions for situations when clusters of data are overlapped in the data space. Also, a kernel version of the fuzzy entropy algorithm was devised. A fast support vector machine implementation is used that has no cost C or slack variables to optimize. Statistical comparisons using bootstrapped Latin partitions among the tree classifiers were made using a synthetic XOR data set and validated with ten prediction sets comprised of 50,000 objects and a data set of NMR spectra obtained from 12 tea sample extracts.

  3. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    Science.gov (United States)

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study, however it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify the lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, which is evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting the lncRNAs, and it is quite useful for current lncRNA study.

  4. Automated Classification of Epiphyses in the Distal Radius and Ulna using a Support Vector Machine.

    Science.gov (United States)

    Wang, Ya-hui; Liu, Tai-ang; Wei, Hua; Wan, Lei; Ying, Chong-liang; Zhu, Guang-you

    2016-03-01

    The aim of this study was to automatically classify epiphyses in the distal radius and ulna using a support vector machine (SVM) and to examine the accuracy of the epiphyseal growth grades generated by the support vector machine. X-ray images of distal radii and ulnae were collected from 140 Chinese teenagers aged between 11.0 and 19.0 years. Epiphyseal growth of the two elements was classified into five grades. Features of each element were extracted using a histogram of oriented gradient (HOG), and models were established using support vector classification (SVC). The prediction results and the validity of the models were evaluated with a cross-validation test and independent test for accuracy (PA). Our findings suggest that this new technique for epiphyseal classification was successful and that an automated technique using an SVM is reliable and feasible, with a relatively high accuracy for the models.

  5. Full-polarization radar remote sensing and data mining for tropical crops mapping: a successful SVM-based classification model

    Science.gov (United States)

    Denize, J.; Corgne, S.; Todoroff, P.; LE Mezo, L.

    2015-12-01

    In Reunion, a tropical island of 2,512 km², 700 km east of Madagascar in the Indian Ocean, constrained by a rugged relief, agricultural sectors are competing in highly fragmented agricultural land constituted by heterogeneous farming systems from corporate to small-scale farming. Policymakers, planners and institutions are in dire need of reliable and updated land use references. Conventional land use mapping methods are inefficient in the tropics because of frequent cloud cover and loosely synchronous vegetative cycles of the crops due to a constant temperature. This study aims to provide an appropriate method for the identification and mapping of tropical crops by remote sensing. For this purpose, we assess the potential of polarimetric SAR imagery associated with machine learning algorithms. The method has been developed and tested on a study area of 25*25 km using 6 full-polarization RADARSAT-2 images acquired in 2014. A set of radar indicators (backscatter coefficient, band ratios, indices, polarimetric decompositions (Freeman-Durden, Van Zyl, Yamaguchi, Cloude and Pottier, Krogager), texture, etc.) was calculated from the coherency matrix. A random forest procedure allowed the selection of the most important variables on each image to reduce the dimension of the dataset and the processing time. Support Vector Machines (SVM) allowed the classification of these indicators based on a learning database created from field observations in 2013. The method shows an overall accuracy of 88% with a Kappa index of 0.82 for the identification of four major crops.
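
    The two summary figures reported here, overall accuracy and the Kappa index, are both derived from a confusion matrix. As a brief sketch of the standard definitions (Cohen's kappa is assumed to be the index meant), the snippet below computes them for a small hypothetical two-class matrix; the counts are invented for illustration and are unrelated to the study's results.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Agreement corrected for chance: (po - pe) / (1 - pe)."""
    total = sum(sum(row) for row in cm)
    po = overall_accuracy(cm)
    # Chance agreement from the row (reference) and column (predicted) totals.
    pe = sum(sum(cm[i]) * sum(row[i] for row in cm)
             for i in range(len(cm))) / total ** 2
    return (po - pe) / (1 - pe)


# Hypothetical confusion matrix: rows = reference class, columns = predicted class.
cm = [[45, 5],
      [10, 40]]
acc = overall_accuracy(cm)
kappa = cohen_kappa(cm)
```

    Kappa is lower than overall accuracy because it discounts the agreement that would be expected from the class proportions alone.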

  6. Conotoxin protein classification using free scores of words and support vector machines

    Directory of Open Access Journals (Sweden)

    Nuel Gregory

    2011-05-01

    Background: Conotoxin has been proven to be effective in drug design and could be used to treat various disorders such as schizophrenia, neuromuscular disorders and chronic pain. With the rapidly growing interest in conotoxin, accurate conotoxin superfamily classification tools are desirable to systematize the increasing number of newly discovered sequences and structures. However, despite the significance and extensive experimental investigations on conotoxin, those tools have not been intensively explored. Results: In this paper, we propose to consider suboptimal alignments of words with restricted length. We developed a scoring system based on local alignment partition functions, called free score. The scoring system plays the key role in the feature extraction step of support vector machine classification. In the classification of conotoxin proteins, our method, SVM-Freescore, features an improved sensitivity and specificity by approximately 5.864% and 3.76%, respectively, over previously reported methods. For the generalization purpose, SVM-Freescore was also applied to classify superfamilies from curated and high quality database such as ConoServer. The average computed sensitivity and specificity for the superfamily classification were found to be 0.9742 and 0.9917, respectively. Conclusions: The SVM-Freescore method is shown to be a useful sequence-based analysis tool for functional and structural characterization of conotoxin proteins. The datasets and the software are available at http://faculty.uaeu.ac.ae/nzaki/SVM-Freescore.htm.

  7. Embedded Hardware-Efficient Real-Time Classification With Cascade Support Vector Machines.

    Science.gov (United States)

    Kyrkou, Christos; Bouganis, Christos-Savvas; Theocharides, Theocharis; Polycarpou, Marios M

    2016-01-01

    Cascade support vector machines (SVMs) are optimized to efficiently handle problems, where the majority of the data belong to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, SVM classification is a computationally demanding task and existing hardware architectures for SVMs only consider monolithic classifiers. This paper proposes the acceleration of cascade SVMs through a hybrid processing hardware architecture optimized for the cascade SVM classification flow, accompanied by a method to reduce the required hardware resources for its implementation, and a method to improve the classification speed utilizing cascade information to further discard data samples. The proposed SVM cascade architecture is implemented on a Spartan-6 field-programmable gate array (FPGA) platform and evaluated for object detection on 800×600 (Super Video Graphics Array) resolution images. The proposed architecture, boosted by a neural network that processes cascade information, achieves a real-time processing rate of 40 frames/s for the benchmark face detection application. Furthermore, the hardware-reduction method results in the utilization of 25% less FPGA custom-logic resources and 20% peak power reduction compared with a baseline implementation.
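
    The speedup of a cascade comes from early rejection: most samples are discarded by the first, cheapest stages and never reach the later ones. The sketch below shows that control flow in software with linear stages; it is a simplified illustration, not the hardware architecture the paper proposes, and the stage weights and inputs are hypothetical.

```python
def linear_score(w, b, x):
    """Decision value of a linear SVM stage: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def cascade_classify(stages, x):
    """Return (accepted, stages_evaluated); stop at the first rejecting stage."""
    for k, (w, b) in enumerate(stages, start=1):
        if linear_score(w, b, x) < 0:
            return False, k
    return True, len(stages)


# Hypothetical two-stage cascade: each stage is (weights, bias).
stages = [([1.0, 0.0], -0.5),
          ([1.0, 1.0], -1.5)]

rejected_early = cascade_classify(stages, [0.0, 5.0])
accepted = cascade_classify(stages, [1.0, 1.0])
rejected_late = cascade_classify(stages, [1.0, 0.0])
```

    When the majority of inputs belong to the negative class, as in sliding-window object detection, the average number of stages evaluated per sample stays close to one.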

  8. Pairwise-Svm for On-Board Urban Road LIDAR Classification

    Science.gov (United States)

    Shu, Zhen; Sun, Kai; Qiu, Kaijin; Ding, Kou

    2016-06-01

    The common method of LiDAR classification is Markov random fields (MRF). Based on the construction of an MRF energy function, spectral and directional features are extracted for on-board urban point clouds. The MRF energy function consists of unary and pairwise potentials. The unary terms are computed by SVM classification. The initial labeling is mainly processed through geometrical shapes. The pairwise potential is estimated by Naïve Bayes. From training data, the probability of adjacent objects is computed by prior knowledge. The final labeling method is reweighted message-passing to minimize the energy function. The MRF model has difficulty handling large-scale misclassification. We propose a super-voxel clustering method for over-segmentation and segment grouping for large objects. Trees, poles, ground, and buildings are classified in this paper. The experimental results show that this method improves the accuracy of classification and speed of computation.

  9. Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

    Science.gov (United States)

    Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam

    2016-11-01

    This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel parameter σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.

  10. A multitemporal probabilistic error correction approach to SVM classification of alpine glacier exploiting sentinel-1 images (Conference Presentation)

    Science.gov (United States)

    Callegari, Mattia; Marin, Carlo; Notarnicola, Claudia; Carturan, Luca; Covi, Federico; Galos, Stephan; Seppi, Roberto

    2016-10-01

    In mountain regions and their forelands, glaciers are a key source of melt water during the middle and late ablation season, when most of the winter snow has already melted. Furthermore, alpine glaciers are recognized as sensitive indicators of climatic fluctuations. Monitoring glacier extent changes and glacier surface characteristics (i.e. snow, firn and bare ice coverage) is therefore important for both hydrological applications and climate change studies. Satellite remote sensing data have been widely employed for glacier surface classification. Many approaches exploit optical data, such as from Landsat. Despite the intuitive visual interpretation of optical images and the demonstrated capability to discriminate glacial surfaces thanks to the combination of different bands, one of the main disadvantages of available high-resolution optical sensors is their dependence on cloud conditions and low revisit time frequency. Therefore, operational monitoring strategies relying only on optical data have serious limitations. Since SAR data are insensitive to clouds, they are potentially a valid alternative to optical data for glacier monitoring. Compared to past SAR missions, the new Sentinel-1 mission provides much higher revisit time frequency (two acquisitions every 12 days) over the entire European Alps, and this number will be doubled once Sentinel-1B is in orbit (April 2016). In this work we present a method for glacier surface classification by exploiting dual polarimetric Sentinel-1 data. The method consists of a supervised approach based on Support Vector Machine (SVM). In addition to the VV and VH signals, we tested the contribution of local incidence angle, extracted from a digital elevation model and orbital information, as an auxiliary input feature in order to account for the topographic effects. By exploiting impossible temporal transition between different classes (e.g. if at a given date one pixel is classified as rock it cannot be classified as

  11. miRFam: an effective automatic miRNA classification method based on n-grams and a multiclass SVM

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-05-01

    Background: MicroRNAs (miRNAs) are ~22 nt long integral elements responsible for post-transcriptional control of gene expressions. After the identification of thousands of miRNAs, the challenge is now to explore their specific biological functions. To this end, it will be greatly helpful to construct a reasonable organization of these miRNAs according to their homologous relationships. Given an established miRNA family system (e.g. the miRBase family organization), this paper addresses the problem of automatically and accurately classifying newly found miRNAs to their corresponding families by supervised learning techniques. Concretely, we propose an effective method, miRFam, which uses only primary information of pre-miRNAs or mature miRNAs and a multiclass SVM, to automatically classify miRNA genes. Results: An existing miRNA family system prepared by miRBase was downloaded online. We first employed n-grams to extract features from known precursor sequences, and then trained a multiclass SVM classifier to classify new miRNAs (i.e. their families are unknown). Comparing with miRBase's sequence alignment and manual modification, our study shows that the application of machine learning techniques to miRNA family classification is a general and more effective approach. When the testing dataset contains more than 300 families (each of which holds no less than 5 members), the classification accuracy is around 98%. Even with the entire miRBase15 (1056 families and more than 650 of them hold less than 5 samples), the accuracy surprisingly reaches 90%. Conclusions: Based on experimental results, we argue that miRFam is suitable for application as an automated method of family classification, and it is an important supplementary tool to the existing alignment-based small non-coding RNA (sncRNA) classification methods, since it only requires primary sequence information. Availability: The source code of miRFam, written in C++, is freely and publicly

  12. A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals.

    Science.gov (United States)

    Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian

    2014-06-27

    Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the R.A.L.E database.
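
    The confusion-matrix evaluation mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the label names mirror the three groups in the abstract, while the example predictions are invented purely to show the bookkeeping.

```python
# Class names follow the three groups named in the abstract.
LABELS = ["normal", "airway_obstruction", "parenchymal"]


def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] counts samples whose true class is labels[i] and predicted class is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """Recall of each class: diagonal entry divided by the row total."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Invented predictions for illustration only (not data from the study).
y_true = ["normal", "normal", "airway_obstruction", "airway_obstruction",
          "parenchymal", "parenchymal"]
y_pred = ["normal", "airway_obstruction", "airway_obstruction", "airway_obstruction",
          "parenchymal", "normal"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
```

    Running the same functions on the outputs of two classifiers (for instance an SVM and a K-nn model) gives a like-for-like comparison of overall accuracy and per-class recall.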

  13. Joint application of feature extraction based on EMD-AR strategy and multi-class classifier based on LS-SVM in EMG motion classification

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    This paper presents an effective and efficient combination of feature extraction and multi-class classifier for motion classification by analyzing surface electromyographic (sEMG) signals. In contrast to existing methods, and considering the non-stationary and nonlinear characteristics of EMG signals, we obtain a more separable feature set by introducing empirical mode decomposition (EMD) to decompose the original EMG signals into several intrinsic mode functions (IMFs) and then computing the coefficients of autoregressive (AR) models of each IMF to form the feature set. Based on least squares support vector machines (LS-SVMs), the multi-class classifier is designed and constructed to classify various motions. The results of contrastive experiments showed that the accuracy of motion recognition is improved with the described classification scheme. Furthermore, compared with other classifiers using different features, the excellent performance indicated the potential of SVM techniques embedding the EMD-AR kernel in motion classification.

  14. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    Science.gov (United States)

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, urban-planning authorities need a range of tools that enable such a task to be performed. An essential step in managing urban areas from a sound standpoint is the evaluation of the soundscape in such an area. It has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step in evaluating it, providing a basis for designing or adapting it to match people's expectations. Accordingly, this work proposes a model for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. This classification model is proposed as a tool for comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing the classification model. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified).

  15. A Fast SVM-Based Tongue’s Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Directory of Open Access Journals (Sweden)

    Nur Diyana Kamarudin

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye’s ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue’s multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  16. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Science.gov (United States)

    Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.
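
    The first-stage clustering described above can be illustrated with a small k-means (Lloyd's algorithm) sketch in plain Python. The RGB pixel values and the four initial centroids below are invented for illustration; the abstract does not state the colour space, the initialisation, or any other implementation detail, so none of this should be read as the authors' procedure.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm with caller-supplied initial centroids (deterministic)."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, assignment


# Hypothetical RGB pixels: background, deep red, red/light red, transitional.
pixels = [
    (0, 0, 0), (5, 5, 5),
    (120, 10, 20), (130, 15, 25),
    (220, 80, 90), (230, 90, 100),
    (180, 140, 150), (170, 130, 140),
]
initial = [(0, 0, 0), (125, 0, 0), (225, 85, 95), (175, 135, 145)]

centroids, assignment = kmeans(pixels, initial)
```

    In a two-stage scheme of this kind, only the pixels assigned to the ambiguous cluster would need to be passed on to the second-stage classifier, which is one plausible way to reduce the amount of data the SVM has to handle.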

  17. Web Document Classification Algorithm Based on Manifold Learning and SVM

    Institute of Scientific and Technical Information of China (English)

    王自强; 钱旭

    2009-01-01

    To efficiently resolve the Web document classification problem, a novel Web document classification algorithm based on manifold learning and Support Vector Machine (SVM) is proposed. The high-dimensional Web document space in the training set is non-linearly reduced to a lower-dimensional space with the manifold learning algorithm LPP, so that the meaningful low-dimensional structure hidden in the high-dimensional observational data can be discovered. Classification and prediction in the lower-dimensional feature space are implemented with an optimized SVM based on multiplicative update rules. Experimental results show that the algorithm achieves higher classification accuracy with less running time.

  18. Highly predictive support vector machine (SVM) models for anthrax toxin lethal factor (LF) inhibitors.

    Science.gov (United States)

    Zhang, Xia; Amin, Elizabeth Ambrose

    2016-01-01

    Anthrax is a highly lethal, acute infectious disease caused by the rod-shaped, Gram-positive bacterium Bacillus anthracis. The anthrax toxin lethal factor (LF), a zinc metalloprotease secreted by the bacilli, plays a key role in anthrax pathogenesis and is chiefly responsible for anthrax-related toxemia and host death, partly via inactivation of mitogen-activated protein kinase kinase (MAPKK) enzymes and consequent disruption of key cellular signaling pathways. Antibiotics such as fluoroquinolones are capable of clearing the bacilli but have no effect on LF-mediated toxemia; LF itself therefore remains the preferred target for toxin inactivation. However, currently no LF inhibitor is available on the market as a therapeutic, partly due to the insufficiency of existing LF inhibitor scaffolds in terms of efficacy, selectivity, and toxicity. In the current work, we present novel support vector machine (SVM) models with high prediction accuracy that are designed to rapidly identify potential novel, structurally diverse LF inhibitor chemical matter from compound libraries. These SVM models were trained and validated using 508 compounds with published LF biological activity data and 847 inactive compounds deposited in the PubChem BioAssay database. One model, M1, demonstrated particularly favorable selectivity toward highly active compounds by correctly predicting 39 (95.12%) out of 41 nanomolar-level LF inhibitors, 46 (93.88%) out of 49 inactives, and 844 (99.65%) out of 847 PubChem inactives in external, unbiased test sets. These models are expected to facilitate the prediction of LF inhibitory activity for existing molecules, as well as identification of novel potential LF inhibitors from large datasets.

  19. SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machines (SVMs) have shown excellent performance in classification and prediction and are widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was combined with the SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases.

  20. SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machines (SVMs) have shown excellent performance in classification and prediction and are widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was combined with the SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases. PMID:25295306

  1. A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method.

    Science.gov (United States)

    Zhou, Shu; Li, Guo-Bo; Huang, Lu-Yi; Xie, Huan-Zhang; Zhao, Ying-Lan; Chen, Yu-Zong; Li, Lin-Li; Yang, Sheng-Yong

    2014-08-01

    Drug-induced ototoxicity, as a toxic side effect, is an important issue that needs to be considered in drug discovery. Nevertheless, current experimental methods used to evaluate drug-induced ototoxicity are often time-consuming and expensive, indicating that they are not suitable for a large-scale evaluation of drug-induced ototoxicity in the early stage of drug discovery. In this investigation, we therefore established an effective computational prediction model of drug-induced ototoxicity using an optimal support vector machine (SVM) method, GA-CG-SVM. Three GA-CG-SVM models were developed based on three training sets containing agents bearing different risk levels of drug-induced ototoxicity. For comparison, models based on naïve Bayesian (NB) and recursive partitioning (RP) methods were also built on the same training sets. Among all the prediction models, the GA-CG-SVM model II showed the best performance, offering prediction accuracies of 85.33% and 83.05% for two independent test sets, respectively. Overall, the good performance of the GA-CG-SVM model II indicates that it could be used for the prediction of drug-induced ototoxicity in the early stage of drug discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Photometric Supernova Classification With Machine Learning

    CERN Document Server

    Lochner, Michelle; Peiris, Hiranya V; Lahav, Ofer; Winter, Max K

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope (LSST), given that spectroscopic confirmation of type for all supernovae discovered with these surveys will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques fitting parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks and boosted decision trees. We test the pipeline on simulated multi-ba...

  3. Feature selection and classification of multiparametric medical images using bagging and SVM

    Science.gov (United States)

    Fan, Yong; Resnick, Susan M.; Davatzikos, Christos

    2008-03-01

    This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.

  4. Machine Learning for Biological Trajectory Classification Applications

    Science.gov (United States)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.

  5. Galaxy Classification using Machine Learning

    Science.gov (United States)

    Fowler, Lucas; Schawinski, Kevin; Brandt, Ben-Elias; Widmer, Nicole

    2017-01-01

    We present our current research into the use of machine learning to classify galaxy imaging data with various convolutional neural network configurations in TensorFlow. We are investigating how five-band Sloan Digital Sky Survey imaging data can be used to train on physical properties such as redshift, star formation rate, mass and morphology. We also investigate the performance of artificially redshifted images in recovering physical properties as image quality degrades.

  6. SVM for Solving Forward Problems of EIT.

    Science.gov (United States)

    Wu, Youxi; Li, Ying; Guo, Lei; Yan, Weili; Shen, Xueqin; Fu, Kun

    2005-01-01

    Support Vector Machine (SVM) can be seen as a new machine learning method based on the idea of VC dimension and the principle of structural risk minimization rather than empirical risk minimization. SVM can be used for classification and regression. Support Vector Regression (SVR) is a very important branch of Support Vector Machine. Partial Differential Equations (PDEs) have been successfully treated by using SVR in previous works. The forward problems of EIT are the basis of EIT inverse problems, and the essence of the forward problem is to solve PDEs. The method has been successfully tested on the forward problems of EIT and has yielded accurate results.

  7. SVM CLASSIFICATION: ITS CONTENTS AND CHALLENGES

    Institute of Scientific and Technical Information of China (English)

    Yue Shihong; Li Ping; Hao Peiyi

    2003-01-01

    SVMs (support vector machines) have become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. In particular, they exhibit good generalization performance on many real-world problems, and the approach is well motivated theoretically. There are relatively few free parameters to adjust, and the architecture of the learning machine does not need to be found by experimentation. This paper presents a survey of the key contents of this subject, focusing on the most well-known models based on kernel substitution, namely SVM, as well as the currently active fields and development trends.

  8. Machine learning algorithms for mode-of-action classification in toxicity assessment.

    Science.gov (United States)

    Zhang, Yile; Wong, Yau Shu; Deng, Jian; Anton, Cristina; Gabos, Stephan; Zhang, Weiping; Huang, Dorothy Yu; Jin, Can

    2016-01-01

    Real Time Cell Analysis (RTCA) technology is used to monitor cellular changes continuously over the entire exposure period. Combined with different test concentrations, the profiles have potential in probing the mode of action (MOA) of the tested substances. In this paper, we present machine learning approaches for MOA assessment. Computational tools based on artificial neural network (ANN) and support vector machine (SVM) are developed to analyze the time-concentration response curves (TCRCs) of human cell lines responding to tested chemicals. The techniques are capable of learning from given TCRCs with known MOA information and then making MOA classifications for unknown toxicity. A novel data processing step based on wavelet transform is introduced to extract important features from the original TCRC data. From the dose response curves, the time interval leading to a higher classification success rate can be selected as input to enhance the performance of the machine learning algorithm. This is particularly helpful when handling cases with limited and imbalanced data. The validation of the proposed method is demonstrated by the supervised learning algorithm applied to the exposure data of the HepG2 cell line to 63 chemicals with 11 concentrations in each test case. Classification success rates in the range of 85 to 95% are obtained using SVM for MOA classification in cases ranging from two clusters up to four clusters. Wavelet transform is capable of capturing important features of TCRCs for MOA classification. The proposed SVM scheme incorporated with wavelet transform has great potential for large-scale MOA classification and high-throughput chemical screening.
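
    As a rough illustration of wavelet-based feature extraction from a response curve, the sketch below applies a multi-level Haar transform in plain Python. The Haar wavelet, the number of levels, and the sample curve are all assumptions made for this example; the abstract does not say which wavelet family or decomposition depth was used.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal, levels):
    """Feature vector: final approximation followed by details, coarsest level first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    features = list(approx)
    for detail in reversed(details):
        features.extend(detail)
    return features


# Hypothetical normalised cell-index readings over time (not data from the study).
curve = [1.0, 1.0, 0.8, 0.6, 0.4, 0.3, 0.2, 0.2]
features = haar_features(curve, levels=2)
```

    The leading coefficients summarise the overall trend of the curve while the trailing ones capture local changes, so a classifier can be given a truncated prefix of this vector as a compact description of each curve.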

  9. Graph segmentation and support vector machines for bare earth classification from lidar

    Science.gov (United States)

    Shorter, Nicholas S.; Smith, O'Neil; Smith, Philip; Rahmes, Mark

    2014-06-01

    A novel approach using a support vector machine (SVM) is proposed to classify bare earth points in LiDAR point clouds. Using graph based segmentation, the LiDAR point cloud is segmented into a set of topological components. Several features establishing relationships from those components to their neighboring components are formulated. The SVM is then trained on the segment features to establish a model for the classification of bare earth and non bare earth points. Quantitative results are presented for training and testing the proposed SVM classifier on the ISPRS data set. Using the ISPRS data set as a training set, qualitative results are presented by testing the proposed SVM classifier on data downloaded from Open Topography; which covers a variety of different landscapes and building structures in Frazier Park, California. Despite the data being captured from different sensors, and collected from scenes with different terrain types and building structures, the results shown were processed with no parameter changes. Furthermore, a confidence value is returned indicating how well the unforeseen data fits the SVM's trained model for bare earth recognition.

  10. Performance and optimization of support vector machines in high-energy physics classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Sahin, Mehmet Oezguer; Kruecker, Dirk; Melzer-Pellmann, Isabell [DESY, Hamburg (Germany)

    2016-07-01

    In this talk, the use of Support Vector Machines (SVM) is promoted for new-physics searches in high-energy physics. We developed an interface, called SVM HEP Interface (SVM-HINT), for a popular SVM library, LibSVM, and introduced a statistical-significance based hyper-parameter optimization algorithm for new-physics searches. As an example case study, a search for Supersymmetry at the Large Hadron Collider is given to demonstrate the capabilities of SVM using SVM-HINT.

  11. Performance and optimization of support vector machines in high-energy physics classification problems

    Science.gov (United States)

    Sahin, M. Ö.; Krücker, D.; Melzer-Pellmann, I.-A.

    2016-12-01

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications.

  12. FEATURE RANKING BASED NESTED SUPPORT VECTOR MACHINE ENSEMBLE FOR MEDICAL IMAGE CLASSIFICATION.

    Science.gov (United States)

    Varol, Erdem; Gaonkar, Bilwaj; Erus, Guray; Schultz, Robert; Davatzikos, Christos

    2012-01-01

    This paper presents a method for classification of structural magnetic resonance images (MRI) of the brain. An ensemble of linear support vector machine classifiers (SVMs) is used for classifying a subject as either patient or normal control. Image voxels are first ranked based on the voxel wise t-statistics between the voxel intensity values and class labels. Then voxel subsets are selected based on the rank value using a forward feature selection scheme. Finally, an SVM classifier is trained on each subset of image voxels. The class label of a test subject is calculated by combining individual decisions of the SVM classifiers using a voting mechanism. The method is applied for classifying patients with neurological diseases such as Alzheimer's disease (AD) and autism spectrum disorder (ASD). The results on both datasets demonstrate superior performance as compared to two state of the art methods for medical image classification.
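
    The ranking, nested-subset, and voting steps described above can be sketched in plain Python as follows. Two simplifications are assumed here and are not from the paper: a nearest-centroid rule stands in for the linear SVM base classifier so that the example stays dependency-free, and the subsets are simply the top-k ranked features rather than the output of the forward selection scheme. The toy data are invented.

```python
import math
from statistics import mean, variance


def t_statistic(values_a, values_b):
    """Welch two-sample t-statistic for a single feature."""
    se = math.sqrt(variance(values_a) / len(values_a) + variance(values_b) / len(values_b))
    return (mean(values_a) - mean(values_b)) / se if se else 0.0


def rank_features(X, y):
    """Feature indices sorted by decreasing |t| between class 1 and class 0."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 1]
        b = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(t_statistic(a, b)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def fit_centroids(X, y, subset):
    """Stand-in base learner (NOT an SVM): per-class centroid on the chosen features."""
    centroids = {}
    for label in (0, 1):
        rows = [[row[j] for j in subset] for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroids(centroids, subset, x):
    point = [x[j] for j in subset]
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


def nested_ensemble_predict(X, y, x, sizes):
    """Train one base learner per nested top-k subset and combine by majority vote."""
    ranking = rank_features(X, y)
    votes = []
    for k in sizes:
        subset = ranking[:k]
        votes.append(predict_centroids(fit_centroids(X, y, subset), subset, x))
    return 1 if sum(votes) * 2 > len(votes) else 0


# Toy data: feature 0 separates the classes, features 1 and 2 are mostly noise.
X = [[5.0, 1.0, 0.3], [5.2, 1.2, 0.9], [4.8, 0.9, 0.1],
     [1.0, 0.8, 0.2], [1.2, 1.1, 0.8], [0.9, 1.0, 0.4]]
y = [1, 1, 1, 0, 0, 0]
```

    Calling `nested_ensemble_predict(X, y, [4.9, 1.0, 0.5], sizes=[1, 2, 3])` trains three base learners on the top one, two, and three ranked features and returns the majority label.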

  13. FINGERPRINT CLASSIFICATION BASED ON RECURSIVE NEURAL NETWORK WITH SUPPORT VECTOR MACHINE

    Directory of Open Access Journals (Sweden)

    T. Chakravarthy

    2011-01-01

    Fingerprint classification based on a statistical and structural (RNN and SVM) approach. RNNs are trained on a structured representation of the fingerprint image. They are also used to extract a set of distributed features of the fingerprint, which can be integrated into the support vector machine. SVMs are combined with a new error-correcting codes scheme. This approach has two main advantages: (a) it can tolerate the presence of ambiguous fingerprint images in the training set, and (b) it can effectively identify the most difficult fingerprint images in the test set. In experiments on the NIST-4 fingerprint database (National Institute of Standards and Technology), our best classification accuracy of 94.7% is obtained by training the SVM on both FingerCode and RNN-extracted features from a segmentation algorithm that uses a very sophisticated “region growing process”.

  14. Application of SVM in the classification of meat freshness

    Institute of Scientific and Technical Information of China (English)

    刘静; 管骁

    2011-01-01

    Several fresh agricultural products, including pork, beef, mutton and shrimp samples, were stored under reduced-pressure (hypobaric) conditions, and the total volatile basic nitrogen (TVB-N) content, total bacterial count, pH value and sensory scores of these samples were determined at different storage times in order to classify their freshness correctly. The experiments showed that it was difficult to obtain ideal classification accuracy from any single physicochemical or sensory property. Therefore, a support vector machine (SVM) was used to train on the combined experimental data, with its parameters optimized by rough selection followed by precise selection. The resulting SVM model could be used to predict meat freshness with high classification accuracy.
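
    One common reading of "rough and precise selection" is a coarse grid search over the SVM parameters followed by a finer search around the coarse optimum. The sketch below shows that strategy in plain Python. The exponent ranges, step sizes, and the `placeholder_score` function are assumptions for illustration; in practice the score would be a cross-validated accuracy of an SVM trained with the given C and gamma, and the abstract does not state which ranges the authors searched.

```python
import math


def frange(start, stop, step):
    """Inclusive range of floats from start to stop."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def grid_search(score, log2_c_values, log2_g_values):
    """Return (best_score, log2_C, log2_gamma) over a grid of exponents."""
    best = None
    for log2_c in log2_c_values:
        for log2_g in log2_g_values:
            value = score(2.0 ** log2_c, 2.0 ** log2_g)
            if best is None or value > best[0]:
                best = (value, log2_c, log2_g)
    return best


def coarse_to_fine(score):
    # Rough pass: wide exponent ranges with a step of 2 (assumed ranges).
    _, log2_c, log2_g = grid_search(score, frange(-5, 15, 2), frange(-15, 3, 2))
    # Precise pass: +/-2 around the rough optimum with a step of 0.25.
    return grid_search(
        score,
        frange(log2_c - 2, log2_c + 2, 0.25),
        frange(log2_g - 2, log2_g + 2, 0.25),
    )


def placeholder_score(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy; peaks at C=2^3.5, gamma=2^-6.25."""
    return 1.0 - 0.01 * ((math.log2(c) - 3.5) ** 2 + (math.log2(gamma) + 6.25) ** 2)


best_score, best_log2_c, best_log2_g = coarse_to_fine(placeholder_score)
```

    The rough pass evaluates 110 parameter pairs and the precise pass 289, far fewer than a single fine grid over the whole range would require.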

  15. Density-based penalty parameter optimization on C-SVM.

    Science.gov (United States)

    Liu, Yun; Lian, Jie; Bartolacci, Michael R; Zeng, Qing-An

    2014-01-01

    The support vector machine (SVM) is one of the most widely used approaches for data classification and regression. SVM achieves the largest distance between the positive and negative support vectors, which neglects the remote instances away from the SVM interface. In order to avoid a position change of the SVM interface as the result of a system outlier, C-SVM was introduced to decrease the influence of the system's outliers. Traditional C-SVM holds a uniform parameter C for both positive and negative instances; however, according to the different number proportions and the data distribution, positive and negative instances should be set with different weights for the penalty parameter of the error terms. Therefore, in this paper, we propose density-based penalty parameter optimization of C-SVM. The experimental results indicated that our proposed algorithm has outstanding performance with respect to both precision and recall.
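
    For reference, the idea of separate penalties for positive and negative instances can be written as the standard class-weighted soft-margin problem below. The symbols C_+ and C_- and the feature map φ are notation assumed here; how the paper derives the two penalties from data density is not given in the abstract.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2}
  + C_{+}\sum_{i:\,y_i=+1}\xi_i
  + C_{-}\sum_{i:\,y_i=-1}\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,
  \qquad \xi_i \ge 0,\quad i=1,\dots,N .
\end{aligned}
```

    Setting C_+ = C_- = C recovers the traditional C-SVM with a uniform penalty.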

  16. Quantum-inspired evolutionary tuning of SVM parameters

    Institute of Scientific and Technical Information of China (English)

    Zhiyong Luo; Ping Wang; Yinguo Li; Wenfeng Zhang; Wei Tang; Min Xiang

    2008-01-01

    The most commonly used parameter selection method for support vector machines (SVM) is cross-validation, which requires long and complicated calculation. In this paper, a novel approach to tuning the regularization parameter and the kernel parameter of SVM is presented, based on a quantum-inspired evolutionary algorithm (QEA). QEA with quantum chromosome and quantum mutation has better global search capacity. The parameters of least squares support vector machines (LS-SVM) can be adjusted using quantum-inspired evolutionary optimization. Classification and function estimation are studied using LS-SVM with wavelet kernel and Gaussian kernel. The simulation results show that the proposed approach can effectively tune the parameters of LS-SVM, and the improved LS-SVM with wavelet kernel can provide better precision.
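
    To make the two tuned quantities concrete, the standard LS-SVM classifier can be written as the linear system below, where γ is the regularization parameter and σ is the width of a Gaussian kernel. The notation and the particular kernel parameterisation are assumptions for this sketch; the abstract does not give the formulas used in the paper.

```latex
\begin{bmatrix} 0 & y^{\top} \\ y & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{1}_N \end{bmatrix},
\qquad
\Omega_{ij} = y_i\, y_j\, K(x_i, x_j),
\qquad
K(x, x') = \exp\!\left(-\frac{\lVert x - x'\rVert^{2}}{2\sigma^{2}}\right)
```

```latex
\hat{y}(x) = \operatorname{sign}\!\left(\sum_{i=1}^{N} \alpha_i\, y_i\, K(x, x_i) + b\right)
```

    Each candidate pair (γ, σ) proposed by a search procedure therefore costs one linear solve plus an evaluation of the resulting classifier, which is what makes an efficient global search over the pair attractive.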

  17. Kernel-based machine learning techniques for infrasound signal classification

    Science.gov (United States)

    Tuma, Matthias; Igel, Christian; Mialle, Pierrick

    2014-05-01

    Infrasound monitoring is one of four remote sensing technologies continuously employed by the CTBTO Preparatory Commission. The CTBTO's infrasound network is designed to monitor the Earth for potential evidence of atmospheric or shallow underground nuclear explosions. Upon completion, it will comprise 60 infrasound array stations distributed around the globe, of which 47 were certified in January 2014. Three stages can be identified in CTBTO infrasound data processing: automated processing at the level of single array stations, automated processing at the level of the overall global network, and interactive review by human analysts. At station level, the cross correlation-based PMCC algorithm is used for initial detection of coherent wavefronts. It produces estimates for trace velocity and azimuth of incoming wavefronts, as well as other descriptive features characterizing a signal. Detected arrivals are then categorized into potentially treaty-relevant versus noise-type signals by a rule-based expert system. This corresponds to a binary classification task at the level of station processing. In addition, incoming signals may be grouped according to their travel path in the atmosphere. The present work investigates automatic classification of infrasound arrivals by kernel-based pattern recognition methods. It aims to explore the potential of state-of-the-art machine learning methods vis-a-vis the current rule-based and task-tailored expert system. To this purpose, we first address the compilation of a representative, labeled reference benchmark dataset as a prerequisite for both classifier training and evaluation. Data representation is based on features extracted by the CTBTO's PMCC algorithm. As classifiers, we employ support vector machines (SVMs) in a supervised learning setting. Different SVM kernel functions are used and adapted through different hyperparameter optimization routines. The resulting performance is compared to several baseline classifiers. 

  18. Multi-class Classification Method Based on SVM Probability Output and Evidence Theory

    Institute of Scientific and Technical Information of China (English)

    权文; 王晓丹; 王坚; 张玉玺

    2012-01-01

    A single technique cannot effectively solve the multi-class classification problem. To address this, a basic probability assignment output method based on the One-Against-All (OAA) Support Vector Machine (SVM) is proposed. Combining it with the D-S evidence combination method of the confidence maximum-entropy model yields a multi-class classification model based on SVM probability output and evidence theory. Simulation results on three UCI benchmark datasets show that the method achieves higher classification accuracy than the traditional hard-output One-Against-All (OAA) and One-Against-One (OAO) methods, making it an effective multi-class classification method.
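
    The evidence-combination step can be illustrated with Dempster's rule. The sketch below is not the authors' model: it omits the maximum-entropy component, and the mass assignment in `oaa_mass` (probability p on the single class, the remaining 1 - p on the whole frame) is an assumption made only for illustration. All function names are hypothetical.

```python
from functools import reduce
from itertools import product


def oaa_mass(label, prob, frame):
    """Assumed mass function for one one-against-all classifier:
    `prob` supports {label}; the rest is left on the whole frame (ignorance)."""
    return {frozenset([label]): prob, frame: 1.0 - prob}


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections,
    and renormalise by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def classify(probs):
    """Fuse the per-class probability outputs and pick the best singleton."""
    frame = frozenset(probs)
    masses = [oaa_mass(lbl, p, frame) for lbl, p in probs.items()]
    fused = reduce(dempster_combine, masses)
    singletons = {next(iter(k)): v for k, v in fused.items() if len(k) == 1}
    return max(singletons, key=singletons.get), fused


label, fused = classify({"A": 0.7, "B": 0.2, "C": 0.1})
print(label, fused)
```

    Unlike a hard-output vote, the fused masses keep a measure of residual ignorance (the mass left on the full frame), which can be used to flag uncertain samples.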

  19. LHCb: Machine assisted histogram classification

    CERN Multimedia

    Somogyi, P; Gaspar, C

    2009-01-01

    LHCb is one of the four major experiments under completion at the Large Hadron Collider (LHC). Monitoring the quality of the acquired data is important, because it allows the verification of the detector performance. Anomalies, such as missing values or unexpected distributions can be indicators of a malfunctioning detector, resulting in poor data quality. Spotting faulty components can be either done visually using instruments such as the LHCb Histogram Presenter, or by automated tools. In order to assist detector experts in handling the vast monitoring information resulting from the sheer size of the detector, a graph-theoretic based clustering tool, combined with machine learning algorithms is proposed and demonstrated by processing histograms representing 2D event hitmaps. The concept is proven by detecting ion feedback events in the LHCb RICH subdetector.

  20. A learning-based similarity fusion and filtering approach for biomedical image retrieval using SVM classification and relevance feedback.

    Science.gov (United States)

    Rahman, Md Mahmudur; Antani, Sameer K; Thoma, George R

    2011-07-01

    This paper presents a classification-driven biomedical image retrieval framework based on image filtering and similarity fusion by employing supervised learning techniques. In this framework, the probabilistic outputs of a multiclass support vector machine (SVM) classifier as category prediction of query and database images are exploited at first to filter out irrelevant images, thereby reducing the search space for similarity matching. Images are classified at a global level according to their modalities based on different low-level, concept, and keypoint-based features. It is difficult to find a unique feature to compare images effectively for all types of queries. Hence, a query-specific adaptive linear combination of similarity matching approach is proposed by relying on the image classification and feedback information from users. Based on the prediction of a query image category, individual precomputed weights of different features are adjusted online. The prediction of the classifier may be inaccurate in some cases and a user might have a different semantic interpretation about retrieved images. Hence, the weights are finally determined by considering both precision and rank order information of each individual feature representation by considering top retrieved relevant images as judged by the users. As a result, the system can adapt itself to individual searches to produce query-specific results. Experiment is performed in a diverse collection of 5 000 biomedical images of different modalities, body parts, and orientations. It demonstrates the efficiency (about half computation time compared to search on entire collection) and effectiveness (about 10%-15% improvement in precision at each recall level) of the retrieval approach.

  1. Studies on the Classification of Pork Freshness by SVM

    Institute of Scientific and Technical Information of China (English)

    刘静; 管骁

    2011-01-01

    Pork freshness is a major food-safety issue affecting public health. In this paper, fresh pork samples were stored in a decompression (reduced-pressure) storage room. The total volatile basic nitrogen (TVB-N) content, total bacterial count, pH value and sensory scores of the samples were determined at different storage stages. Support vector machine (SVM) neural network models were obtained by training on the sample data with different kernel functions and cross-validation. The models were then used to predict the freshness class of the test pork samples. The experimental results suggest that, with data preprocessing matched to the sample characteristics and a suitable kernel function, the SVM neural network achieves a very high correct classification rate for pork freshness.

  2. A Hybrid Prediction Method of Thermal Extension Error for Boring Machine Based on PCA and LS-SVM

    Directory of Open Access Journals (Sweden)

    Cheng Qiang

    2017-01-01

    Full Text Available Thermal extension error of the boring bar along the z-axis is one of the key factors degrading the machining accuracy of a boring machine. Accurately establishing the relationship between thermal extension length and temperature, and predicting how the thermal error changes, are therefore prerequisites for thermal extension error compensation. In this paper, a method for predicting the thermal extension length of the boring bar is proposed based on principal component analysis (PCA) and a least squares support vector machine (LS-SVM) model. To avoid multicollinearity and coupling among the large number of temperature input variables, PCA is first used to extract the principal components of the temperature data samples. LS-SVM is then used to predict the trend of the thermally induced extension error of the boring bar. Finally, experiments are conducted on a boring machine; the results show that the residual of the boring bar's axial thermal elongation error dropped below 5 μm, with a minimum residual error of only 0.5 μm. The method not only improves the efficiency of temperature data acquisition and analysis, but also improves modeling accuracy and robustness.

  3. Photometric Supernova Classification with Machine Learning

    Science.gov (United States)

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
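
    The AUC metric quoted above has a simple probabilistic reading: it is the chance that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal pure-Python sketch of that definition follows; the function name and the toy labels and scores are illustrative and not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels; scores: classifier scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

    The quadratic pairwise loop is kept for clarity; a rank-based formulation gives the same value in O(n log n) for large samples.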

  4. Diagnostic accuracy of Parkinson disease by support vector machine (SVM) analysis of 123I-FP-CIT brain SPECT data: implications of putaminal findings and age.

    Science.gov (United States)

    Palumbo, Barbara; Fravolini, Mario Luca; Buresta, Tommaso; Pompili, Filippo; Forini, Nevio; Nigro, Pasquale; Calabresi, Paolo; Tambasco, Nicola

    2014-12-01

    Brain single-photon-emission-computerized tomography (SPECT) with 123I-ioflupane (123I-FP-CIT) is useful to diagnose Parkinson disease (PD). To investigate the diagnostic performance of 123I-FP-CIT brain SPECT with semiquantitative analysis by Basal Ganglia V2 software (BasGan), we evaluated semiquantitative data of patients with suspected PD by a support vector machine classifier (SVM), a powerful supervised classification algorithm. 123I-FP-CIT SPECT with BasGan analysis was performed in 90 patients with suspected PD showing mild symptoms (bradykinesia-rigidity and mild tremor). PD was confirmed in 56 patients; 34 were found to be non-PD (essential tremor and drug-induced Parkinsonism). A clinical follow-up of at least 6 months confirmed the diagnosis. To investigate BasGan diagnostic performance we trained SVM classification models featuring different descriptors using both a "leave-one-out" and a "five-fold" method. In the first study we used as class descriptors the semiquantitative radiopharmaceutical uptake values in the left (L) and right (R) putamen (P) and in the L and R caudate nucleus (C) for a total of 4 descriptors (CL, CR, PL, PR). In the second study each patient was described only by CL and CR, while in the third by PL and PR descriptors. Age was added as a further descriptor to evaluate its influence on the classification performance. 123I-FP-CIT SPECT with BasGan analysis reached a classification performance higher than 73.9% in all the models. Considering the "leave-one-out" method, PL and PR were better predictors (accuracy of 91% for all patients) than CL and CR descriptors; using PL, PR, CL, and CR, diagnostic accuracy was similar to that of PL and PR descriptors in the different groups. When age was added as a further descriptor, accuracy improved in all the models. The best results were obtained by using all the 5 descriptors both in PD and non-PD subjects (CR and CL + PR and PL + age = 96.4% and 94.1%, respectively). Similar results were observed for the "five

  5. Multiple mental tasks classification based on nonlinear parameter of mean period using support vector machines

    Institute of Scientific and Technical Information of China (English)

    Liu Hailong; Wang Jue; Zheng Chongxun

    2007-01-01

    Mental task classification is one of the most important problems in Brain-computer interface. This paper studies the classification of five-class mental tasks. The nonlinear parameter of mean period obtained from frequency domain information was used as features for classification implemented by using the method of SVM (support vector machines). The averaged classification accuracy of 85.6% over 7 subjects was achieved for 2-second EEG segments. And the results for EEG segments of 0.5s and 5.0s compared favorably to those of Garrett's. The results indicate that the parameter of mean period represents mental tasks well for classification. Furthermore, the method of mean period is less computationally demanding, which indicates its potential use for online BCI systems.

  6. Design and implementation of an SVM-based computer classification system for discriminating depressive patients from healthy controls using the P600 component of ERP signals.

    Science.gov (United States)

    Kalatzis, I; Piliouras, N; Ventouras, E; Papageorgiou, C C; Rabavilas, A D; Cavouras, D

    2004-07-01

    A computer-based classification system has been designed capable of distinguishing patients with depression from normal controls by event-related potential (ERP) signals using the P600 component. Clinical material comprised 25 patients with depression and an equal number of gender- and age-matched healthy controls. All subjects were evaluated by a computerized version of the digit span Wechsler test. EEG activity was recorded and digitized from 15 scalp electrodes (leads). Seventeen features related to the shape of the waveform were generated and were employed in the design of an optimum support vector machine (SVM) classifier at each lead. The outcomes of those SVM classifiers were selected by a majority-vote engine (MVE), which assigned each subject to either the normal or depressive classes. MVE classification accuracy was 94% when using all leads and 92% or 82% when using only the right or left scalp leads, respectively. These findings support the hypothesis that depression is associated with dysfunction of right hemisphere mechanisms mediating the processing of information that assigns a specific response to a specific stimulus, as those mechanisms are reflected by the P600 component of ERPs. Our method may aid the further understanding of the neurophysiology underlying depression, due to its potential to integrate theories of depression and psychophysiology.
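
    The majority-vote step can be sketched in a few lines. This is only an illustration of the idea, not the authors' engine: the function name, the label strings and the tie-breaking rule (needed only when an even number of leads is used) are assumptions.

```python
from collections import Counter


def majority_vote(lead_predictions, tie_break="normal"):
    """Assign a subject to the class predicted by most per-lead classifiers.

    lead_predictions: one predicted label per scalp lead.
    tie_break: label returned when two classes share the top count (assumed rule).
    """
    if not lead_predictions:
        raise ValueError("at least one lead prediction is required")
    counts = Counter(lead_predictions)
    best = max(counts.values())
    winners = [label for label, c in counts.items() if c == best]
    return winners[0] if len(winners) == 1 else tie_break


# Fifteen leads, nine of which vote "depressive".
votes = ["depressive"] * 9 + ["normal"] * 6
print(majority_vote(votes))  # depressive
```

    With 15 leads and two classes a tie cannot occur, so the tie-break only matters when a subset of leads with an even count is evaluated.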

  7. AFREET: HUMAN-INSPIRED SPATIO-SPECTRAL FEATURE CONSTRUCTION FOR IMAGE CLASSIFICATION WITH SUPPORT VECTOR MACHINES

    Energy Technology Data Exchange (ETDEWEB)

    S. PERKINS; N. HARVEY

    2001-02-01

    The authors examine the task of pixel-by-pixel classification of the multispectral and grayscale images typically found in remote-sensing and medical applications. Simple machine learning techniques have long been applied to remote-sensed image classification, but almost always using purely spectral information about each pixel. Humans can often outperform these systems, and make extensive use of spatial context to make classification decisions. They present AFREET: an SVM-based learning system which attempts to automatically construct and refine spatio-spectral features in a somewhat human-inspired fashion. Comparisons with traditionally used machine learning techniques show that AFREET achieves significantly higher performance. The use of spatial context is particularly useful for medical imagery, where multispectral images are still rare.

  8. Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification.

    Science.gov (United States)

    Gaonkar, Bilwaj; Davatzikos, Christos

    2013-09-01

    Multivariate pattern analysis (MVPA) methods such as support vector machines (SVMs) have been increasingly applied to fMRI and sMRI analyses, enabling the detection of distinctive imaging patterns. However, identifying brain regions that significantly contribute to the classification/group separation requires computationally expensive permutation testing. In this paper we show that the results of SVM-permutation testing can be analytically approximated. This approximation leads to more than a thousandfold speedup of the permutation testing procedure, thereby rendering it feasible to perform such tests on standard computers. The speedup achieved makes SVM based group difference analysis competitive with standard univariate group difference analysis methods.

  9. A fuzzy based feature selection from independent component subspace for machine learning classification of microarray data

    Directory of Open Access Journals (Sweden)

    Rabia Aziz

    2016-06-01

    Full Text Available Feature (gene) selection and classification of microarray data are two of the most interesting machine learning challenges. In the present work, two existing feature selection/extraction algorithms, namely independent component analysis (ICA) and fuzzy backward feature elimination (FBFE), are used together as a new combination of selection and extraction. The main objective of this paper is to select the independent components of DNA microarray data using FBFE in order to improve the performance of support vector machine (SVM) and Naïve Bayes (NB) classifiers while keeping the computational expense affordable. To show the validity of the proposed method, it is applied to reduce the number of genes for five DNA microarray datasets, namely colon cancer, acute leukemia, prostate cancer, lung cancer II, and high-grade glioma. These datasets are then classified using SVM and NB classifiers. Experimental results on these five microarray datasets demonstrate that genes selected by the proposed approach effectively improve the performance of SVM and NB classifiers in terms of classification accuracy. We compare our proposed method with principal component analysis (PCA) as a standard extraction algorithm and find that the proposed method obtains better classification accuracy, using SVM and NB classifiers, with a smaller number of selected genes than PCA. For each dataset, the curve of average error rate against number of genes indicates how many genes are required for the highest accuracy with the proposed method for both classifiers. ROC analysis shows the best subset of genes for both classifiers on the different datasets with the proposed method.

  10. Hyperspectral image preprocessing with bilateral filter for improving the classification accuracy of support vector machines

    Science.gov (United States)

    Sahadevan, Anand S.; Routray, Aurobinda; Das, Bhabani S.; Ahmad, Saquib

    2016-04-01

    Bilateral filter (BF) theory is applied to integrate spatial contextual information into the spectral domain for improving the accuracy of the support vector machine (SVM) classifier. The proposed classification framework is a two-stage process. First, an edge-preserved smoothing is carried out on a hyperspectral image (HSI). Then, the SVM multiclass classifier is applied on the smoothed HSI. One of the advantages of the BF-based implementation is that it considers the spatial as well as spectral closeness for smoothing the HSI. Therefore, the proposed method provides better smoothing in the homogeneous region and preserves the image details, which in turn improves the separability between the classes. The performance of the proposed method is tested using benchmark HSIs obtained from the airborne-visible-infrared-imaging-spectrometer (AVIRIS) and the reflective-optics-system-imaging-spectrometer (ROSIS) sensors. Experimental results demonstrate the effectiveness of the edge-preserved filtering in the classification of the HSI. Average accuracies (with 10% training samples) of the proposed classification framework are 99.04%, 98.11%, and 96.42% for AVIRIS-Salinas, ROSIS-Pavia University, and AVIRIS-Indian Pines images, respectively. Since the proposed method follows a combination of BF and the SVM formulations, it will be quite simple and practical to implement in real applications.

  11. Support Vector Machine-Based Human Behavior Classification in Crowd through Projection and Star Skeletonization

    Directory of Open Access Journals (Sweden)

    Yogameena, B.

    2010-01-01

    Full Text Available Problem statement: Detecting abnormal behavior of individuals in a crowd has become a critical problem in the event of terror strikes. This study presented a real-time video surveillance system which classifies normal and abnormal behaviors in crowds. The aim of this research was to provide a system which can aid in monitoring crowded urban environments. Approach: The proposed behaviour classification used projection to separate individuals and star skeletonization to obtain features such as body posture and cyclic motion cues. Using these cues, the Support Vector Machine (SVM) classified normal and abnormal human behaviors. Results: Experimental results demonstrated that the proposed method was robust and efficient in classifying normal and abnormal human behaviors. A comparative study of classification accuracy between principal component analysis and Support Vector Machine (SVM) classification was also presented. Conclusion: The proposed method classified behaviors such as people running in a crowded environment, a person bending down while most are walking or standing, a person carrying a long bar, and a person waving a hand in the crowd.

  12. Classification of Sets using Restricted Boltzmann Machines

    CERN Document Server

    Louradour, Jérôme

    2011-01-01

    We consider the problem of classification when inputs correspond to sets of vectors. This setting occurs in many problems such as the classification of pieces of mail containing several pages, of web sites with several sections or of images that have been pre-segmented into smaller regions. We propose generalizations of the restricted Boltzmann machine (RBM) that are appropriate in this context and explore how to incorporate different assumptions about the relationship between the input sets and the target class within the RBM. In experiments on standard multiple-instance learning datasets, we demonstrate the competitiveness of approaches based on RBMs and apply the proposed variants to the problem of incoming mail classification.

  13. Influence of Bias b on Generalization Ability of SVM for Classification

    Institute of Scientific and Technical Information of China (English)

    丁晓剑; 赵银亮

    2011-01-01

    Poggio pointed out that the bias term b in the support vector machine (SVM) serves to guarantee the positive definiteness of the kernel, and that b is not needed when the kernel used is positive definite. To examine the influence of b on the generalization ability of SVM for classification, the optimization problem of SVM without b is studied and a corresponding active-set solution algorithm is given. Experiments on the XOR classification problem show that the constraint $\sum_{i=1}^{N} y_i \alpha_i = 0$ can prevent the SVM from reaching the optimal separating hyperplane. The benchmark data sets used in the experiments include small to medium data sets, large-scale data sets, high-dimensional data sets and multi-class classification data sets, with the Gaussian and polynomial positive definite kernels as kernel functions. Experimental results on 26 benchmark data sets show that SVM without b has a lower computational cost and better generalization performance than SVM. Parameter sensitivity tests show that SVM without b is less sensitive to changes in the cost parameter, so that it reaches its best test accuracy with fewer parameter value pairs.
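
    For readers wondering where the equality constraint comes from, the following sketch uses the standard soft-margin SVM formulation. The notation (C, the slack variables ξ_i, the multipliers μ_i, the feature map φ and the kernel K) is the usual textbook one and is assumed here rather than taken from the paper. The constraint is the stationarity condition of the Lagrangian with respect to b:

```latex
\begin{aligned}
L(w,b,\xi,\alpha,\mu) &= \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i
  - \sum_{i=1}^{N}\alpha_i\bigl[\,y_i\bigl(w^{\top}\phi(x_i)+b\bigr) - 1 + \xi_i\bigr]
  - \sum_{i=1}^{N}\mu_i\xi_i, \\
\frac{\partial L}{\partial b} &= -\sum_{i=1}^{N} y_i\alpha_i = 0
  \quad\Longrightarrow\quad \sum_{i=1}^{N} y_i\alpha_i = 0 .
\end{aligned}
```

    If b is dropped from the decision function, this condition disappears and the dual reduces to a problem with box constraints only:

```latex
\max_{\alpha}\;\sum_{i=1}^{N}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j\,y_i y_j\,K(x_i,x_j)
\qquad \text{subject to}\quad 0 \le \alpha_i \le C,\quad i = 1,\dots,N .
```

    Box-constrained quadratic programs of this kind are a natural fit for active-set solvers, which is consistent with the algorithm the abstract describes.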

  14. Performance and optimization of support vector machines in high-energy physics classification problems

    CERN Document Server

    Sahin, Mehmet Özgür; Melzer-Pellmann, Isabell-Alissandra

    2016-01-01

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications. A new C++ LIBSVM interface called SVM-HINT is developed and available on GitHub.

  15. Performance and optimization of support vector machines in high-energy physics classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Sahin, M.Oe.; Kruecker, D.; Melzer-Pellmann, I.A.

    2016-01-15

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications. A new C++ LIBSVM interface called SVM-HINT is developed and available on Github.

  16. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n^3) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  17. Towards automatic lithological classification from remote sensing data using support vector machines

    Science.gov (United States)

    Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael

    2010-05-01

    Remote sensing data can be effectively used as a means to build geological knowledge for poorly mapped terrains. Spectral remote sensing data from space- and air-borne sensors have been widely used for geological mapping, especially in areas of high outcrop density in arid regions. However, spectral remote sensing information by itself cannot be efficiently used for a comprehensive lithological classification of an area, for two reasons: (1) the diagnostic spectral response of a rock within an image pixel is conditioned by several factors, including atmospheric effects, the spectral and spatial resolution of the image, sub-pixel level heterogeneity in the chemical and mineralogical composition of the rock, and the presence of soil and vegetation cover; (2) it captures only surface information and is therefore highly sensitive to noise due to weathering, soil cover, and vegetation. Consequently, for efficient lithological classification, spectral remote sensing data need to be supplemented with other remote sensing datasets that provide geomorphological and subsurface geological information, such as digital topographic model (DEM) and aeromagnetic data. Each of the datasets contains significant information about geology that, in conjunction, can potentially be used for automated lithological classification using supervised machine learning algorithms. In this study, support vector machine (SVM), which is a kernel-based supervised learning method, was applied to automated lithological classification of a study area in northwestern India using remote sensing data, namely, ASTER, DEM and aeromagnetic data. Several digital image processing techniques were used to produce derivative datasets that contained enhanced information relevant to lithological discrimination. A series of SVMs (trained using k-fold cross-validation with grid search) were tested using various combinations of input datasets selected from among 50 datasets including the original 14 ASTER bands and 36 derivative datasets (including 14

  18. An SVM-Based Classifier for Estimating the State of Various Rotating Components in Agro-Industrial Machinery with a Vibration Signal Acquired from a Single Point on the Machine Chassis

    Directory of Open Access Journals (Sweden)

    Ruben Ruiz-Gonzalez

    2014-11-01

    Full Text Available The goal of this article is to assess the feasibility of estimating the state of various rotating components in agro-industrial machinery by employing just one vibration signal acquired from a single point on the machine chassis. To do so, a Support Vector Machine (SVM)-based system is employed. Experimental tests evaluated this system by acquiring vibration data from a single point of an agricultural harvester, while varying several of its working conditions. The whole process included two major steps. Initially, the vibration data were preprocessed through twelve feature extraction algorithms, after which the Exhaustive Search method selected the most suitable features. Secondly, the SVM-based system accuracy was evaluated by using Leave-One-Out cross-validation, with the selected features as the input data. The results of this study provide evidence that (i) accurate estimation of the status of various rotating components in agro-industrial machinery is possible by processing the vibration signal acquired from a single point on the machine structure; (ii) the vibration signal can be acquired with a uniaxial accelerometer, the orientation of which does not significantly affect the classification accuracy; and (iii) when using an SVM classifier, an 85% mean cross-validation accuracy can be reached, which only requires a maximum of seven features as its input, and no significant improvements are noted between the use of either nonlinear or linear kernels.
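
    Leave-One-Out cross-validation, as used in the evaluation step above, trains on all samples but one and tests on the held-out sample, repeating this for every sample. The sketch below shows the procedure only. To stay dependency-free it uses a nearest-centroid classifier as a stand-in, not an SVM, and the feature values and labels are invented for illustration.

```python
def fit_centroids(X, y):
    """Stand-in classifier (not an SVM): store the mean feature vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda lbl: sum((a - b) ** 2 for a, b in zip(model[lbl], x)))


def leave_one_out_accuracy(X, y, fit, predict):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


# Invented two-feature vibration descriptors for two machine states.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["ok", "ok", "ok", "fault", "fault", "fault"]
print(leave_one_out_accuracy(X, y, fit_centroids, predict_centroid))
```

    Because `fit` and `predict` are passed in as functions, any classifier with the same two-call interface, including an SVM from a library of choice, can be substituted without changing the validation loop.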

  19. Review on Feature Selection Techniques and the Impact of SVM for Cancer Classification using Gene Expression Profile

    CERN Document Server

    George, G Victo Sudha; 10.5121/ijcses.2011.2302

    2011-01-01

    The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray-data-based cancer classification and also the predominant role of SVM for cancer classification.

  20. Classification of EEG data using FHT and SVM based on Bayesian Network

    Directory of Open Access Journals (Sweden)

    V. Baby Deepa

    2011-09-01

    Full Text Available Brain Computer Interface (BCI) enables the capturing and processing of motor imagery related brain signals which can be interpreted by computers. BCI systems capture the motor imagery signals via Electroencephalogram or Electrocorticogram. The processing of the signal is usually attempted by extracting feature vectors in the frequency domain and using classification algorithms to interpret the motor imagery action. In this paper we investigate the motor imagery signals obtained from BCI competition dataset IVA using the Fast Hartley Transform (FHT) for feature vector extraction and feature reduction using support vector machine. The processed data is trained and classified using the Bayes Net.

  1. Applying Machine Learning to Star Cluster Classification

    Science.gov (United States)

    Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar

    2016-01-01

    Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time consuming, thus it creates a bottleneck, and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) through applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data is HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.

  2. Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification

    OpenAIRE

    Gaonkar, Bilwaj; Davatzikos, Christos

    2013-01-01

    Multivariate pattern analysis (MVPA) methods such as support vector machines (SVMs) have been increasingly applied to fMRI and sMRI analyses, enabling the detection of distinctive imaging patterns. However, identifying brain regions that significantly contribute to the classification/group separation requires computationally expensive permutation testing. In this paper we show that the results of SVM-permutation testing can be analytically approximated. This approximation leads to more than a...

  3. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  4. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  5. Tackling EEG signal classification with least squares support vector machines: a sensitivity analysis study.

    Science.gov (United States)

    Lima, Clodoaldo A M; Coelho, André L V; Eisencraft, Marcio

    2010-08-01

    The electroencephalogram (EEG) signal captures the electrical activity of the brain and is an important source of information for studying neurological disorders. The proper analysis of this biological signal plays an important role in the domain of brain-computer interface, which aims at the construction of communication channels between human brain and computers. In this paper, we investigate the application of least squares support vector machines (LS-SVM) to the task of epilepsy diagnosis through automatic EEG signal classification. More specifically, we present a sensitivity analysis study by means of which the performance levels exhibited by standard and least squares SVM classifiers are contrasted, taking into account the setting of the kernel function and of its parameter value. Results of experiments conducted over different types of features extracted from a benchmark EEG signal dataset evidence that the sensitivity profiles of the kernel machines are qualitatively similar, both showing notable performance in terms of accuracy and generalization. In addition, the performance accomplished by optimally configured LS-SVM models is also quantitatively contrasted with that obtained by related approaches for the same dataset. Copyright 2010 Elsevier Ltd. All rights reserved.

  6. Automatic classification of athletes with residual functional deficits following concussion by means of EEG signal using support vector machine.

    Science.gov (United States)

    Cao, Cheng; Tutwiler, Richard Laurence; Slobounov, Semyon

    2008-08-01

    There is a growing body of knowledge indicating long-lasting residual electroencephalography (EEG) abnormalities in concussed athletes that may persist up to 10-year postinjury. Most often, these abnormalities are initially overlooked using traditional concussion assessment tools. Accordingly, premature return to sport participation may lead to recurrent episodes of concussion, increasing the risk of recurrent concussions with more severe consequences. Sixty-one athletes at high risk for concussion (i.e., collegiate rugby and football players) were recruited and underwent EEG baseline assessment. Thirty of these athletes suffered from concussion and were retested at day 30 postinjury. A number of task-related EEG recordings were conducted. A novel classification algorithm, the support vector machine (SVM), was applied as a classifier to identify residual functional abnormalities in athletes suffering from concussion using a multichannel EEG data set. The total accuracy of the classifier using the 10 features was 77.1%. The classifier has a high sensitivity of 96.7% (linear SVM), 80.0% (nonlinear SVM), and a relatively lower but acceptable selectivity of 69.1% (linear SVM) and 75.0% (nonlinear SVM). The major findings of this report are as follows: 1) discriminative features were observed at theta, alpha, and beta frequency bands, 2) the minimal redundancy relevance method was identified as being superior to the univariate t-test method in selecting features for the model calculation, 3) the EEG features selected for the classification model are linked to temporal and occipital areas, and 4) postural parameters influence EEG data set and can be used as discriminative features for the classification model. Overall, this report provides sufficient evidence that 10 EEG features selected for final analysis and SVM may be potentially used in clinical practice for automatic classification of athletes with residual brain functional abnormalities following a concussion.
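
    The sensitivity, selectivity, and accuracy figures quoted above are ratios of confusion-matrix counts. The sketch below shows how such figures are typically computed; it assumes that "selectivity" is used in the sense of specificity, and the counts are hypothetical values chosen for illustration, not data from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of truly positive cases that were detected
    specificity = tn / (tn + fp)  # share of truly negative cases that were cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases labelled correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only; these are not the study's data.
sens, spec, acc = binary_metrics(tp=29, fn=1, tn=22, fp=9)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

    A classifier can combine high sensitivity with lower specificity, as in the abstract, when it rarely misses positive cases but flags some negative cases as positive.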

  7. Thermo-refrigerating machineries. Classification; Machines thermofrigorifiques. Classification

    Energy Technology Data Exchange (ETDEWEB)

    Duminil, M. [Association Francaise du Froid (AFF), 75 - Paris (France)]

    2002-07-01

    Thermo-refrigerating systems transfer the heat extracted from a cold source towards a heat source and consume thermal energy from a third source. This article proposes a classification of thermo-refrigerating systems in three categories: the systems with a changing state working fluid (physical change of the refrigerant: dissociable systems, integrated systems (ejection systems, sorption systems); chemical change of the refrigerant), the systems where the working fluid stays in the same physical state (dissociable systems (Brayton, Siemens, Stirling and Ericsson cycles), integrated systems (Vuilleumier cycle systems, thermochemical systems)) and the other systems (Seebeck thermoelectric generator with Peltier effect modules). Dissociable thermo-refrigerating systems are made of the grouping of two separate thermal machines: a thermal engine and a mechanical-refrigerating machine. (J.S.)

  8. A comparative study of surface EMG classification by fuzzy relevance vector machine and fuzzy support vector machine.

    Science.gov (United States)

    Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei

    2015-02-01

    We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and root mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst the training time of FSVM was much shorter than that of FRVM. The results indicate that an FRVM classifier trained using sufficient samples can achieve generalization capability comparable to that of FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.

  9. The VIMOS Public Extragalactic Redshift Survey (VIPERS). A Support Vector Machine classification of galaxies, stars and AGNs

    CERN Document Server

    Malek, K; Pollo, A; Fritz, A; Garilli, B; Scodeggio, M; Iovino, A; Granett, B R; Abbas, U; Adami, C; Arnouts, S; Bel, J; Bolzonella, M; Bottini, D; Branchini, E; Cappi, A; Coupon, J; Cucciati, O; Davidzon, I; De Lucia, G; de la Torre, S; Franzetti, P; Fumana, M; Guzzo, L; Ilbert, O; Krywult, J; Brun, V Le; Fevre, O Le; Maccagni, D; Marulli, F; McCracken, H J; Paioro, L; Polletta, M; Schlagenhaufer, H; Tasca, L A M; Tojeiro, R; Vergani, D; Zanichelli, A; Burden, A; Di Porto, C; Marchetti, A; Marinoni, C; Mellier, Y; Moscardini, L; Nichol, R C; Peacock, J A; Percival, W J; Phleps, S; Wolk, M; Zamorani, G

    2013-01-01

    The aim of this work is to develop a comprehensive method for classifying sources in large sky surveys and we apply the techniques to the VIMOS Public Extragalactic Redshift Survey (VIPERS). Using the optical (u*, g', r', i') and NIR data (z', Ks), we develop a classifier for identifying stars, AGNs and galaxies improving the purity of the VIPERS sample. Support Vector Machine (SVM) supervised learning algorithms allow the automatic classification of objects into two or more classes based on a multidimensional parameter space. In this work, we tailored the SVM for classifying stars, AGNs and galaxies, and applied this classification to the VIPERS data. We train the SVM using spectroscopically confirmed sources from the VIPERS and VVDS surveys. We tested two SVM classifiers and concluded that including NIR data can significantly improve the efficiency of the classifier. The self-check of the best optical + NIR classifier has shown a 97% accuracy in the classification of galaxies, 97% for stars, and 95% for AGNs ...

  10. Ensemble SVM classification for large scale data sets

    Institute of Scientific and Technical Information of China (English)

    邝神芬; 李银

    2011-01-01

    To improve the training efficiency and detection precision of the SVM on large-scale data sets, the training data are preprocessed and then clustered without supervision. Training vectors that are useful to the SVM are selected according to certain rules, and an enhanced AdaBoost algorithm is incorporated to strengthen the SVM's classification and generalization ability on large-scale data. Finally, experiments on the KDD Cup 99 data set verify the performance of the algorithm.

  11. Clustering Categories in Support Vector Machines

    DEFF Research Database (Denmark)

    Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero

    2017-01-01

    The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in in...

  12. Robust automated detection of microstructural white matter degeneration in Alzheimer's disease using machine learning classification of multicenter DTI data.

    Directory of Open Access Journals (Sweden)

    Martin Dyrba

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that

  13. Robust automated detection of microstructural white matter degeneration in Alzheimer's disease using machine learning classification of multicenter DTI data.

    Science.gov (United States)

    Dyrba, Martin; Ewers, Michael; Wegrzyn, Martin; Kilimann, Ingo; Plant, Claudia; Oswald, Annahita; Meindl, Thomas; Pievani, Michela; Bokde, Arun L W; Fellgiebel, Andreas; Filippi, Massimo; Hampel, Harald; Klöppel, Stefan; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J

    2013-01-01

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer's disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was
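
    The scanner-specific cross-validation described above holds out all subjects from one scanner per fold. A minimal sketch of that splitting scheme follows, together with one plausible reading of "mean correction" (subtracting each scanner's own mean from its feature values). The abstract does not define the correction, so that function is an assumption; the scanner labels and FA values are hypothetical.

```python
from collections import defaultdict


def leave_one_scanner_out(scanner_ids):
    """Yield (held_out_scanner, train_indices, test_indices), one fold per scanner."""
    by_scanner = defaultdict(list)
    for index, scanner in enumerate(scanner_ids):
        by_scanner[scanner].append(index)
    for held_out in sorted(by_scanner):
        test_idx = by_scanner[held_out]
        train_idx = sorted(
            i for scanner, idxs in by_scanner.items() if scanner != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def mean_correct(values, scanner_ids):
    """Subtract each scanner's mean from its values (assumed form of mean correction)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, scanner in zip(values, scanner_ids):
        sums[scanner] += value
        counts[scanner] += 1
    return [v - sums[s] / counts[s] for v, s in zip(values, scanner_ids)]


# Hypothetical scanner labels and FA values, for illustration only.
scanners = ["A", "A", "B", "B", "C"]
fa_values = [0.40, 0.44, 0.50, 0.54, 0.47]
folds = list(leave_one_scanner_out(scanners))
corrected = mean_correct(fa_values, scanners)
```

    Because the test scanner never contributes training data, this scheme estimates performance on a scanner the classifier has not seen, which is the harder setting the abstract contrasts with pooled cross-validation.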

  14. Color Image Classification Using Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    冯霞

    2003-01-01

    An efficient method using various histogram-based (high-dimensional) image content descriptors for automatically classifying general color photos into relevant categories is presented. Principal component analysis (PCA) is used to project the original high-dimensional histograms onto their eigenspaces. Lower dimensional eigenfeatures are then used to train support vector machines (SVMs) to classify images into their categories. Experimental results show that even though different descriptors perform differently, they are all highly redundant. It is shown that the dimensionality of all these descriptors, regardless of their performances, can be significantly reduced without affecting classification accuracy. Such a scheme would be useful in an interactive setting for relevance feedback in content-based image retrieval, where low dimensional content descriptors will enable fast online learning and reclassification of results.

  15. A Protein Classification Benchmark collection for machine learning

    NARCIS (Netherlands)

    Sonego, P.; Pacurar, M.; Dhir, S.; Kertész-Farkas, A.; Kocsor, A.; Gáspári, Z.; Leunissen, J.A.M.; Pongor, S.

    2007-01-01

    Protein classification by machine learning algorithms is now widely used in structural and functional annotation of proteins. The Protein Classification Benchmark collection (http://hydra.icgeb.trieste.it/benchmark) was created in order to provide standard datasets on which the performance of machin

  16. Classification of Brain Tumor Using Support Vector Machine Classifiers

    Directory of Open Access Journals (Sweden)

    Dr.D. J. Pete

    2014-03-01

    Magnetic resonance imaging (MRI) is an imaging technique that has played an important role in neuroscience research for studying brain images. Classification is an important step in distinguishing between normal patients and those who may have abnormalities or a tumor. The proposed method consists of two stages: feature extraction and classification. In the first stage, features are extracted from images using GLCM. In the next stage, the extracted features are fed as input to a kernel-based SVM classifier, which classifies the images as normal or abnormal and assigns a tumor grade depending on the features. For brain MRI images, features extracted with GLCM give 98% accuracy with the kernel-based SVM classifier. The software used is MATLAB R2011a.

  17. A “Salt and Pepper” Noise Reduction Scheme for Digital Images Based on Support Vector Machines Classification and Regression

    Directory of Open Access Journals (Sweden)

    Hilario Gómez-Moreno

    2014-01-01

    We present a new impulse noise removal technique based on Support Vector Machines (SVM). Both classification and regression were used to reduce the “salt and pepper” noise found in digital images. Classification enables identification of noisy pixels, while regression provides a means to determine reconstruction values. The training vectors necessary for the SVM were generated synthetically in order to maintain control over quality and complexity. A modified median filter based on a previous noise detection stage and a regression-based filter are presented and compared to other well-known state-of-the-art noise reduction algorithms. The results show that the filters proposed achieved good results, outperforming other state-of-the-art algorithms for low and medium noise ratios, and were comparable for very highly corrupted images.
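
    The idea of a median filter that acts only on pixels flagged by a prior noise detection stage can be sketched as follows. This is not the authors' filter: the SVM detector is replaced here by a simple extreme-value test (a stand-in assumption), and the function names and the 3x3 window are illustrative choices.

```python
from statistics import median_low


def is_noisy(value, low=0, high=255):
    """Stand-in detector: flag extreme values. The paper uses an SVM classifier here."""
    return value == low or value == high


def switching_median_filter(image):
    """Replace only flagged pixels with the median of their unflagged 3x3 neighbours."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if not is_noisy(image[r][c]):
                continue  # pixels judged clean pass through unchanged
            clean = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and not is_noisy(image[rr][cc])
            ]
            if clean:  # leave the pixel as-is if every neighbour is also flagged
                out[r][c] = median_low(clean)
    return out


# Tiny hypothetical grayscale patch with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [10, 12, 11],
    [13, 255, 12],
    [11, 0, 10],
]
restored = switching_median_filter(noisy)
```

    Restricting the filter to flagged pixels avoids blurring clean regions, which is the motivation for placing a detection stage before the median step.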

  18. Breast tumor classification in ultrasound images using support vector machines and neural networks

    Directory of Open Access Journals (Sweden)

    Carmina Dessana Lima Nascimento

    Introduction: The use of tools for computer-aided diagnosis (CAD) has been proposed for detection and classification of breast cancer. Concerning breast cancer image diagnosing with ultrasound, some results found in literature show that morphological features perform better than texture features for lesions differentiation, and indicate that a reduced set of features performs better than a larger one. Methods: This study evaluated the performance of support vector machines (SVM) with different kernels combinations, and neural networks with different stop criteria, for classifying breast cancer nodules. Twenty-two morphological features from the contour of 100 BUS images were used as input for classifiers and then a scalar feature selection technique with correlation was used to reduce the features dataset. Results: The best results obtained for accuracy and area under ROC curve were 96.98% and 0.980, respectively, both with neural networks using the whole set of features. Conclusion: The performance obtained with neural networks with the selected stop criterion was better than the ones obtained with SVM. Whilst using neural networks the results were better with all 22 features, SVM classifiers performed better with a reduced set of 6 features.

  19. Recurrence quantification analysis and support vector machines for golf handicap and low back pain EMG classification.

    Science.gov (United States)

    Silva, Luís; Vaz, João Rocha; Castro, Maria António; Serranho, Pedro; Cabri, Jan; Pezarat-Correia, Pedro

    2015-08-01

    The quantification of non-linear characteristics of electromyography (EMG) must contain information allowing to discriminate neuromuscular strategies during dynamic skills. There is a lack of studies about muscle coordination under motor constraints during dynamic contractions. In golf, both handicap (Hc) and low back pain (LBP) are the main factors associated with the occurrence of injuries. The aim of this study was to analyze the accuracy of support vector machines (SVM) on EMG-based classification to discriminate Hc (low and high handicap) and LBP (with and without LBP) in the main phases of golf swing. For this purpose recurrence quantification analysis (RQA) features of the trunk and the lower limb muscles were used to feed a SVM classifier. Recurrence rate (RR) and the ratio between determinism (DET) and RR showed a high discriminant power. The Hc accuracy for the swing, backswing, and downswing were 94.4±2.7%, 97.1±2.3%, and 95.3±2.6%, respectively. For LBP, the accuracy was 96.9±3.8% for the swing, and 99.7±0.4% in the backswing. External oblique (EO), biceps femoris (BF), semitendinosus (ST) and rectus femoris (RF) showed high accuracy depending on the laterality within the phase. RQA features and SVM showed a high muscle discriminant capacity within swing phases by Hc and by LBP. Low back pain golfers showed different neuromuscular coordination strategies when compared with asymptomatic golfers. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Improving the performance of extreme learning machine for hyperspectral image classification

    Science.gov (United States)

    Li, Jiaojiao; Du, Qian; Li, Wei; Li, Yunsong

    2015-05-01

    Extreme learning machine (ELM) and kernel ELM (KELM) can offer performance comparable to that of the standard powerful classifier, the support vector machine (SVM), but at much lower computational cost owing to an extremely simple training step. However, their performance may be sensitive to several parameters, such as the number of hidden neurons. An empirical linear relationship between the number of training samples and the number of hidden neurons is proposed. Such a relationship can be easily estimated with two small training sets and extended to large training sets so as to greatly reduce computational cost. Other parameters, such as the steepness parameter in the sigmoidal activation function and the regularization parameter in the KELM, are also investigated. The experimental results show that classification performance is sensitive to these parameters; fortunately, simple selections will result in suboptimal performance.
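
    Estimating a linear relationship from two small training sets and extending it to a larger one amounts to fitting a line through two points and extrapolating. The sketch below illustrates only that arithmetic; the sample counts and neuron counts are hypothetical, and how the best neuron count is found for each small set is outside the sketch.

```python
def fit_line(point_a, point_b):
    """Return (slope, intercept) of the line through two (n_train, n_hidden) points."""
    (n1, h1), (n2, h2) = point_a, point_b
    slope = (h2 - h1) / (n2 - n1)
    intercept = h1 - slope * n1
    return slope, intercept


def predict_hidden_neurons(n_train, slope, intercept):
    """Extrapolate the line to a larger training set, keeping at least one neuron."""
    return max(1, round(slope * n_train + intercept))


# Hypothetical pairs: the best-performing neuron count found on two small training sets.
slope, intercept = fit_line((200, 60), (400, 100))
estimate = predict_hidden_neurons(5000, slope, intercept)
```

    The saving comes from tuning the neuron count only on the two small sets; the large training set is then trained once, at the extrapolated setting.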

  1. Applying a Machine Learning Technique to Classification of Japanese Pressure Patterns

    Directory of Open Access Journals (Sweden)

    H Kimura

    2009-04-01

    In climate research, pressure patterns are often very important. When climatologists need to know the days of a specific pressure pattern, for example "low pressure in Western areas of Japan and high pressure in Eastern areas of Japan" (Japanese winter-type weather), they have to visually check a huge number of surface weather charts. To overcome this problem, we propose an automatic classification system using a support vector machine (SVM), which is a machine-learning method. We attempted to classify pressure patterns into two classes: "winter type" and "non-winter type". For both training datasets and test datasets, we used the JRA-25 dataset from 1981 to 2000. An experimental evaluation showed that our method obtained a greater than 0.8 F-measure. We noted that variations in results were based on differences in training datasets.

  2. New Fuzzy Support Vector Machine for the Class Imbalance Problem in Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Xiaoqing Gu

    2014-01-01

    In medical datasets classification, support vector machine (SVM) is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support vector machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show that FSVM-CIP achieves superior or comparable effectiveness.
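
    Assigning two misclassification costs to the two classes can be written as a cost-sensitive fuzzy SVM objective. The formulation below is a generic sketch of that idea only, not the FSVM-CIP objective itself: the manifold regularization term mentioned above is omitted, and the symbols (fuzzy memberships s_i, class costs C^+ and C^-, slack variables \xi_i, feature map \phi) are assumed notation.

```latex
\min_{w,\,b,\,\xi}\quad
  \frac{1}{2}\lVert w \rVert^{2}
  + C^{+} \sum_{i:\,y_i = +1} s_i\,\xi_i
  + C^{-} \sum_{i:\,y_i = -1} s_i\,\xi_i
\qquad \text{subject to}\quad
  y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\quad \xi_i \ge 0 .
```

    Choosing C^+ larger than C^- penalizes errors on the minority class more heavily, while small memberships s_i reduce the influence of samples suspected to be outliers or noise.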

  3. Performance of machine learning methods for classification tasks

    OpenAIRE

    B. Krithika; Dr. V. Ramalingam; Rajan, K

    2013-01-01

    In this paper, the performance of various machine learning methods on pattern classification and recognition tasks is examined. The proposed method for evaluating performance will be based on the feature representation, feature selection and setting model parameters. The nature of the data, the methods of feature extraction and feature representation are discussed. The results of the Machine Learning algorithms on the classification task are analysed. The performance of Machine Learning meth...

  4. Osteoporosis Recognition Based on Similarity Metric with SVM

    Directory of Open Access Journals (Sweden)

    Ke Zhou

    2016-06-01

    Purpose: To apply different classification techniques to osteoporotic bone tissue texture analysis and to explore the recognition rate of the different classification methods. Methods: Gray-level co-occurrence matrix (GLCM) and run-length matrix texture analysis were used to extract characteristic parameters from bone tissue slice images, and 4x and 10x microscope images were classified into two groups: the sham (SHAM) group and the ovariectomized (OVX) group. Results: The recognition rate of the metric-based support vector machine (SVM) classification algorithm was higher than that of the stand-alone metric, and the classification results were stable. Conclusion: The metric-based SVM classification algorithm achieved a high recognition rate in texture analysis of osteoporotic bone slices.

  5. [MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique].

    Science.gov (United States)

    Chen, Zhiru; Hong, Wenxue

    2016-02-01

    Considering the low prediction accuracy on positive samples and the poor overall classification caused by unbalanced sample data of MicroRNA (miRNA) targets, we propose in this paper a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm, an ensemble learning algorithm based on under-sampling. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates the abnormal negative samples with a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the integrated miRNA target classifier makes its prediction by combining multiple weak classifiers through a voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of the miRNA target classifier.

  6. [Integration of soft and hard classifications using linear spectral mixture model and support vector machines].

    Science.gov (United States)

    Hu, Tan-Gao; Pan, Yao-Zhong; Zhang, Jin-Shui; Li, Ling-Ling; Le, Li

    2011-02-01

    This paper presents a new method that integrates soft and hard classification. By analyzing the distribution of target objects in the image and calculating an adaptive threshold automatically, the image is divided into three regions: pure regions, non-target object regions, and mixed regions. For pure regions and non-target object regions, a hard classification method (support vector machine) is used to extract classified results quickly; for mixed regions, a soft classification method (selective endmember linear spectral mixture model) is used to extract the abundance of target objects. Finally, an integrated soft and hard classification map is generated. To evaluate its accuracy, the new method is compared with SVM and LSMM using an ALOS image. The RMSE of the new method is 0.203, and its total accuracy is 95.48%. Both overall accuracy and RMSE show that integrating hard and soft classification is more accurate than hard or soft classification alone. Experimental results show that the new method can effectively solve the mixed-pixel problem and clearly improve image classification accuracy.
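
    The soft-classification step can be illustrated with a minimal two-endmember linear mixture sketch. It assumes a pixel is modeled as a*target + (1 - a)*background with a in [0, 1]; the spectra are hypothetical values, and the selective-endmember procedure from the abstract is not reproduced.

```python
import math

def abundance(pixel, target, background):
    """Least-squares fraction of `target` in a two-endmember mixture, clipped to [0, 1]."""
    diff = [t - b for t, b in zip(target, background)]
    num = sum((p - b) * d for p, b, d in zip(pixel, background, diff))
    den = sum(d * d for d in diff)
    return min(1.0, max(0.0, num / den))

def rmse(estimated, reference):
    """Root-mean-square error between estimated and reference abundances."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / len(reference))

# Hypothetical three-band endmember spectra and a synthetic 30/70 mixed pixel.
target = [0.1, 0.4, 0.5]
background = [0.3, 0.2, 0.1]
pixel = [0.3 * t + 0.7 * b for t, b in zip(target, background)]
estimated_fraction = abundance(pixel, target, background)
```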

  7. A Study of BCI Signal Pattern Recognition by Using Quasi-Newton-SVM Method

    Institute of Scientific and Technical Information of China (English)

    YANG Chang-chun; MA Zheng-hua; SUN Yu-qiang; ZOU Ling

    2006-01-01

    The recognition of electroencephalogram (EEG) signals is the key to brain-computer interfaces (BCI). To address the low recognition rate obtained when EEG is classified with a support vector machine (SVM) in BCI, and based on the assumption that a well-defined physiological signal with a smooth form "hides" inside the noisy EEG signal, a Quasi-Newton-SVM recognition method based on the Quasi-Newton method and the SVM algorithm was presented. First, the EEG signals were preprocessed by the Quasi-Newton method to obtain signals suitable for SVM. Second, the preprocessed signals were classified by the SVM method. The simulation results indicated that the Quasi-Newton-SVM approach improved the recognition rate compared with the SVM method alone; we also discussed the relationship between the artificial smooth signals and the classification errors.

  8. Classification of juvenile myoclonic epilepsy data acquired through scanning electromyography with machine learning algorithms.

    Science.gov (United States)

    Goker, Imran; Osman, Onur; Ozekes, Serhat; Baslo, M Baris; Ertas, Mustafa; Ulgen, Yekta

    2012-10-01

    In this paper, classification of Juvenile Myoclonic Epilepsy (JME) patients and healthy volunteers included into Normal Control (NC) groups was established using Feed-Forward Neural Networks (NN), Support Vector Machines (SVM), Decision Trees (DT), and Naïve Bayes (NB) methods by utilizing the data obtained through the scanning EMG method used in a clinical study. An experimental setup was built for this purpose. 105 motor units were measured. 44 of them belonged to JME group consisting of 9 patients and 61 of them belonged to NC group comprising ten healthy volunteers. k-fold cross validation was applied to train and test the models. ROC curves were drawn for k values of 4, 6, 8 and 10. 100% of detection sensitivity was obtained for DT, NN, and NB classification methods. The lowest FP number, which was obtained by NN, was 5.
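
    The k-fold protocol described in this abstract can be sketched as below. Contiguous folds without shuffling, and the mapping of the 44 JME motor units to the positive class, are assumptions made for illustration; the abstract does not specify how the folds were drawn.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the classifier detects."""
    return true_positives / (true_positives + false_negatives)

# 105 motor units as in the abstract, split for k = 4.
folds = list(k_fold_indices(105, 4))
fold_sizes = [len(test) for _, test in folds]
```

    Each sample lands in exactly one test fold, so every motor unit is used once for testing and k - 1 times for training.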

  9. Investigation into machine learning algorithms as applied to motor cortex signals for classification of movement stages.

    Science.gov (United States)

    Hollingshead, Robert L; Putrino, David; Ghosh, Soumya; Tan, Tele

    2014-01-01

    Neuroinformatics has recently emerged as a powerful field for the statistical analysis of neural data. This study uses machine learning techniques to analyze neural spiking activities within a population of neurons with the aim of finding spiking patterns associated with different stages of movement. Neural data was recorded during many experimental trials of a cat performing a skilled reach and withdrawal task. Using Weka and the LibSVM classifier, movement stages of the skilled task were identified with a high degree of certainty achieving an area-under-curve (AUC) of the Receiver Operating Characteristic of between 0.900 and 0.997 for the combined data set. Through feature selection, the identification of significant neurons has been made easier. Given this encouraging classification performance, the extension to automatic classification and updating of control models for use with neural prostheses will enable regular adjustments capable of compensating for neural changes.

  10. Correlation technique and least square support vector machine combine for frequency domain based ECG beat classification.

    Science.gov (United States)

    Dutta, Saibal; Chatterjee, Amitava; Munshi, Sugata

    2010-12-01

    The present work proposes the development of an automated medical diagnostic tool that can classify ECG beats. This is considered an important problem as accurate, timely detection of cardiac arrhythmia can help to provide proper medical attention to cure/reduce the ailment. The proposed scheme utilizes a cross-correlation based approach where the cross-spectral density information in frequency domain is used to extract suitable features. A least square support vector machine (LS-SVM) classifier is developed utilizing the features so that the ECG beats are classified into three categories: normal beats, PVC beats and other beats. This three-class classification scheme is developed utilizing a small training dataset and tested with an enormous testing dataset to show the generalization capability of the scheme. The scheme, when employed for 40 files in the MIT/BIH arrhythmia database, could produce high classification accuracy in the range 95.51-96.12% and could outperform several competing algorithms.

  11. A linear-RBF multikernel SVM to classify big text corpora.

    Science.gov (United States)

    Romero, R; Iglesias, E L; Borrajo, L

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper shows a multikernel SVM to manage highly dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists in spreading the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.
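
    As a rough illustration of combining a linear and an RBF kernel, the sketch below forms a convex combination of the two, which is itself a valid kernel. The weight and gamma values are arbitrary assumptions, and the term-slice construction described in the abstract is not reproduced.

```python
import math

def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma):
    """Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_kernel(x, y, weight=0.5, gamma=0.1):
    """Convex combination of the linear and RBF kernels (weight in [0, 1])."""
    return weight * linear_kernel(x, y) + (1.0 - weight) * rbf_kernel(x, y, gamma)

# Hypothetical term-frequency vectors for two short documents.
doc_a = [1.0, 2.0]
doc_b = [2.0, 0.0]
gram = [[combined_kernel(u, v) for v in (doc_a, doc_b)] for u in (doc_a, doc_b)]
```

    A precomputed Gram matrix such as `gram` is the form in which a custom kernel is usually handed to an SVM solver.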

  12. Recording Equipment Classification Study Based on SVM

    Institute of Scientific and Technical Information of China (English)

    丛韫; 杜状状; 高冲红; 童茜雯; 郑义; 仲倩

    2016-01-01

    To determine which recording device produced a privately recorded audio file in audio forensics, this article presents an SVM-based classification method for recording equipment that starts from the compression algorithm. It relies on the fact that different devices use different compression algorithms, so their recorded signals carry individual characteristics that distinguish them from other recording devices. Audio in different recording formats is collected first. Improved MFCC cepstral parameters are then extracted from each audio file with MATLAB, and testing and training sets are selected. Cross-validation is used to obtain the optimal parameters for the cepstral data. The SVM is trained with the training set, and the resulting model predicts the classification labels of the testing set. Simulation and experimental results show that the method distinguishes the audio characteristics of different compression algorithms well, with an average recognition rate of 97%.

  13. A study on monitoring land use/cover change of mining area based on ticket-voting SVM classification

    Science.gov (United States)

    Lin, Yi; Yu, Jie; Ying, Min; Shen, Mingge

    2015-08-01

    Building on classification algorithms used to monitor spatio-temporal dynamic changes in coal-mining areas, this paper makes several improvements to the feature space and the classification model. The study has two innovations: 1) when building the feature space, a new index for extracting information about mining areas was created, which separates mining areas from settlements efficiently; 2) a special ticket-voting SVM algorithm with a wavelet kernel function was proposed, which provides higher classification accuracy than traditional classifiers via a secondary classification. The northeast plain of Pei County in Xuzhou City was taken as the study region, and the proposed method was applied to multi-temporal TM/ETM images from 1987 to 2013. Deeper analysis that combines the classification with various non-spatial data is of greater significance, so we then studied the rules of dynamic change in land use/cover and analyzed their driving factors by combining RS interpretation with GIS spatial analysis techniques. In this study, image recognition technology was applied to problems of environmental change in coal-mining areas. These explanations provide valuable support for recognizing and dealing with the conflicts between economic development and environmental protection in coal-mining areas.

  14. Development and evaluation of cost-sensitive universum-SVM.

    Science.gov (United States)

    Dhar, Sauptik; Cherkassky, Vladimir

    2015-04-01

    Many machine learning applications involve analysis of high-dimensional data, where the number of input features is larger than/comparable to the number of data samples. Standard classification methods may not be sufficient for such data, and this provides motivation for nonstandard learning settings. One such new learning methodology is called learning through contradiction or Universum-support vector machine (U-SVM). Recent studies have shown U-SVM to be quite effective for sparse high-dimensional data sets. However, all these earlier studies have used balanced data sets with equal misclassification costs. This paper extends the U-SVM formulation to problems with different misclassification costs, and presents practical conditions for the effectiveness of this cost-sensitive U-SVM. Several empirical comparisons are presented to validate the proposed approach.
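
    The idea of attaching different misclassification costs to a Universum SVM can be sketched through its empirical loss. The form below (class-dependent hinge loss on labeled samples plus an epsilon-insensitive penalty that keeps Universum samples near the decision boundary) follows the usual U-SVM objective; the parameter names are hypothetical, the regularization term is omitted, and this is not claimed to be the authors' exact formulation.

```python
def hinge(margin):
    """Standard hinge loss on the signed margin y * f(x)."""
    return max(0.0, 1.0 - margin)

def cost_sensitive_universum_loss(labeled, universum, c_pos, c_neg, c_univ, eps):
    """Empirical loss only.

    labeled:   (y, f(x)) pairs with y in {+1, -1}
    universum: decision values f(x) of Universum samples
    """
    loss = 0.0
    for y, score in labeled:
        cost = c_pos if y == 1 else c_neg  # class-dependent misclassification cost
        loss += cost * hinge(y * score)
    for score in universum:
        loss += c_univ * max(0.0, abs(score) - eps)  # penalize values far from the boundary
    return loss

# Hypothetical decision values.
labeled = [(1, 0.5), (-1, -2.0), (-1, 0.3)]
universum = [0.05, -0.4]
total = cost_sensitive_universum_loss(labeled, universum, c_pos=2.0, c_neg=1.0, c_univ=0.5, eps=0.1)
```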

  15. Multi-polarized HRRP classification by SVM and DS evidence theory

    Institute of Scientific and Technical Information of China (English)

    雷蕾; 王晓丹; 邢雅琼; 毕凯

    2013-01-01

    For radar target recognition based on high-resolution range profiles (HRRP), a multi-polarized HRRP classification approach named SDHRRP, which combines the support vector machine (SVM) and Dempster-Shafer (DS) evidence theory, is presented. The method obtains the distance between base classifiers from the confusion matrix and, according to each base classifier's ability to classify the different target classes, assigns it a corresponding confidence. This confidence value and the SVM posterior probability are integrated into the basic probability assignment (BPA) of DS evidence theory, which achieves an effective combination of SVM and DS evidence theory in target recognition. Experimental results on measured target data show that the BPA based on classifier confidence effectively avoids evidence conflict and that the SDHRRP method effectively reduces the error rate of fused classification.
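
    The fusion step in this abstract rests on Dempster's rule of combination, sketched below for two classifiers. Building each BPA by scaling the posteriors with a reliability value and assigning the remainder to the whole frame is a common discounting construction assumed here for illustration; the abstract does not give the authors' exact BPA formula, and the class names and numbers are hypothetical.

```python
def bpa_from_posteriors(posteriors, reliability):
    """Scale posteriors by a reliability in [0, 1]; leftover mass goes to the full frame."""
    frame = frozenset(posteriors)
    bpa = {frozenset([c]): reliability * p for c, p in posteriors.items()}
    bpa[frame] = bpa.get(frame, 0.0) + 1.0 - reliability
    return bpa

def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset hypotheses."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}, conflict

A, B, THETA = frozenset(["A"]), frozenset(["B"]), frozenset(["A", "B"])
m1 = {A: 0.6, B: 0.1, THETA: 0.3}  # hypothetical BPA from polarization channel 1
m2 = {A: 0.5, B: 0.2, THETA: 0.3}  # hypothetical BPA from polarization channel 2
fused, conflict = combine(m1, m2)
```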

  16. The efficacy of support vector machines (SVM) in robust determination of earthquake early warning magnitudes in central Japan

    Indian Academy of Sciences (India)

    Ramakrushna Reddy; Rajesh R Nair

    2013-10-01

    This work deals with a methodology applied to seismic early warning systems, which are designed to provide real-time estimation of the magnitude of an event. We reappraise the work of Simons et al. (2006), who on the basis of a wavelet approach predicted a magnitude error of ±1. We verify and improve upon the methodology of Simons et al. (2006) by applying an SVM statistical learning machine to the time-scale wavelet decomposition methods. We used the data of 108 events in central Japan with magnitudes ranging from 3 to 7.4 recorded at KiK-net network stations, for a source–receiver distance of up to 150 km during the period 1998–2011. We applied a wavelet transform to the seismogram data and calculated scale-dependent threshold wavelet coefficients. These coefficients were then classified into low-magnitude and high-magnitude events by constructing a maximum-margin hyperplane between the two classes, which forms the essence of SVMs. Further, the classified events from both classes were picked and linear regressions were plotted to determine the relationship between wavelet coefficient magnitude and earthquake magnitude, which in turn helped us to estimate the earthquake magnitude of an event given its threshold wavelet coefficient. At wavelet scale number 7, we predicted the earthquake magnitude of an event within 2.7 seconds. This means that a magnitude determination is available within 2.7 s after the initial onset of the P-wave. These results shed light on the application of SVM as a way to choose the optimal regression function to estimate the magnitude from a few seconds of an incoming seismogram. This would improve on the approach of Simons et al. (2006), which uses an average of the two regression functions to estimate the magnitude.

  17. Landslides Identification Using Airborne Laser Scanning Data Derived Topographic Terrain Attributes and Support Vector Machine Classification

    Science.gov (United States)

    Pawłuszek, Kamila; Borkowski, Andrzej

    2016-06-01

    Since high-resolution Airborne Laser Scanning (ALS) data became available, substantial progress has been made in geomorphological research, especially in landslide analysis. First- and second-order derivatives of the Digital Terrain Model (DTM) have become a popular and powerful tool in landslide inventory mapping. Nevertheless, automatic landslide mapping based on sophisticated classifiers including Support Vector Machine (SVM), Artificial Neural Network, or Random Forests is often computationally time consuming. The objective of this research is to explore in depth the topographic information provided by ALS data and to overcome the computational time limitation. For this reason, an extended set of topographic features was used, and Principal Component Analysis (PCA) was applied to reduce redundant information. The proposed approach was tested on a susceptible area affected by more than 50 landslides, located on Rożnów Lake in the Carpathian Mountains, Poland. The first seven PCA components, which hold 90% of the total variability in the original topographic attributes, were used for SVM classification. By comparing the classification results with the landslide inventory map, the average user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA) were calculated for two models. For the PCA-feature-reduced model, UA, PA, and OA were 72%, 76%, and 72%, respectively. For the non-reduced original topographic model, UA, PA, and OA were 74%, 77%, and 74%, respectively. Using the first seven PCA components instead of the twenty original topographic attributes does not significantly change identification accuracy but reduces computational time.
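
    The three accuracy measures reported in this abstract can be computed from a confusion matrix as sketched below. The 2x2 matrix holds hypothetical counts chosen for illustration, not the study's data.

```python
def accuracy_metrics(cm):
    """cm[i][j] = number of samples of reference class i assigned to class j.

    Returns overall accuracy, per-class producer's accuracy (row-wise),
    and per-class user's accuracy (column-wise).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    overall = sum(cm[i][i] for i in range(n)) / total
    producers = [cm[i][i] / sum(cm[i]) for i in range(n)]
    users = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    return overall, producers, users

# Hypothetical counts: rows = reference (landslide, non-landslide), columns = predicted.
cm = [[40, 10],
      [5, 45]]
overall, producers, users = accuracy_metrics(cm)
average_ua = sum(users) / len(users)
average_pa = sum(producers) / len(producers)
```

    Producer's accuracy measures omission errors (reference samples that were missed), while user's accuracy measures commission errors (mapped samples that are wrong).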

  18. Application of SVM in image classification

    Institute of Scientific and Technical Information of China (English)

    章智儒

    2009-01-01

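    The support vector machine (SVM) is a new machine learning technique. This paper builds a multi-class SVM classifier using the one-versus-one method. Image texture features are extracted with the commonly used gray-level co-occurrence matrix method, assembled into feature vectors, and fed into the constructed multi-class SVM classifier for classification. Classification experiments on four texture images selected from the Brodatz texture library achieved good classification results.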

  19. Model selection for SVM using mutative scale chaos optimization algorithm

    Institute of Scientific and Technical Information of China (English)

    刘清坤; 阙沛文; 费春国; 宋寿鹏

    2006-01-01

    This paper proposes a new search strategy using mutative scale chaos optimization algorithm (MSCO) for model selection of support vector machine (SVM). It searches the parameter space of SVM with a very high efficiency and finds the optimum parameter setting for a practical classification problem with very low time cost. To demonstrate the performance of the proposed method it is applied to model selection of SVM in ultrasonic flaw classification and compared with grid search for model selection. Experimental results show that MSCO is a very powerful tool for model selection of SVM, and outperforms grid search in search speed and precision in ultrasonic flaw classification.
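
    The general idea behind a mutative scale chaos search can be sketched as follows: candidate parameters are generated by logistic-map sequences, and the search ranges are repeatedly narrowed around the best point found. The toy objective stands in for a cross-validation error over two SVM hyperparameters; the seeds, stage counts, and shrink factor are assumptions, and this simplified variant is not claimed to match the authors' algorithm.

```python
def chaos_search(objective, bounds, stages=5, iters=200, shrink=0.5):
    """Sample with logistic-map sequences, then narrow each range around the best point."""
    original = list(bounds)
    state = [0.123 + 0.111 * i for i in range(len(bounds))]  # distinct chaotic seeds
    best_x, best_val = None, float("inf")
    history = []
    for _ in range(stages):
        for _ in range(iters):
            state = [4.0 * s * (1.0 - s) for s in state]  # logistic map with r = 4
            x = [lo + s * (hi - lo) for s, (lo, hi) in zip(state, bounds)]
            val = objective(x)
            if val < best_val:
                best_x, best_val = x, val
        history.append(best_val)
        # Mutative scale: shrink every range around the best point, clipped to the original box.
        bounds = [
            (max(olo, b - shrink * (hi - lo) / 2), min(ohi, b + shrink * (hi - lo) / 2))
            for b, (lo, hi), (olo, ohi) in zip(best_x, bounds, original)
        ]
    return best_x, best_val, history

def toy_error(params):
    """Hypothetical stand-in for cross-validation error, minimal at (3, -1)."""
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 1.0) ** 2

box = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_val, history = chaos_search(toy_error, box)
```

    A grid search over the same box evaluates a fixed lattice of points; the chaos search instead spends later evaluations in progressively smaller regions around the current best.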

  20. Analysis on Text Classification Algorithm Based on SVM-KNN

    Institute of Scientific and Technical Information of China (English)

    匡春临; 夏清强

    2010-01-01

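    The SVM and KNN text classification algorithms are examined in depth through experiments. Based on the KNN and SVM algorithms, an SVM-KNN algorithm is proposed. The algorithm combines the KNN and SVM classifiers and improves classifier performance through feedback and correction of the classification prediction probabilities. The practical effect of the SVM-KNN algorithm was tested, and its performance verified, on the CWT100G Chinese web page classification test system.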

  1. The Comparison of Spectral Classification Based on DBN, BP Neural Network and SVM

    Institute of Scientific and Technical Information of China (English)

    李俊峰; 汪月乐; 胡升; 何慧灵

    2016-01-01

    Stellar classification is an important research field for understanding the formation and evolution of stars and galaxies. With large sky surveys and the massive data they produce, fast and accurate automatic classification of celestial objects is very important. This paper compares the deep belief network (DBN), support vector machine (SVM), and BP neural network on stellar spectral data from SDSS DR9 and analyzes the applicability of the three methods to star classification. First, K- and F-type stars are classified with each of the three methods. Then the K1, K3, K5 subtypes and the F2, F5, F9 subtypes are identified separately. Finally, data that do not belong to the K subtypes are excluded by a secondary classification model based on SVM. The results show that the DBN classifies K- and F-type stars well but performs poorly on their subtypes; the SVM has a high recognition rate for K- and F-type stars and classifies the types better than the corresponding subtypes; and the BP neural network has an ordinary recognition rate for K- and F-type stars and their subtypes. The experiment also showed that the accuracy of excluding non-K-subtype data can reach 100%, which indicates that unknown spectral data can be screened and classified with SVM.

  2. Enhanced land use/cover classification of heterogeneous tropical landscapes using support vector machines and textural homogeneity

    Science.gov (United States)

    Paneque-Gálvez, Jaime; Mas, Jean-François; Moré, Gerard; Cristóbal, Jordi; Orta-Martínez, Martí; Luz, Ana Catarina; Guèze, Maximilien; Macía, Manuel J.; Reyes-García, Victoria

    2013-08-01

    Land use/cover classification is a key research field in remote sensing and land change science as thematic maps derived from remotely sensed data have become the basis for analyzing many socio-ecological issues. However, land use/cover classification remains a difficult task and it is especially challenging in heterogeneous tropical landscapes where nonetheless such maps are of great importance. The present study aims at establishing an efficient classification approach to accurately map all broad land use/cover classes in a large, heterogeneous tropical area, as a basis for further studies (e.g., land use/cover change, deforestation and forest degradation). Specifically, we first compare the performance of parametric (maximum likelihood), non-parametric (k-nearest neighbor and four different support vector machines - SVM), and hybrid (unsupervised-supervised) classifiers, using hard and soft (fuzzy) accuracy assessments. We then assess, using the maximum likelihood algorithm, what textural indices from the gray-level co-occurrence matrix lead to greater classification improvements at the spatial resolution of Landsat imagery (30 m), and rank them accordingly. Finally, we use the textural index that provides the most accurate classification results to evaluate whether its usefulness varies significantly with the classifier used. We classified imagery corresponding to dry and wet seasons and found that SVM classifiers outperformed all the rest. We also found that the use of some textural indices, but particularly homogeneity and entropy, can significantly improve classifications. We focused on the use of the homogeneity index, which has so far been neglected in land use/cover classification efforts, and found that this index along with reflectance bands significantly increased the overall accuracy of all the classifiers, but particularly of SVM. We observed that improvements in producer's and user's accuracies through the inclusion of homogeneity were different

  3. Deep Extreme Learning Machine and Its Application in EEG Classification

    Directory of Open Access Journals (Sweden)

    Shifei Ding

    2015-01-01

    Recently, deep learning has aroused wide interest in machine learning. Deep learning is a multilayer perceptron artificial neural network algorithm. It has the advantage of approximating complicated functions and alleviating the optimization difficulty associated with deep models. The multilayer extreme learning machine (MLELM) is a learning algorithm for artificial neural networks that takes advantage of both deep learning and the extreme learning machine. MLELM not only approximates complicated functions but also needs no iteration during training. In this paper, we combine MLELM with the kernel extreme learning machine (KELM) to put forward the deep extreme learning machine (DELM) and apply it to EEG classification. The paper focuses on the application of DELM to the classification of the visual feedback experiment, using MATLAB and the second brain-computer interface (BCI) competition datasets. Simulation and analysis of the experimental results confirm the effectiveness of DELM in EEG classification.

  4. Implementation of algorithms based on support vector machine (SVM) for electric systems: topic review

    Directory of Open Access Journals (Sweden)

    Jefferson Jara Estupiñan

    2016-06-01

    Objective: To review the implementation of algorithms based on support vector machines applied to electric systems. Method: A paper search was carried out mainly on Bibliographic Indexes (BI) and Bibliographic Bases with Selection Committee (BBSC) about support vector machines. This work gives a qualitative and/or quantitative description of advances and applications in the electrical field, covering topics such as electricity market prediction, demand prediction, non-technical losses (theft), alternative energy sources, and transformers, among others. Each work is cited in order to respect copyright and to let the reader move easily between the text and the cited works. Results: A detailed review is given, focused on algorithms implemented in electric systems and on innovative application areas. Conclusion: Support vector machines have many applications owing to their multiple benefits; however, in the electric energy area they have not been fully applied, which identifies a promising area of research.

  5. Feature Selection Method Based on Artificial Bee Colony Algorithm and Support Vector Machines for Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Mustafa Serter Uzer

    2013-01-01

    This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines for classification. The purpose of this paper is to test the effect of eliminating unimportant and obsolete features from the datasets on the success of classification with the SVM classifier. The approach was developed for the diagnosis of liver diseases and diabetes, which are commonly observed and reduce quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders, and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. These accuracies were obtained with the 10-fold cross-validation method. The results show that the method performs very well compared with other reported results and seems very promising for pattern recognition applications.

  6. [Hyperspectral image classification based on 3-D gabor filter and support vector machines].

    Science.gov (United States)

    Feng, Xiao; Xiao, Peng-feng; Li, Qi; Liu, Xiao-xi; Wu, Xiao-cui

    2014-08-01

    A three-dimensional Gabor filter was developed for classification of hyperspectral remote sensing images. The method is based on the characteristics of hyperspectral images and the principle of texture extraction with 2-D Gabor filters. A three-dimensional Gabor filter is able to filter all the bands of a hyperspectral image simultaneously, capturing the specific responses at different scales, orientations, and spectral-dependent properties from enormous image information, which greatly reduces the time consumed in hyperspectral image texture extraction and solves the overlay difficulties of filtered spectra. Using the designed three-dimensional Gabor filters at different scales and orientations, a Hyperion image covering a typical area of the Qilian Mountains was processed with full bands to obtain 26 Gabor texture features, and the spatial differences of the Gabor texture features corresponding to each land type were analyzed. On the basis of automatic subspace separation, the dimensions of the hyperspectral image were reduced by the band index (BI) method, which provides different band combinations for classification in order to search for the optimal degree of dimension reduction. Adding three-dimensional Gabor texture features successively according to their power to discriminate the given land types, supervised classification was carried out with the support vector machine (SVM) classifier. It is shown that the method using three-dimensional Gabor texture features and BI band selection based on automatic subspace separation can not only reduce dimensions, but also improve the accuracy and efficiency of hyperspectral image classification.

  7. CCH-based geometric algorithms for SVM and applications

    Institute of Scientific and Technical Information of China (English)

    Xin-jun PENG; Yi-fei WANG

    2009-01-01

    The support vector machine (SVM) is a novel machine learning tool in data mining. In this paper, a geometric approach based on the compressed convex hull (CCH), with a mathematical framework, is introduced to solve SVM classification problems. Compared with the reduced convex hull (RCH), CCH preserves the shape of geometric solids for data sets; meanwhile, it is easy to give the necessary and sufficient condition for determining its extreme points. As practical applications of CCH, sparse and probabilistic speed-up geometric algorithms are developed. Results of numerical experiments show that the proposed algorithms can reduce kernel calculations and perform well.

  8. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    Science.gov (United States)

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Unlike the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  9. Large-Scale Machine Learning for Classification and Search

    Science.gov (United States)

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  11. Epileptic EEG classification based on extreme learning machine and nonlinear features.

    Science.gov (United States)

    Yuan, Qi; Zhou, Weidong; Li, Shufang; Cai, Dongmei

    2011-09-01

    The automatic detection and classification of epileptic EEG are significant in the evaluation of patients with epilepsy. This paper presents a new EEG classification approach based on the extreme learning machine (ELM) and nonlinear dynamical features. The theory of nonlinear dynamics has been a powerful tool for understanding brain electrical activities. Nonlinear features extracted from EEG signals such as approximate entropy (ApEn), Hurst exponent and scaling exponent obtained with detrended fluctuation analysis (DFA) are employed to characterize interictal and ictal EEGs. The statistics indicate that the differences of those nonlinear features between interictal and ictal EEGs are statistically significant. The ELM algorithm is employed to train a single hidden layer feedforward neural network (SLFN) with EEG nonlinear features. The experiments demonstrate that compared with the backpropagation (BP) algorithm and support vector machine (SVM), the performance of the ELM is better in terms of training time and classification accuracy which achieves a satisfying recognition accuracy of 96.5% for interictal and ictal EEG signals.
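
    One of the nonlinear features named in this abstract, approximate entropy (ApEn), can be sketched directly from its standard definition. The embedding dimension m = 2 and the absolute tolerance r = 0.2 are assumed values (r is often set relative to the signal's standard deviation), and the short test series are synthetic.

```python
import math

def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r): regular signals give values near 0, irregular signals give larger values."""
    n = len(series)

    def phi(length):
        templates = [series[i:i + length] for i in range(n - length + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (the self-match keeps the log finite).
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)

constant = [1.0] * 20
alternating = [0.0, 1.0] * 10
apen_constant = approximate_entropy(constant)
apen_alternating = approximate_entropy(alternating)
```

    Values like these, computed per EEG segment, would form part of the feature vector passed to the classifier.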

  12. A multi-class SVM classification algorithm

    Institute of Scientific and Technical Information of China (English)

    2016-01-01

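    To use the support vector machine (SVM) algorithm for multi-class classification, this paper builds on binary SVM classification and proposes applying the idea of bubble sort, taken from sorting algorithms, to multi-class SVM data classification. Experiments with this method on selected UCI data sets show that, while maintaining a high accuracy, it considerably reduces classification time compared with the traditional one-versus-one multi-class approach, making it a highly practical multi-class SVM classification method.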
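    As a rough illustration of the bubble-sort idea described above, the following Python sketch makes a single bubble pass over the class labels, so that K classes need only K - 1 binary decisions per sample instead of the K(K - 1)/2 decisions of one-versus-one voting. The abstract gives no implementation details, so the single-pass scheme, the function names, and the nearest-centroid rule standing in for trained binary SVMs are all assumptions.

```python
from typing import Callable, Dict, Sequence, Tuple

Point = Tuple[float, float]


def bubble_pass_predict(
    x: Point,
    classes: Sequence[str],
    pairwise_winner: Callable[[Point, str, str], str],
) -> Tuple[str, int]:
    """Return (predicted class, number of binary decisions used)."""
    winner = classes[0]
    decisions = 0
    for challenger in classes[1:]:
        # The current winner "bubbles" forward only if it beats the challenger.
        winner = pairwise_winner(x, winner, challenger)
        decisions += 1
    return winner, decisions


# Hypothetical stand-in for trained binary SVMs: the nearer class centroid wins.
CENTROIDS: Dict[str, Point] = {
    "a": (0.0, 0.0),
    "b": (5.0, 0.0),
    "c": (0.0, 5.0),
    "d": (5.0, 5.0),
}


def nearest_centroid_winner(x: Point, first: str, second: str) -> str:
    def sq_dist(label: str) -> float:
        cx, cy = CENTROIDS[label]
        return (x[0] - cx) ** 2 + (x[1] - cy) ** 2

    return first if sq_dist(first) <= sq_dist(second) else second


label, used = bubble_pass_predict((4.6, 4.8), list(CENTROIDS), nearest_centroid_winner)
print(label, used)
```

    With four classes, one-versus-one voting would evaluate six binary classifiers per sample, whereas this pass evaluates three; a real system would replace `nearest_centroid_winner` with the decision of the SVM trained on the corresponding pair of classes.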

  13. Fusion of HJ1B and ALOS PALSAR data for land cover classification using machine learning methods

    Science.gov (United States)

    Wang, X. Y.; Guo, Y. G.; He, J.; Du, L. T.

    2016-10-01

    Image classification from remote sensing is becoming increasingly urgent for monitoring environmental changes. Exploring effective algorithms to increase classification accuracy is critical. This paper explores the use of multispectral HJ1B and ALOS (Advanced Land Observing Satellite) PALSAR L-band (Phased Array type L-band Synthetic Aperture Radar) for land cover classification using learning-based algorithms. Pixel-based and object-based image analysis approaches for classifying HJ1B data and the HJ1B and ALOS/PALSAR fused-images were compared using two machine learning algorithms, support vector machine (SVM) and random forest (RF), to test which algorithm can achieve the best classification accuracy in arid and semiarid regions. The overall accuracies of the pixel-based (Fused data: 79.0%; HJ1B data: 81.46%) and object-based classifications (Fused data: 80.0%; HJ1B data: 76.9%) were relatively close when using the SVM classifier. The pixel-based classification achieved a high overall accuracy (85.5%) using the RF algorithm for classifying the fused data, whereas the RF classifier using the object-based image analysis produced a lower overall accuracy (70.2%). The study demonstrates that the pixel-based classification utilized fewer variables and performed relatively better than the object-based classification using HJ1B imagery and the fused data. Generally, the integration of the HJ1B and ALOS/PALSAR imagery can improve the overall accuracy by 5.7% using the pixel-based image analysis and RF classifier.

  14. Classification and Identification of Over-voltage Based on HHT and SVM

    Institute of Scientific and Technical Information of China (English)

    WANG Jing; YANG Qing; CHEN Lin; SIMA Wenxia

    2012-01-01

    This paper proposes an effective method for over-voltage classification based on the Hilbert-Huang transform (HHT). The HHT method is composed of empirical mode decomposition (EMD) and the Hilbert transform. Nine kinds of common power system over-voltages are calculated and analyzed by HHT. Based on the instantaneous amplitude spectrum, Hilbert marginal spectrum and Hilbert time-frequency spectrum, three kinds of over-voltage characteristic quantities are obtained. A hierarchical classification system is built based on HHT and support vector machine (SVM). This classification system is tested on 106 field over-voltage signals, and the average classification rate is 94.3%. This research shows that HHT is an effective time-frequency analysis algorithm for over-voltage classification and identification.

  15. Credit risk evaluation using adaptive Lq penalty SVM with Gauss kernel

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    In order to improve the performance of support vector machine (SVM) applications in the field of credit risk evaluation, an adaptive Lq SVM model with Gauss kernel (ALqG-SVM) is proposed to evaluate credit risks. The non-adaptive penalty of the object function is extended to (0, 2] to increase classification accuracy. To further improve the generalization performance of the proposed model, the Gauss kernel is introduced, thus the non-linear classification problem can be linearly separated in higher dimensio...

  16. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it is becoming clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use domain-specific expressions, so building a classifier that performs well across different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain on sentiment classification when using machine learning techniques. In our study three popular machine learning techniques, Support Vector Machines (SVM), Naive Bayes and K nearest neighbors (KNN), were applied to datasets collected from different domains. Experimental results show that Support Vector Machines outperform the other classifiers in all domains, achieving at least 74.75% accuracy with a standard deviation of 4.08.

  17. Performance of machine learning methods for classification tasks

    Directory of Open Access Journals (Sweden)

    B. Krithika

    2013-06-01

    In this paper, the performance of various machine learning methods on pattern classification and recognition tasks is evaluated. The proposed evaluation method is based on the feature representation, feature selection and setting of model parameters. The nature of the data and the methods of feature extraction and feature representation are discussed. The results of the machine learning algorithms on the classification task are analysed, in particular their performance on classifying Tamil word patterns, i.e., classification of nouns and verbs. The WEKA data mining tool is used for evaluating the performance. WEKA provides several families of machine learning algorithms, such as Bayes, Trees, Lazy and Rule-based classifiers.

  18. Support Vector Machine Model for Automatic Detection and Classification of Seismic Events

    Science.gov (United States)

    Barros, Vesna; Barros, Lucas

    2016-04-01

    The automated processing of multiple seismic signals to detect, localize and classify seismic events is a central tool in both natural hazards monitoring and nuclear treaty verification. However, false detections and missed detections caused by station noise and incorrect classification of arrivals are still an issue and the events are often unclassified or poorly classified. Thus, machine learning techniques can be used in automatic processing for classifying the huge database of seismic recordings and provide more confidence in the final output. Applied in the context of the International Monitoring System (IMS) - a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT) - we propose a fully automatic method for seismic event detection and classification based on a supervised pattern recognition technique called the Support Vector Machine (SVM). According to Kortström et al., 2015, the advantages of using SVM are handleability of large number of features and effectiveness in high dimensional spaces. Our objective is to detect seismic events from one IMS seismic station located in an area of high seismicity and mining activity and classify them as earthquakes or quarry blasts. It is expected to create a flexible and easily adjustable SVM method that can be applied in different regions and datasets. Taken a step further, accurate results for seismic stations could lead to a modification of the model and its parameters to make it applicable to other waveform technologies used to monitor nuclear explosions such as infrasound and hydroacoustic waveforms. As an authorized user, we have direct access to all IMS data and bulletins through a secure signatory account. A set of significant seismic waveforms containing different types of events (e.g. earthquake, quarry blasts) and noise is being analysed to train the model and learn the typical pattern of the signal from these events. Moreover, comparing the performance of the support

  19. Robust algorithm for arrhythmia classification in ECG using extreme learning machine

    Directory of Open Access Journals (Sweden)

    Shin Kwangsoo

    2009-10-01

    Background: Recently, extensive studies have been carried out on arrhythmia classification algorithms using artificial intelligence pattern recognition methods such as neural networks. To improve practicality, many studies have focused on the learning speed and accuracy of neural networks. However, algorithms based on neural networks still have some problems concerning practical application, such as slow learning speeds and unstable performance caused by local minima. Methods: In this paper we propose a novel arrhythmia classification algorithm which has a fast learning speed and high accuracy, and uses Morphology Filtering, Principal Component Analysis and Extreme Learning Machine (ELM). The proposed algorithm can classify six beat types: normal beat, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature beat, and paced beat. Results: The experimental results on the entire MIT-BIH arrhythmia database demonstrate that the performance of the proposed algorithm is 98.00% in terms of average sensitivity, 97.95% in terms of average specificity, and 98.72% in terms of average accuracy. These accuracy levels are higher than or comparable with those of existing methods. We make a comparative study of algorithms using an ELM, a back propagation neural network (BPNN), a radial basis function network (RBFN), or a support vector machine (SVM). In terms of learning time, the proposed algorithm using ELM is about 290, 70, and 3 times faster than an algorithm using a BPNN, RBFN, and SVM, respectively. Conclusion: The proposed algorithm shows effective accuracy performance with a short learning time. In addition, we ascertained the robustness of the proposed algorithm by evaluating it on the entire MIT-BIH arrhythmia database.

  20. A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.

    Science.gov (United States)

    Sarrouti, Mourad; Ouatik El Alaoui, Said

    2017-05-18

    Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features to machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significantly improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as the feature provider for a support vector machine (SVM) leads to the highest accuracy of 89.40%. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.

  1. Using support vector machine ensembles for target audience classification on Twitter.

    Science.gov (United States)

    Lo, Siaw Ling; Chiong, Raymond; Cornforth, David

    2015-01-01

    The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space.

  2. Power quality events recognition using a SVM-based method

    Energy Technology Data Exchange (ETDEWEB)

    Cerqueira, Augusto Santiago; Ferreira, Danton Diego; Ribeiro, Moises Vidal; Duque, Carlos Augusto [Department of Electrical Circuits, Federal University of Juiz de Fora, Campus Universitario, 36036 900, Juiz de Fora MG (Brazil)

    2008-09-15

    In this paper, a novel SVM-based method for power quality event classification is proposed. A simple approach for feature extraction is introduced, based on the subtraction of the fundamental component from the acquired voltage signal. The resulting signal is presented to a support vector machine for event classification. Results from simulation are presented and compared with two other methods, the OTFR and the LCEC. The proposed method showed improved performance at a reasonable computational cost. (author)
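    The feature-extraction step described above, subtracting the fundamental component from the voltage signal, might look roughly like the Python sketch below. The abstract does not say how the fundamental is estimated, so the least-squares projection onto a sine and cosine at a known nominal frequency, the 60 Hz value, the sampling rate, and the synthetic disturbance are all assumptions made for illustration.

```python
import math
from typing import List


def remove_fundamental(signal: List[float], fs: float, f0: float) -> List[float]:
    """Subtract the least-squares estimate of the f0 sinusoid from the signal.

    Assumes the window spans an integer number of cycles of f0, so the sine
    and cosine basis vectors are orthogonal and plain projections suffice.
    """
    n = len(signal)
    cos_b = [math.cos(2 * math.pi * f0 * k / fs) for k in range(n)]
    sin_b = [math.sin(2 * math.pi * f0 * k / fs) for k in range(n)]
    a = 2.0 / n * sum(s * c for s, c in zip(signal, cos_b))
    b = 2.0 / n * sum(s * d for s, d in zip(signal, sin_b))
    return [s - a * c - b * d for s, c, d in zip(signal, cos_b, sin_b)]


# Synthetic example (assumed values): 4 cycles of a 60 Hz wave at 64 samples per cycle.
fs, f0, n = 3840.0, 60.0, 256
voltage = [math.sin(2 * math.pi * f0 * k / fs + 0.3) for k in range(n)]
voltage[100] += 0.5  # a short impulsive disturbance

residual = remove_fundamental(voltage, fs, f0)
peak_index = max(range(n), key=lambda k: abs(residual[k]))
print(peak_index)
```

    The residual is close to zero wherever the waveform is a clean sinusoid and retains the disturbance, so it is the residual, rather than the raw voltage, that would be handed to the classifier.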

  3. Least squares twin support vector machine with Universum data for classification

    Science.gov (United States)

    Xu, Yitian; Chen, Mei; Li, Guohui

    2016-11-01

    Universum, a third class not belonging to either class of the classification problem, allows prior knowledge to be incorporated into the learning process. Much previous work has demonstrated that the Universum is helpful for supervised and semi-supervised classification. Moreover, Universum has already been introduced into the support vector machine (SVM) and twin support vector machine (TSVM) to enhance the generalisation performance. To further increase the generalisation performance, we propose a least squares TSVM with Universum data (?-TSVM) in this paper. Our ?-TSVM possesses the following advantages: first, it exploits Universum data to improve generalisation performance. Besides, it implements the structural risk minimisation principle by adding a regularisation term to the objective function. Finally, it costs less computing time by solving two small-sized systems of linear equations instead of a single larger-sized quadratic programming problem. To verify the validity of our proposed algorithm, we conduct various experiments varying the size of the labelled sample and the number of Universum data on data-sets including seven benchmark data-sets, Toy data, MNIST and Face images. Empirical experiments indicate that Universum data help make prediction accuracy higher and even more stable. Especially when fewer labelled samples are given, ?-TSVM is far superior to the improved LS-TSVM (ILS-TSVM), and slightly superior to the ?-TSVM.

  4. The VIMOS Public Extragalactic Redshift Survey (VIPERS). A support vector machine classification of galaxies, stars, and AGNs

    Science.gov (United States)

    Małek, K.; Solarz, A.; Pollo, A.; Fritz, A.; Garilli, B.; Scodeggio, M.; Iovino, A.; Granett, B. R.; Abbas, U.; Adami, C.; Arnouts, S.; Bel, J.; Bolzonella, M.; Bottini, D.; Branchini, E.; Cappi, A.; Coupon, J.; Cucciati, O.; Davidzon, I.; De Lucia, G.; de la Torre, S.; Franzetti, P.; Fumana, M.; Guzzo, L.; Ilbert, O.; Krywult, J.; Le Brun, V.; Le Fevre, O.; Maccagni, D.; Marulli, F.; McCracken, H. J.; Paioro, L.; Polletta, M.; Schlagenhaufer, H.; Tasca, L. A. M.; Tojeiro, R.; Vergani, D.; Zanichelli, A.; Burden, A.; Di Porto, C.; Marchetti, A.; Marinoni, C.; Mellier, Y.; Moscardini, L.; Nichol, R. C.; Peacock, J. A.; Percival, W. J.; Phleps, S.; Wolk, M.; Zamorani, G.

    2013-09-01

    Aims: The aim of this work is to develop a comprehensive method for classifying sources in large sky surveys and to apply the techniques to the VIMOS Public Extragalactic Redshift Survey (VIPERS). Using the optical (u∗,g',r',i') and near-infrared (NIR) data (z', Ks), we develop a classifier, based on broad-band photometry, for identifying stars, active galactic nuclei (AGNs), and galaxies, thereby improving the purity of the VIPERS sample. Methods: Support vector machine (SVM) supervised learning algorithms allow the automatic classification of objects into two or more classes based on a multidimensional parameter space. In this work, we tailored the SVM to classifying stars, AGNs, and galaxies and applied this classification to the VIPERS data. We trained the SVM using spectroscopically confirmed sources from the VIPERS and VVDS surveys. Results: We tested two SVM classifiers and concluded that including NIR data can significantly improve the efficiency of the classifier. The self-check of the best optical + NIR classifier has shown 97% accuracy in the classification of galaxies, 97% for stars, and 95% for AGNs in the 5-dimensional colour space. In the test of VIPERS sources with 99% redshift confidence, the classifier gives an accuracy equal to 94% for galaxies, 93% for stars, and 82% for AGNs. The method was applied to sources with low-quality spectra to verify their classification, hence increasing the security of measurements for almost 4900 objects. Conclusions: We conclude that the SVM algorithm trained on a carefully selected sample of galaxies, AGNs, and stars outperforms simple colour-colour selection methods and can be regarded as a very efficient classification method particularly suitable for modern large surveys. Based on observations collected at the European Southern Observatory, Cerro Paranal, Chile, using the Very Large Telescope under programme 182.A-0886 and partly 070.A-9007. 
Also based on observations obtained with MegaPrime/MegaCam, a joint

  5. Image Classification Workflow Using Machine Learning Methods

    Science.gov (United States)

    Christoffersen, M. S.; Roser, M.; Valadez-Vergara, R.; Fernández-Vega, J. A.; Pierce, S. A.; Arora, R.

    2016-12-01

    Recent increases in the availability and quality of remote sensing datasets have fueled an increasing number of scientifically significant discoveries based on land use classification and land use change analysis. However, much of the software made to work with remote sensing data products, specifically multispectral images, is commercial and often prohibitively expensive. The free to use solutions that are currently available come bundled up as small parts of much larger programs that are very susceptible to bugs and difficult to install and configure. What is needed is a compact, easy to use set of tools to perform land use analysis on multispectral images. To address this need, we have developed software using the Python programming language with the sole function of land use classification and land use change analysis. We chose Python to develop our software because it is relatively readable, has a large body of relevant third party libraries such as GDAL and Spectral Python, and is free to install and use on Windows, Linux, and Macintosh operating systems. In order to test our classification software, we performed a K-means unsupervised classification, Gaussian Maximum Likelihood supervised classification, and a Mahalanobis Distance based supervised classification. The images used for testing were three Landsat rasters of Austin, Texas with a spatial resolution of 60 meters for the years of 1984 and 1999, and 30 meters for the year 2015. The testing dataset was easily downloaded using the Earth Explorer application produced by the USGS. The software should be able to perform classification based on any set of multispectral rasters with little to no modification. Our software makes the ease of land use classification using commercial software available without an expensive license.

  6. Support Vector Machines for Pattern Classification

    CERN Document Server

    Abe, Shigeo

    2010-01-01

    A guide on the use of SVMs in pattern classification, including a rigorous performance comparison of classifiers and regressors. The book presents architectures for multiclass classification and function approximation problems, as well as evaluation criteria for classifiers and regressors. Features: Clarifies the characteristics of two-class SVMs; Discusses kernel methods for improving the generalization ability of neural networks and fuzzy systems; Contains ample illustrations and examples; Includes performance evaluation using publicly available data sets; Examines Mahalanobis kernels, empir

  7. Automated classification of tropical shrub species: a hybrid of leaf shape and machine learning approach.

    Science.gov (United States)

    Murat, Miraemiliana; Chang, Siow-Wee; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai

    2017-01-01

    Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson's coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. 
The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99

  8. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, the QC + SVM methodology is an alternative to the QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol⁻¹ (1 kcal mol⁻¹ = 4.184 kJ mol⁻¹) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is

  9. Virtual Vector Machine for Bayesian Online Classification

    CERN Document Server

    Minka, Thomas P; Yuan,; Qi,

    2012-01-01

    In a typical online learning scenario, a learner is required to process a large data stream using a small memory buffer. Such a requirement is usually in conflict with a learner's primary pursuit of prediction accuracy. To address this dilemma, we introduce a novel Bayesian online classification algorithm, called the Virtual Vector Machine. The virtual vector machine allows you to smoothly trade off prediction accuracy against memory size. The virtual vector machine summarizes the information contained in the preceding data stream by a Gaussian distribution over the classification weights plus a constant number of virtual data points. The virtual data points are designed to add extra non-Gaussian information about the classification weights. To maintain the constant number of virtual points, the virtual vector machine adds the current real data point into the virtual point set, merges the two most similar virtual points into a new virtual point, or deletes a virtual point that is far from the decision boundary. The info...

  10. Design of a portable sleep monitoring system based on support vector machines

    Institute of Scientific and Technical Information of China (English)

    林秀晶; 钱松荣

    2015-01-01

    Objective: Sleep monitoring is an important part of the analysis of sleep quality, yet the sleep monitoring systems available now are complex and cumbersome. This paper proposes a portable sleep monitoring system based on support vector machines (SVM) that makes real-time sleep monitoring convenient and efficient. Methods: The system's hardware consists of the server and the user equipment. The highly portable user equipment is used for data acquisition and data transmission. The server is used for data analysis and resource maintenance. SVM is adopted as the automatic sleep analysis algorithm on the server. Based on extracted features, sleep stages are obtained using a directed acyclic graph as the multi-class classification method. Results: Results based on patient EEG analysis show that the system reaches a high accuracy rate with a short average analysis time of 1.45 seconds. Conclusions: The compact user equipment is highly portable and can feed the correct result back to users in real time, confirming that the design has a promising future in sleep monitoring.

  11. A NOVEL SVM FOR GROUND PENETRATING SYNTHETIC APERTURE RADAR LANDMINE DETECTION

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The use of vehicle- or air-borne Ground Penetrating Synthetic Aperture Radar (GPSAR) to quickly detect landmines over large areas is becoming a trend. However, producing too many false alarms in GPSAR landmine detection is a major challenge in practical applications of GPSAR. Support Vector Machine (SVM), employing structural risk minimization theory, does not need large amounts of training data, which makes it suitable for solving the landmine detection problem. In this paper, a novel SVM with a hypersphere instead of a hyperplane classification boundary is proposed for landmine detection in GPSAR. The HyperSphere-SVM (HS-SVM) can be trained with both landmine and clutter data, or with landmine data only, which are called the two-class HS-SVM and the one-class HS-SVM, respectively. The HS-SVM has better generalization capability than the traditional HyperPlane-SVM (HP-SVM) with respect to varying operating conditions. Quantitative comparisons have been made using real data collected with the rail-GPSAR landmine detection system, which show that both the two-class and the one-class HS-SVMs have better detection performance than the HP-SVM.

  12. Deriving statistical significance maps for SVM based image classification and group comparisons

    OpenAIRE

    Gaonkar, Bilwaj; Davatzikos, Christos

    2012-01-01

    Population based pattern analysis and classification for quantifying structural and functional differences between diverse groups has been shown to be a powerful tool for the study of a number of diseases, and is quite commonly used especially in neuroimaging. The alternative to these pattern analysis methods, namely mass univariate methods such as voxel based analysis and all related methods, cannot detect multivariate patterns associated with group differences, and are not particularly suit...

  13. Quantitative information measurement and application for machine component classification codes

    Institute of Scientific and Technical Information of China (English)

    LI Ling-Feng; TAN Jian-rong; LIU Bo

    2005-01-01

    The information embodied in machine component classification codes is intrinsically related to the probability distribution of the code symbols. This paper presents a model that treats codes as an information source, based on Shannon's information theory. Using information entropy, it preserves the mathematical form and quantitatively measures the information amount of a symbol and of a bit in the machine component classification coding system. It also obtains the maximum information amount and the corresponding coding scheme when the number of symbol categories is fixed. Examples are given to show how to evaluate the information amount of component codes and how to optimize a coding system.
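    The entropy measure referred to above can be illustrated with a short Python sketch. It computes the Shannon entropy of the symbols observed at one code position and the upper bound reached when all symbols are equally likely. The component codes below are invented for the example, and the paper's own model may differ in detail.

```python
import math
from collections import Counter
from typing import Iterable


def symbol_entropy(symbols: Iterable[str]) -> float:
    """Shannon entropy in bits per symbol of the empirical distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over the observed symbols.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def max_entropy(num_symbols: int) -> float:
    """Entropy when all num_symbols symbols are equally likely."""
    return math.log2(num_symbols)


# Hypothetical component codes; the first digit stands for a component class.
codes = ["1203", "1210", "1203", "3311", "1210", "1203"]
first_digits = [code[0] for code in codes]

observed = symbol_entropy(first_digits)
upper_bound = max_entropy(len(set(first_digits)))
print(round(observed, 3), upper_bound)
```

    The gap between the observed entropy and the upper bound indicates how unevenly a code position is used, which is the kind of quantity a coding scheme could be tuned to reduce.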

  14. A New Chinese Text Classification Algorithm: One Class SVM-KNN

    Institute of Scientific and Technical Information of China (English)

    刘文; 吴陈

    2012-01-01

    Chinese text classification is widely used in databases and search engines. The K-nearest neighbor (KNN) algorithm is a common choice for Chinese text classification, but it has several defects: it must store all training samples, it builds the classification only when a test sample needs to be classified, it suffers from class skew, and its storage and computation costs are high when the sample set is large. One-class SVM is simple and effective for problems with only one class, but it is not suited to multi-class problems. To address the defects of KNN while exploiting the characteristics of one-class SVM, the One Class SVM-KNN algorithm is proposed, and its definition and a detailed analysis are given. Experiments show that the method overcomes the defects of KNN well and that its recall and precision are clearly better than those of KNN.

  15. A fast SVM training algorithm based on the set segmentation and k-means clustering

    Institute of Scientific and Technical Information of China (English)

    YANG Xiaowei; LIN Daying; HAO Zhifeng; LIANG Yanchun; LIU Guirong; HAN Xu

    2003-01-01

    At present, training algorithms for support vector machines (SVM) are an important research topic in machine learning. It is a challenging task to improve the efficiency of the algorithm without reducing the generalization performance of SVM. To face this challenge, a new SVM training algorithm based on set segmentation and k-means clustering is presented in this paper. The idea is to divide the original training data into many subsets, cluster each subset using k-means clustering, and finally train the SVM on the new data set formed by the cluster centroids. Because decomposition algorithms such as SVMlight are among the major methods for solving support vector machines, SVMlight is used in our experiments. Simulations on different types of problems show that the proposed method can efficiently solve not only large linear classification problems but also large nonlinear ones.
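
    The data-reduction step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that each class is segmented and clustered separately so that every centroid keeps a label (the abstract does not say how labels are handled), it uses a plain Lloyd's k-means, and the names `kmeans` and `reduce_training_set` are hypothetical. The reduced set would then be passed to any SVM solver.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids

def reduce_training_set(samples, labels, n_subsets, k):
    """Split each class into subsets, cluster each subset, keep labeled centroids."""
    reduced = []
    for label in sorted(set(labels)):
        pts = [s for s, y in zip(samples, labels) if y == label]
        size = -(-len(pts) // n_subsets)  # ceiling division
        for start in range(0, len(pts), size):
            subset = pts[start:start + size]
            for c in kmeans(subset, min(k, len(subset))):
                reduced.append((c, label))
    return reduced

# Hypothetical toy data: 16 two-dimensional samples in two classes.
samples = [(float(i), float(i % 3)) for i in range(16)]
labels = [0] * 8 + [1] * 8
reduced = reduce_training_set(samples, labels, n_subsets=2, k=2)
```

    Here 16 samples are reduced to 8 labeled centroids; the speed-up in SVM training comes from this smaller training set.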

  16. Tuning to optimize SVM approach for assisting ovarian cancer diagnosis with photoacoustic imaging.

    Science.gov (United States)

    Wang, Rui; Li, Rui; Lei, Yanyan; Zhu, Quing

    2015-01-01

    Support vector machine (SVM) is one of the most effective classification methods for cancer detection. The efficiency and quality of an SVM classifier depend strongly on several important features and a set of proper parameters. Here, a series of classification analyses, with one set of photoacoustic data from ovarian tissues ex vivo and a widely used breast cancer dataset, the Wisconsin Diagnostic Breast Cancer (WDBC), revealed how the accuracy of an SVM classification varies with the number of features used and the parameters selected. A pattern recognition system is proposed by means of SVM-Recursive Feature Elimination (RFE) with the Radial Basis Function (RBF) kernel. To improve the effectiveness and robustness of the system, an optimized tuning ensemble algorithm called SVM-RFE(C) with correlation filter was implemented to quantify feature and parameter information based on cross validation. The proposed algorithm is first shown to outperform SVM-RFE on WDBC. Then the best accuracy of 94.643% and sensitivity of 94.595% were achieved when using SVM-RFE(C) to test 57 new PAT data from 19 patients. The experimental results show that the classifier constructed with the SVM-RFE(C) algorithm is able to learn additional information from new data and has significant potential in ovarian cancer diagnosis.

  17. COMBINING FEATURE SCALING ESTIMATION WITH SVM CLASSIFIER DESIGN USING GA APPROACH

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    This letter adopts a GA (Genetic Algorithm) approach to assist in learning the scaling of features that is most favorable to an SVM (Support Vector Machines) classifier; the resulting method is named GA-SVM. The relevance coefficients of the various features to the classification task, measured by real-valued scaling, are estimated efficiently by using GA. GA also exploits a heavy-bias operator to promote sparsity in the scaling of features. The method has several potential benefits: feature selection is performed by eliminating irrelevant features whose scaling is zero, and an SVM classifier with enhanced generalization ability can be learned simultaneously. Experimental comparisons of the original SVM and GA-SVM demonstrate both economical feature selection and excellent classification accuracy on a junk e-mail recognition problem and an Internet ad recognition problem. The experimental results show that, compared with the original SVM classifier, GA-SVM significantly decreases the number of support vectors and achieves better classification results. It also demonstrates that GA can provide a simple, general, and powerful framework for tuning parameters in optimization problems, which directly improves the recognition performance and recognition rate of SVM.

  18. Multi Target Classification and Recognition Based on Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    侯小丽; 王建国; 王佳丽

    2016-01-01

    Support Vector Machine (SVM), which was initially developed for binary classification, is now widely used in research fields such as pattern recognition. However, its application to multi-class problems is less efficient. This paper studies algorithms that extend SVM from binary to multi-class classification and finds that the Directed Acyclic Graph SVM (DAG-SVM) is one of the most widely used. The paper therefore applies DAG-SVM to multi-target classification of military images and achieves recognition of several kinds of military objects, such as soldiers, armored cars, and low-altitude targets.
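
    The DAG-SVM decision procedure mentioned in this abstract can be sketched as below: one binary classifier exists per pair of classes, and each evaluation eliminates one candidate class until a single class remains. This is a minimal sketch of the general technique, not the paper's code. The pairwise deciders here are simple nearest-center stand-ins for trained binary SVMs, and the names `dag_predict`, `make_decider` and the one-dimensional class centers are hypothetical.

```python
def dag_predict(x, classes, pairwise):
    """Walk the decision DAG: each pairwise test removes one candidate class."""
    candidates = list(classes)
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        winner = pairwise[(a, b)](x)
        if winner == a:
            candidates.pop()     # b is eliminated
        else:
            candidates.pop(0)    # a is eliminated
    return candidates[0]

# Hypothetical one-dimensional class centers, for illustration only.
centers = {"soldier": 0.0, "armored car": 5.0, "low-altitude target": 10.0}
classes = list(centers)

def make_decider(a, b):
    # Stand-in for a trained binary SVM: picks the class with the nearer center.
    return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b

# One decider per unordered pair of classes: k * (k - 1) / 2 in total.
pairwise = {
    (a, b): make_decider(a, b)
    for i, a in enumerate(classes)
    for b in classes[i + 1:]
}

label = dag_predict(4.2, classes, pairwise)
```

    For k classes, a prediction needs only k - 1 binary evaluations, although all k(k - 1)/2 pairwise classifiers must be trained.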

  19. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    Science.gov (United States)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited settings.

  20. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    Full Text Available With the advent of the technological era, conversion of scanned documents (handwritten or printed) into machine-editable format has attracted many researchers. This paper deals with the problem of recognizing Gujarati handwritten numerals. Gujarati numeral recognition requires some specific preprocessing steps. For preprocessing, digitization, segmentation, normalization and thinning are performed under the assumption that the image has almost no noise. Further, an affine invariant moments based model is used for feature extraction, and finally Support Vector Machine (SVM) and Fuzzy classifiers are used for numeral classification. The SVM and Fuzzy classifiers are compared, and SVM is seen to produce better results than the Fuzzy classifier.

  1. Study on v-SVM for Classification Optimization Problem without Bias

    Institute of Scientific and Technical Information of China (English)

    丁晓剑; 赵银亮

    2011-01-01

    In high-dimensional space, the classification hyperplane tends to pass through the origin, so the bias (b) is not needed. To study whether v-SVM for classification needs (b), the dual optimization problem of v-SVM without (b) is formulated and a method for solving it is presented. The method uses an active set strategy to transform the dual optimization problem into an equality-constrained sub-optimization problem, and then uses the Lagrange multiplier method to transform the sub-optimization problem into a system of linear equations. The experimental results show that the presence of (b) reduces the generalization ability of v-SVM, and that v-SVM can obtain only a sub-optimal solution of v-SVM without (b).

  2. Robust Automated Detection of Microstructural White Matter Degeneration in Alzheimer’s Disease Using Machine Learning Classification of Multicenter DTI Data

    Science.gov (United States)

    Dyrba, Martin; Ewers, Michael; Wegrzyn, Martin; Kilimann, Ingo; Plant, Claudia; Oswald, Annahita; Meindl, Thomas; Pievani, Michela; Bokde, Arun L. W.; Fellgiebel, Andreas; Filippi, Massimo; Hampel, Harald; Klöppel, Stefan; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J.

    2013-01-01

    Diffusion tensor imaging (DTI) based assessment of white matter fiber tract integrity can support the diagnosis of Alzheimer’s disease (AD). The use of DTI as a biomarker, however, depends on its applicability in a multicenter setting accounting for effects of different MRI scanners. We applied multivariate machine learning (ML) to a large multicenter sample from the recently created framework of the European DTI study on Dementia (EDSD). We hypothesized that ML approaches may amend effects of multicenter acquisition. We included a sample of 137 patients with clinically probable AD (MMSE 20.6±5.3) and 143 healthy elderly controls, scanned in nine different scanners. For diagnostic classification we used the DTI indices fractional anisotropy (FA) and mean diffusivity (MD) and, for comparison, gray matter and white matter density maps from anatomical MRI. Data were classified using a Support Vector Machine (SVM) and a Naïve Bayes (NB) classifier. We used two cross-validation approaches, (i) test and training samples randomly drawn from the entire data set (pooled cross-validation) and (ii) data from each scanner as test set, and the data from the remaining scanners as training set (scanner-specific cross-validation). In the pooled cross-validation, SVM achieved an accuracy of 80% for FA and 83% for MD. Accuracies for NB were significantly lower, ranging between 68% and 75%. Removing variance components arising from scanners using principal component analysis did not significantly change the classification results for both classifiers. For the scanner-specific cross-validation, the classification accuracy was reduced for both SVM and NB. After mean correction, classification accuracy reached a level comparable to the results obtained from the pooled cross-validation. Our findings support the notion that machine learning classification allows robust classification of DTI data sets arising from multiple scanners, even if a new data set comes from a scanner that was
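
    The scanner-specific cross-validation described in this abstract amounts to holding out all data from one scanner at a time. The sketch below illustrates that splitting scheme, together with one possible form of mean correction. The abstract does not specify how mean correction was done, so subtracting each scanner's own mean is an assumption here, and the names `leave_one_scanner_out` and `center_per_scanner` and the toy values are hypothetical.

```python
from collections import defaultdict

def leave_one_scanner_out(scanner_ids):
    """Yield (held_out_scanner, train_indices, test_indices) for each scanner."""
    for held_out in sorted(set(scanner_ids)):
        train = [i for i, s in enumerate(scanner_ids) if s != held_out]
        test = [i for i, s in enumerate(scanner_ids) if s == held_out]
        yield held_out, train, test

def center_per_scanner(values, scanner_ids):
    """Subtract each scanner's own mean (an assumed form of mean correction)."""
    groups = defaultdict(list)
    for v, s in zip(values, scanner_ids):
        groups[s].append(v)
    means = {s: sum(vs) / len(vs) for s, vs in groups.items()}
    return [v - means[s] for v, s in zip(values, scanner_ids)]

# Hypothetical toy data: one feature value per subject, three scanners.
scanner_ids = ["A", "A", "B", "B", "C"]
values = [1.0, 3.0, 10.0, 14.0, 7.0]
splits = list(leave_one_scanner_out(scanner_ids))
centered = center_per_scanner(values, scanner_ids)
```

    Each split trains on the remaining scanners and tests on the held-out one, which mimics applying a classifier to data from a scanner not seen during training.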

  3. A New Support Vector Classification Algorithm: ACNN-SVM

    Institute of Scientific and Technical Information of China (English)

    业巧林; 业宁; 张训华; 武波; 宋爱美

    2008-01-01

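    To address the shortcomings of the NN-SVM algorithm, a new support vector classification algorithm, ACNN-SVM, is proposed. First, the training set is pruned by nearest neighbor, and an SVM is trained on it to obtain an SVM model. Then the distance from each sample in the pruned training set to the hyperplane is computed; if the distance difference is greater than a given threshold, the sample is removed from the pruned training set. Finally, an SVM is trained on the re-pruned sample set to obtain the final SVM model. Experiments show that ACNN-SVM performs better than NN-SVM.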

  4. Machine Learning Classification of SDSS Transient Survey Images

    CERN Document Server

    Buisson, L du; Bassett, B A; Smith, M

    2014-01-01

    We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the SDSS supernova survey into real objects and artefacts. This is the first step in any transient science pipeline and is currently still done by humans, but future surveys such as LSST will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (PCA) of single-epoch g, r, i-difference images we can reach a completeness (recall) of 95%, while only incorrectly classifying 18% of artefacts as real objects, corresponding to a precision (purity) of 85%. In general the k-nearest neighbour and the SkyNet artificial neural net algorithms performed most robustly compared to other methods such as naive Bayes and kernel SVM. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information which should allow for significant further impr...

  5. Face Detection Using Adaboosted SVM-Based Component Classifier

    CERN Document Server

    Valiollahzadeh, Seyyed Majid; Nazari, Mohammad

    2008-01-01

    Recently, Adaboost has been widely used to improve the accuracy of any given learning algorithm. In this paper we focus on designing an algorithm that combines Adaboost with Support Vector Machines as weak component classifiers for the face detection task. To obtain a set of effective SVM weak-learner classifiers, the algorithm adaptively adjusts the kernel parameter of the SVM instead of using a fixed one. The proposed combination generalizes better than SVM on imbalanced classification problems. The proposed method is compared, in terms of classification accuracy, to other commonly used Adaboost methods, such as Decision Trees and Neural Networks, on the CMU+MIT face database. Results indicate that the performance of the proposed method is overall superior to previous Adaboost approaches.

  6. Machine Learning for Galaxy Morphology Classification

    CERN Document Server

    Gauci, Adam; Abela, John; Magro, Alessio

    2010-01-01

    In this work, decision tree learning algorithms and fuzzy inferencing systems are applied for galaxy morphology classification. In particular, the CART, the C4.5, the Random Forest and fuzzy logic algorithms are studied and reliable classifiers are developed to distinguish between spiral galaxies, elliptical galaxies or star/unknown galactic objects. Morphology information for the training and testing datasets is obtained from the Galaxy Zoo project while the corresponding photometric and spectral parameters are downloaded from the SDSS DR7 catalogue.

  7. An Efficient Audio Classification Approach Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Lhoucine Bahatti

    2016-05-01

    Full Text Available In audio classification aimed at identifying the composer, the use of adequate and relevant features is important for improving performance, especially when the classification algorithm is based on support vector machines. As opposed to conventional approaches, which often use timbral features based on a time-frequency representation of the musical signal with a constant window, this paper presents a new audio classification method that improves feature extraction according to the Constant Q Transform (CQT) approach and includes original audio features related to the musical context in which the notes appear. The enhancement made by this work also lies in the proposal of an optimal feature selection procedure that combines filter and wrapper strategies. Experimental results show the accuracy and efficiency of the adopted approach in binary classification as well as in multi-class classification.

  8. A head impact detection system using SVM classification and proximity sensing in an instrumented mouthguard.

    Science.gov (United States)

    Wu, Lyndia C; Zarnescu, Livia; Nangia, Vaibhav; Cam, Bruce; Camarillo, David B

    2014-11-01

    Injury from blunt head impacts causes acute neurological deficits and may lead to chronic neurodegeneration. A head impact detection device can serve both as a research tool for studying head injury mechanisms and a clinical tool for real-time trauma screening. The simplest approach is an acceleration thresholding algorithm, which may falsely detect high-acceleration spurious events such as manual manipulation of the device. We designed a head impact detection system that distinguishes head impacts from nonimpacts through two subsystems. First, we use infrared proximity sensing to determine if the mouthguard is worn on the teeth to filter out all off-teeth events. Second, on-teeth, nonimpact events are rejected using a support vector machine classifier trained on frequency domain features of linear acceleration and rotational velocity. The remaining events are classified as head impacts. In a controlled laboratory evaluation, the present system performed substantially better than a 10-g acceleration threshold in head impact detection (98% sensitivity, 99.99% specificity, 99% accuracy, and 99.98% precision, compared to 92% sensitivity, 58% specificity, 65% accuracy, and 37% precision). Once adapted for field deployment by training and validation with field data, this system has the potential to effectively detect head trauma in sports, military service, and other high-risk activities.
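
    The four figures of merit reported in this abstract (sensitivity, specificity, accuracy and precision) all derive from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the counts used are hypothetical and are not taken from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary detection metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # detected impacts / all impacts
        "specificity": tn / (tn + fp),               # rejected nonimpacts / all nonimpacts
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),                 # true impacts / all detections
    }

# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=90, fp=20, tn=80, fn=10)
```

    Precision, unlike sensitivity and specificity, depends on the ratio of impacts to nonimpacts in the evaluated data, which is why a detector with moderate specificity can still show low precision when nonimpact events dominate.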

  9. Feasibility of Active Machine Learning for Multiclass Compound Classification.

    Science.gov (United States)

    Lang, Tobias; Flachsenberg, Florian; von Luxburg, Ulrike; Rarey, Matthias

    2016-01-25

    A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.

  10. Support Vector Machine Classification For MRI Images

    OpenAIRE

    Rajeswari S; Theiva Jeyaselvi. K

    2012-01-01

    Magnetic resonance imaging (MRI) is an imaging technique that has played an important role in neuroscience research for studying brain images. Classification is an important part in order to distinguish between normal patients and those who have the possibility of having abnormalities or tumor. In this paper, we have obtained the texture based features such as GLCM (Grey Level Co-occurrence Matrix) of MRI images. To select the discriminative features among them ...

  11. MOBILE GEO-LOCATION ALGORITHM BASED ON LS-SVM

    Institute of Scientific and Technical Information of China (English)

    Sun Guolin; Guo Wei

    2005-01-01

    Support Vector Machine (SVM) is a powerful methodology for solving problems in non-linear classification, function estimation and density estimation, which has also led to many other recent developments in kernel based methods in general. This paper presents a high-accuracy and fault-tolerant SVM for the mobile geo-location problem, which is an important component of pervasive computing. Simulation results show its basic location performance, and illustrate impacts of the number of training samples and training area on test location error.

  12. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    Directory of Open Access Journals (Sweden)

    Huiyan Jiang

    2015-01-01

    Full Text Available A novel method is proposed to establish the pancreatic cancer classifier. Firstly, the concept of quantum and fruit fly optimal algorithm (FOA) are introduced, respectively. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of support vector machine (SVM) and the classifier is established by optimized SVM. In order to verify the effectiveness of the proposed method, SVM and other classification methods have been chosen as the comparing methods. The experimental results show that the proposed method can improve the classifier performance and cost less time.

  13. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA.

    Science.gov (United States)

    Jiang, Huiyan; Zhao, Di; Zheng, Ruiping; Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish the pancreatic cancer classifier. Firstly, the concept of quantum and fruit fly optimal algorithm (FOA) are introduced, respectively. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of support vector machine (SVM) and the classifier is established by optimized SVM. In order to verify the effectiveness of the proposed method, SVM and other classification methods have been chosen as the comparing methods. The experimental results show that the proposed method can improve the classifier performance and cost less time.

  14. Online Fault Diagnosis for Biochemical Process Based on FCM and SVM.

    Science.gov (United States)

    Wang, Xianfang; Du, Haoze; Tan, Jinglu

    2016-12-01

    Fault diagnosis is becoming an important issue in biochemical processes, and a novel online fault detection and diagnosis approach is designed by combining fuzzy c-means (FCM) and support vector machine (SVM). The samples are first preprocessed with the FCM algorithm to enhance classification ability. Then, those samples are input to the SVM classifier to perform the biochemical process fault diagnosis. In this study, a glutamic acid fermentation process is chosen as an example for fault diagnosis by this method. The results show that, compared with a single SVM method, the diagnosis time is greatly shortened and the accuracy is markedly improved.

  15. The Hybrid of Classification Tree and Extreme Learning Machine for Permeability Prediction in Oil Reservoir

    KAUST Repository

    Prasetyo Utomo, Chandra

    2011-06-01

    Permeability is an important parameter connected with oil reservoirs. Predicting the permeability could save millions of dollars. Unfortunately, petroleum engineers have faced numerous challenges arriving at cost-efficient predictions. Much work has been carried out to solve this problem. The main challenge is to handle the high range of permeability in each reservoir. For about a hundred years, mathematicians and engineers have tried to deliver the best prediction models. However, none of them has produced satisfying results. In the last two decades, artificial intelligence models have been used. The current best prediction model in permeability prediction is the extreme learning machine (ELM). It produces fairly good results but a clear explanation of the model is hard to come by because it is so complex. The aim of this research is to propose a way out of this complexity through the design of a hybrid intelligent model. In this proposal, the system combines classification and regression models to predict the permeability value. These are based on the well logs data. In order to handle the high range of the permeability value, a classification tree is utilized. A benefit of this innovation is that the tree represents knowledge in a clear and succinct fashion and thereby avoids the complexity of all previous models. Finally, it is important to note that the ELM is used as a final predictor. Results demonstrate that this proposed hybrid model performs better when compared with support vector machines (SVM) and ELM in terms of correlation coefficient. Moreover, the classification tree model potentially leads to better communication among petroleum engineers concerning this important process and has wider implications for oil reservoir management efficiency.

  16. Classification of forest development stages from national low-density lidar datasets: a comparison of machine learning methods

    Directory of Open Access Journals (Sweden)

    R. Valbuena

    2016-02-01

    Full Text Available The area-based method has become a widespread approach in airborne laser scanning (ALS), being mainly employed for the estimation of continuous variables describing forest attributes: biomass, volume, density, etc. However, to date, classification methods based on machine learning, which are fairly common in other remote sensing fields, such as land use / land cover classification using multispectral sensors, have been largely overlooked in forestry applications of ALS. In this article, we wish to draw attention to statistical methods predicting discrete responses, for supervised classification of ALS datasets. A wide spectrum of approaches is reviewed: discriminant analysis (DA) using various classifiers (maximum likelihood, minimum volume ellipsoid, naïve Bayes), support vector machine (SVM), artificial neural networks (ANN), random forest (RF) and nearest neighbour (NN) methods. They are compared in the context of a classification of forest areas into development classes (DC) used in practical silvicultural management in Finland, using their low-density national ALS dataset. We observed that RF and NN had the most balanced error matrices, with cross-validated predictions which were mainly unbiased for all DCs. Although overall accuracies were higher for SVM and ANN, their results were very dissimilar across DCs, and they can therefore be only advantageous if certain DCs are targeted. DA methods underperformed in comparison to other alternatives, and were only advantageous for the detection of seedling stands. These results show that, besides the well demonstrated capacity of ALS for quantifying forest stocks, there is a great deal of potential for predicting categorical variables in general, and forest types in particular. In conclusion, we consider that the presented methodology shall also be adapted to the type of forest classes that can be relevant to Mediterranean ecosystems, opening a range of possibilities for future research, in which

  17. Learning features for tissue classification with the classification restricted Boltzmann machine

    DEFF Research Database (Denmark)

    van Tulder, Gijs; de Bruijne, Marleen

    2014-01-01

    Performance of automated tissue classification in medical imaging depends on the choice of descriptive features. In this paper, we show how restricted Boltzmann machines (RBMs) can be used to learn features that are especially suited for texture-based tissue classification. We introduce...... the convolutional classification RBM, a combination of the existing convolutional RBM and classification RBM, and use it for discriminative feature learning. We evaluate the classification accuracy of convolutional and non-convolutional classification RBMs on two lung CT problems. We find that RBM-learned features...... outperform conventional RBM-based feature learning, which is unsupervised and uses only a generative learning objective, as well as often-used filter banks. We show that a mixture of generative and discriminative learning can produce filters that give a higher classification accuracy....

  18. Reconfiguration-based implementation of SVM classifier on FPGA for Classifying Microarray data.

    Science.gov (United States)

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2013-01-01

    Classifying Microarray data, which are of high dimensional nature, requires high computational power. Support Vector Machines-based classifier (SVM) is among the most common and successful classifiers used in the analysis of Microarray data but also requires high computational power due to its complex mathematical architecture. Implementing SVM on hardware exploits the parallelism available within the algorithm kernels to accelerate the classification of Microarray data. In this work, a flexible, dynamically and partially reconfigurable implementation of the SVM classifier on Field Programmable Gate Array (FPGA) is presented. The SVM architecture achieved up to 85× speed-up over equivalent general purpose processor (GPP) showing the capability of FPGAs in enhancing the performance of SVM-based analysis of Microarray data as well as future bioinformatics applications.

  19. SVM-based glioma grading: Optimization by feature reduction analysis.

    Science.gov (United States)

    Zöllner, Frank G; Emblem, Kyrre E; Schad, Lothar R

    2012-09-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features; (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity=89%, specificity=84%) when reducing the feature vector from 101 (100-bins rCBV histogram+age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (∼87%) while reducing the number of features by up to 98%.

  20. Quintic spline smooth semi-supervised support vector classification machine

    Institute of Scientific and Technical Information of China (English)

    Xiaodan Zhang; Jinggai Ma; Aihua Li; Ang Li

    2015-01-01

    A semi-supervised vector machine is a relatively new learning method using both labeled and unlabeled data in classification. Since the objective function of the model for an unconstrained semi-supervised vector machine is not smooth, many fast optimization algorithms cannot be applied to solve the model. In order to overcome the difficulty of dealing with non-smooth objective functions, new methods that can solve the semi-supervised vector machine with desired classification accuracy are in great demand. A quintic spline function with three-times differentiability at the origin is constructed by a general three-moment method, which can be used to approximate the symmetric hinge loss function. The approximation accuracy of the quintic spline function is estimated. Moreover, a quintic spline smooth semi-support vector machine is obtained and the convergence accuracy of the smooth model to the non-smooth one is analyzed. Three experiments are performed to test the efficiency of the model. The experimental results show that the new model outperforms other smooth models in terms of classification performance. Furthermore, the new model is not sensitive to the increasing number of labeled samples, which means that the new model is more efficient.

  1. Automated Classification of Heritage Buildings for As-Built BIM Using Machine Learning Techniques

    Science.gov (United States)

    Bassier, M.; Vergauwen, M.; Van Genechten, B.

    2017-08-01

    Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data of a variety of existing buildings. The results prove that the used classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.

  2. Automatic retinal vessel classification using a Least Square-Support Vector Machine in VAMPIRE.

    Science.gov (United States)

    Relan, D; MacGillivray, T; Ballerini, L; Trucco, E

    2014-01-01

    It is important to classify retinal blood vessels into arterioles and venules for computerised analysis of the vasculature and to aid discovery of disease biomarkers. For instance, zone B is the standardised region of a retinal image utilised for the measurement of the arteriole to venule width ratio (AVR), a parameter indicative of microvascular health and systemic disease. We introduce a Least Square-Support Vector Machine (LS-SVM) classifier for the first time (to the best of our knowledge) to automatically label arterioles and venules. We use only 4 image features and consider vessels inside zone B (802 vessels from 70 fundus camera images) and in an extended zone (1,207 vessels, 70 fundus camera images). We achieve an accuracy of 94.88% and 93.96% in zone B and the extended zone, respectively, with a training set of 10 images and a testing set of 60 images. With a smaller training set of only 5 images and the same testing set we achieve an accuracy of 94.16% and 93.95%, respectively. This experiment was repeated five times by randomly choosing 10 and 5 images for the training set. Mean classification accuracies are close to the above-mentioned results. We conclude that the performance of our system is very promising and outperforms most recently reported systems. Our approach requires smaller training data sets compared to others but still results in a similar or higher classification rate.

  3. Scoliosis curve type classification using kernel machine from 3D trunk image

    Science.gov (United States)

    Adankon, Mathias M.; Dansereau, Jean; Parent, Stefan; Labelle, Hubert; Cheriet, Farida

    2012-03-01

    Adolescent idiopathic scoliosis (AIS) is a deformity of the spine manifested by asymmetry and deformities of the external surface of the trunk. Classification of scoliosis deformities according to curve type is used to plan management of scoliosis patients. Currently, scoliosis curve type is determined based on X-ray exam. However, cumulative exposure to X-ray radiation significantly increases the risk for certain cancers. In this paper, we propose a robust system that can classify the scoliosis curve type from non-invasive acquisition of the 3D trunk surface of the patients. The 3D image of the trunk is divided into patches, and local geometric descriptors characterizing the surface of the back are computed from each patch to form the features. We reduce the dimensionality using Principal Component Analysis, retaining 53 components. In this work a multi-class classifier is built with a least-squares support vector machine (LS-SVM), which is a kernel classifier. For this study, a new kernel was designed in order to achieve a robust classifier in comparison with polynomial and Gaussian kernels. The proposed system was validated using data of 103 patients with different scoliosis curve types diagnosed and classified by an orthopedic surgeon from the X-ray images. The average rate of successful classification was 93.3%, with a better rate of prediction for the major thoracic and lumbar/thoracolumbar types.

  4. A machine learning approach for classification of anatomical coverage in CT

    Science.gov (United States)

    Wang, Xiaoyong; Lo, Pechin; Ramakrishna, Bharath; Goldin, Johnathan; Brown, Matthew

    2016-03-01

    Automatic classification of anatomical coverage of medical images is critical for big data mining and as a pre-processing step to automatically trigger specific computer aided diagnosis systems. The traditional way to identify scans through DICOM headers has various limitations due to manual entry of series descriptions and non-standardized naming conventions. In this study, we present a machine learning approach where multiple binary classifiers were used to classify different anatomical coverages of CT scans. A one-vs-rest strategy was applied. For a given training set, a template scan was selected from the positive samples and all other scans were registered to it. Each registered scan was then evenly split into k × k × k non-overlapping blocks and for each block the mean intensity was computed. This resulted in a 1 × k³ feature vector for each scan. The feature vectors were then used to train an SVM-based classifier. In this feasibility study, four classifiers were built to identify anatomic coverages of brain, chest, abdomen-pelvis, and chest-abdomen-pelvis CT scans. Each classifier was trained and tested using a set of 300 scans from different subjects, composed of 150 positive samples and 150 negative samples. Area under the ROC curve (AUC) of the testing set was measured to evaluate the performance in a two-fold cross validation setting. Our results showed good classification performance with an average AUC of 0.96.
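
    The block-mean feature extraction described in this abstract can be sketched as follows. This is an illustrative sketch only, assuming the scan is already registered and stored as a nested list indexed `[z][y][x]` whose dimensions are divisible by k; the function name `block_mean_features` and the toy volume are hypothetical and not taken from the study.

```python
from statistics import fmean


def block_mean_features(volume, k):
    """Split a 3D volume into k*k*k equal blocks and return the block means.

    `volume` is a nested list indexed [z][y][x]. Each dimension is assumed
    to be divisible by k (an assumption of this sketch).
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if nz % k or ny % k or nx % k:
        raise ValueError("each volume dimension must be divisible by k")
    bz, by, bx = nz // k, ny // k, nx // k  # block size along each axis

    features = []
    for iz in range(k):
        for iy in range(k):
            for ix in range(k):
                voxels = [
                    volume[z][y][x]
                    for z in range(iz * bz, (iz + 1) * bz)
                    for y in range(iy * by, (iy + 1) * by)
                    for x in range(ix * bx, (ix + 1) * bx)
                ]
                features.append(fmean(voxels))
    return features  # flat vector of length k**3


# Toy 4x4x4 volume whose intensity equals the slice index z.
toy_volume = [[[float(z) for _x in range(4)] for _y in range(4)] for z in range(4)]
toy_features = block_mean_features(toy_volume, k=2)
print(len(toy_features), toy_features)
```

    The resulting vector of length k³ is what would be passed to a binary classifier in a one-vs-rest setup; the classifier itself is omitted here.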

  5. An extended Lagrangian support vector machine for classifications

    Institute of Scientific and Technical Information of China (English)

    YANG Xiaowei; SHU Lei; HAO Zhifeng; LIANG Yanchun; LIU Guirong; HAN Xu

    2004-01-01

    Lagrangian support vector machine (LSVM) cannot solve large problems for nonlinear kernel classifiers. In order to extend the LSVM to solve very large problems, an extended Lagrangian support vector machine (ELSVM) for classifications based on LSVM and SVMlight is presented in this paper. Our idea for the ELSVM is to divide a large quadratic programming problem into a series of subproblems with small size and to solve them via LSVM. Since the LSVM can solve small and medium problems for nonlinear kernel classifiers, the proposed ELSVM can be used to handle large problems very efficiently. Numerical experiments on different types of problems are performed to demonstrate the high efficiency of the ELSVM.

  6. Quantum support vector machine for big data classification.

    Science.gov (United States)

    Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

    2014-09-26

    Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.

  7. Individual classification of children with epilepsy using support vector machine with multiple indices of diffusion tensor imaging

    Directory of Open Access Journals (Sweden)

    Ishmael Amarreh

    2014-01-01

    Conclusion: DTI-based SVM classification appears promising for distinguishing children with active epilepsy from either those with remitted epilepsy or controls, and the question that arises is whether it will prove useful as a prognostic index of seizure remission. While SVM can correctly identify children with active epilepsy from other groups' diagnosis, further research is needed to determine the efficacy of SVM as a prognostic tool in longitudinal clinical studies.

  8. A Hybrid Machine Learning Method for Fusing fMRI and Genetic Data: Combining both Improves Classification of Schizophrenia

    Directory of Open Access Journals (Sweden)

    Honghui Yang

    2010-10-01

    We demonstrate a hybrid machine learning method to classify schizophrenia patients and healthy controls, using functional magnetic resonance imaging (fMRI) and single nucleotide polymorphism (SNP) data. The method consists of four stages: (1) SNPs with the most discriminating information between the healthy controls and schizophrenia patients are selected to construct a support vector machine ensemble (SNP-SVME). (2) Voxels in the fMRI map contributing to classification are selected to build another SVME (Voxel-SVME). (3) Components of fMRI activation obtained with independent component analysis (ICA) are used to construct a single SVM classifier (ICA-SVMC). (4) The above three models are combined into a single module using a majority voting approach to make a final decision (Combined SNP-fMRI). The method was evaluated by a fully-validated leave-one-out method using 40 subjects (20 patients and 20 controls). The classification accuracy was: 0.74 for SNP-SVME, 0.82 for Voxel-SVME, 0.83 for ICA-SVMC, and 0.87 for Combined SNP-fMRI. Experimental results show that better classification accuracy was achieved by combining genetic and fMRI data than using either alone, indicating that genetic and brain function represent different, but partially complementary, aspects of schizophrenia etiopathology. This study suggests an effective way to reassess biological classification of individuals with schizophrenia, which is also potentially useful for identifying diagnostically important markers for the disorder.
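
    Stage (4), the majority vote over the three models, can be illustrated with a short sketch. The per-subject predictions below are made-up placeholders, not data from the study, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most voters; raise on a tie."""
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        raise ValueError("tie between labels; use an odd number of voters")
    return counts[0][0]


# Placeholder outputs of three hypothetical models for three subjects.
snp_svme = ["patient", "control", "control"]
voxel_svme = ["patient", "patient", "control"]
ica_svmc = ["control", "patient", "control"]

combined = [majority_vote(votes) for votes in zip(snp_svme, voxel_svme, ica_svmc)]
print(combined)
```

    With three voters and two classes a tie cannot occur, which is one reason an odd number of component models is convenient for this kind of fusion.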

  9. PSO-based support vector machine with cuckoo search technique for clinical disease diagnoses.

    Science.gov (United States)

    Liu, Xiaoyong; Fu, Hui

    2014-01-01

    Disease diagnosis is conducted with a machine learning method. We have proposed a novel machine learning method that hybridizes support vector machine (SVM), particle swarm optimization (PSO), and cuckoo search (CS). The new method consists of two stages: firstly, a CS based approach for parameter optimization of SVM is developed to find the better initial parameters of kernel function, and then PSO is applied to continue SVM training and find the best parameters of SVM. Experimental results indicate that the proposed CS-PSO-SVM model achieves better classification accuracy and F-measure than PSO-SVM and GA-SVM. Therefore, we can conclude that our proposed method is very efficient compared to the previously reported algorithms.

  10. PSO-Based Support Vector Machine with Cuckoo Search Technique for Clinical Disease Diagnoses

    Directory of Open Access Journals (Sweden)

    Xiaoyong Liu

    2014-01-01

    Disease diagnosis is conducted with a machine learning method. We have proposed a novel machine learning method that hybridizes support vector machine (SVM), particle swarm optimization (PSO), and cuckoo search (CS). The new method consists of two stages: firstly, a CS based approach for parameter optimization of SVM is developed to find the better initial parameters of kernel function, and then PSO is applied to continue SVM training and find the best parameters of SVM. Experimental results indicate that the proposed CS-PSO-SVM model achieves better classification accuracy and F-measure than PSO-SVM and GA-SVM. Therefore, we can conclude that our proposed method is very efficient compared to the previously reported algorithms.

  11. Segmentation of Magnetic Resonance Imaging MRI using LS-SVM and Wavelet Multiresolution Analysis

    Directory of Open Access Journals (Sweden)

    Luis A. Muñoz-Bedoya

    2013-11-01

    Support vector machines (SVM) have become a powerful tool for solving nonlinear classification problems. To optimize the tool, a reformulation known as LS-SVM (least squares support vector machine) has been developed, which works with a model based on function minimization and Lagrange polynomials. This paper therefore presents a method for segmenting magnetic resonance images, specifically to study the morphology of the lungs and to quantify relevant features in these images using SVM and LS-SVM. In addition to the classification technique, this work uses wavelet analysis to eliminate irrelevant information (compression) and spline algorithms to interpolate the information found and quantify the characteristics, which in this work were based on recognizing the area, shape and abnormal structures present in the lungs in these images.

  12. Multiple Crop Classification Using Various Support Vector Machine Kernel Functions

    Directory of Open Access Journals (Sweden)

    Rupali R. Surase

    2015-01-01

    This study was carried out with techniques of Remote Sensing (RS) based crop discrimination and area estimation with a single date approach. Several kernel functions are employed and compared in this study for mapping the input space, including linear, sigmoid, polynomial and Radial Basis Function (RBF) kernels. The present study highlights the advantages of Remote Sensing (RS) and Geographic Information System (GIS) techniques for analyzing the land use/land cover mapping for the Aurangabad region of Maharashtra, India. Single date, cloud free IRS-Resourcesat-1 LISS-III data was used for further classification on a training set for supervised classification. ENVI 4.4 is used for image analysis and interpretation. The experimental tests show that the system achieved 94.82% using SVM with the polynomial kernel function, compared with the Radial Basis Function, sigmoid and linear kernels. The Overall Accuracy (OA) differs by up to 5.17% in comparison to using the sigmoid kernel function, and by up to 3.45% in comparison to a 3rd degree polynomial kernel function and RBF with 200 as a penalty parameter.
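
    For reference, the four kernel families compared in this abstract are commonly written as in the sketch below. The parametrization (`gamma`, `coef0`, `degree`) follows the usual textbook and LIBSVM-style forms and is an assumption here; the abstract does not state how ENVI 4.4 parametrizes its kernels, and the default values are illustrative only.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


# Two made-up pixel feature vectors, used only to exercise the functions.
u, v = [1.0, 2.0], [3.0, 4.0]
for name, kernel in [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("rbf", rbf_kernel),
    ("sigmoid", sigmoid_kernel),
]:
    print(name, kernel(u, v))
```

    In practice the penalty parameter and the kernel parameters are tuned jointly, which is why reported accuracies depend on both the kernel choice and its settings.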

  13. Extreme learning machine and adaptive sparse representation for image classification.

    Science.gov (United States)

    Cao, Jiuwen; Zhang, Kai; Luo, Minxia; Yin, Chun; Lai, Xiaoping

    2016-09-01

    Recent research has shown the speed advantage of extreme learning machine (ELM) and the accuracy advantage of sparse representation classification (SRC) in the area of image classification. Those two methods, however, have their respective drawbacks, e.g., in general, ELM is known to be less robust to noise while SRC is known to be time-consuming. Consequently, ELM and SRC complement each other in computational complexity and classification accuracy. In order to unify such mutual complementarity and thus further enhance the classification performance, we propose an efficient hybrid classifier to exploit the advantages of ELM and SRC in this paper. More precisely, the proposed classifier consists of two stages: first, an ELM network is trained by supervised learning. Second, a discriminative criterion about the reliability of the obtained ELM output is adopted to decide whether the query image can be correctly classified or not. If the output is reliable, the classification will be performed by ELM; otherwise the query image will be fed to SRC. Meanwhile, in the stage of SRC, a sub-dictionary that is adaptive to the query image instead of the entire dictionary is extracted via the ELM output. The computational burden of SRC thus can be reduced. Extensive experiments on handwritten digit classification, landmark recognition and face recognition demonstrate that the proposed hybrid classifier outperforms ELM and SRC in classification accuracy with outstanding computational efficiency.

  14. Extraction and Network Sharing of Forest Vegetation Information based on SVM

    Directory of Open Access Journals (Sweden)

    Zhang Hannv

    2013-05-01

    The support vector machine (SVM) is a new method of data mining, which can deal very well with regression problems (time series analysis), pattern recognition (classification, discriminant analysis) and many other issues. In recent years, SVM has been widely used in computer classification and recognition of remote sensing images. Based on Landsat TM image data, this paper uses an SVM-based classification method to extract the forest cover information of the Dahuanggou tree farm in the Changbai Mountain area and compares it with conventional maximum likelihood classification. The results show that SVM-based extraction of forest information achieves Kappa values of 0.9810, 0.9716 and 0.9753, exceeding the extraction accuracy of the maximum likelihood method (MLC), with a Kappa value of 0.9634; the method has good maneuverability and practicality.

  15. SVM-based prediction of caspase substrate cleavage sites

    Directory of Open Access Journals (Sweden)

    Ranganathan Shoba

    2006-12-01

    Background: Caspases belong to a class of cysteine proteases which function as critical effectors in apoptosis and inflammation by cleaving substrates immediately after unique sites. Prediction of such cleavage sites will complement structural and functional studies on substrates cleavage as well as discovery of new substrates. Recently, different computational methods have been developed to predict the cleavage sites of caspase substrates with varying degrees of success. As the support vector machines (SVM) algorithm has been shown to be useful in several biological classification problems, we have implemented an SVM-based method to investigate its applicability to this domain. Results: A set of unique caspase substrates cleavage sites were obtained from literature and used for evaluating the SVM method. Datasets containing (i) the tetrapeptide cleavage sites, (ii) the tetrapeptide cleavage sites, augmented by two adjacent residues, P1' and P2' amino acids and (iii) the tetrapeptide cleavage sites with ten additional upstream and downstream flanking sequences (where available) were tested. The SVM method achieved an accuracy ranging from 81.25% to 97.92% on independent test sets. The SVM method successfully predicted the cleavage of a novel caspase substrate and its mutants. Conclusion: This study presents an SVM approach for predicting caspase substrate cleavage sites based on the cleavage sites and the downstream and upstream flanking sequences. The method shows an improvement over existing methods and may be useful for predicting hitherto undiscovered cleavage sites.

  16. Computer-aided classification of Alzheimer's disease based on support vector machine with combination of cerebral image features in MRI

    Science.gov (United States)

    Jongkreangkrai, C.; Vichianin, Y.; Tocharoenchai, C.; Arimura, H.; Alzheimer's Disease Neuroimaging Initiative

    2016-03-01

    Several studies have differentiated Alzheimer's disease (AD) using cerebral image features derived from MR brain images. In this study, we were interested in combining hippocampus and amygdala volumes and entorhinal cortex thickness to improve the performance of AD differentiation. Thus, our objective was to investigate the useful features obtained from MRI for classification of AD patients using support vector machine (SVM). T1-weighted MR brain images of 100 AD patients and 100 normal subjects were processed using FreeSurfer software to measure hippocampus and amygdala volumes and entorhinal cortex thicknesses in both brain hemispheres. Relative volumes of hippocampus and amygdala were calculated to correct variation in individual head size. SVM was employed with five combinations of features (H: hippocampus relative volumes, A: amygdala relative volumes, E: entorhinal cortex thicknesses, HA: hippocampus and amygdala relative volumes and ALL: all features). Receiver operating characteristic (ROC) analysis was used to evaluate the method. AUC values of five combinations were 0.8575 (H), 0.8374 (A), 0.8422 (E), 0.8631 (HA) and 0.8906 (ALL). Although “ALL” provided the highest AUC, there were no statistically significant differences among them except for “A” feature. Our results showed that all suggested features may be feasible for computer-aided classification of AD patients.

  17. Studies of Machine Learning Photometric Classification of Supernovae

    Science.gov (United States)

    Macaluso, Joseph Nicholas; Cunningham, John; Kuhlmann, Stephen; Gupta, Ravi; Kovacs, Eve

    2017-01-01

    We studied the use of machine learning for the photometric classification of Type Ia (SNIa) and core collapse (SNcc) supernovae. We used a combination of simulated data for the Dark Energy Survey (DES) and real data from SDSS and chose our metrics to be the sample purity and the efficiency of identifying SNIa supernovae. Our focus was to quantify the effects of varying the training and parameters for random-forest decision-tree algorithms.

  18. Packet Classification using Support Vector Machines with String Kernels

    Directory of Open Access Journals (Sweden)

    Sarthak Munshi

    2016-08-01

    Since the inception of the internet, many methods have been devised to keep untrusted and malicious packets away from a user’s system. Traffic/packet classification can be used as an important tool to detect intrusion in the system. Using machine learning as an efficient statistics-based approach for classifying packets is a novel method in practice today. This paper emphasizes using an advanced string kernel method within a support vector machine to classify packets. There exists a paper related to a similar problem using machine learning [2], but the research mentioned in that paper is not up to date and does not account for modern string kernels, which are much more efficient. My work extends their research by introducing different approaches to classify encrypted/unencrypted traffic/packets.

  19. Advanced binary search pattern for impedance spectra classification for determining the state of charge of a lithium iron phosphate cell using a support vector machine

    Science.gov (United States)

    Jansen, Patrick; Vollnhals, Michael; Renner, Daniel; Vergossen, David; John, Werner; Götze, Jürgen

    2016-09-01

    Further improvements on the novel method for state of charge (SOC) determination of lithium iron phosphate (LFP) cells based on the impedance spectra classification are presented. A Support Vector Machine (SVM) is applied to impedance spectra of an LFP cell, with each impedance spectrum representing a distinct SOC for a predefined temperature. As an SVM is a binary classifier, only the distinction between two SOCs can be computed in one iteration of the algorithm. Therefore a search pattern is necessary. A balanced tree search was implemented with good results. To further improve the SVM method, this paper discusses two new search patterns, namely a linear search and an imbalanced tree search, the latter based on an initial educated guess. All three search patterns were compared under various aspects such as accuracy, efficiency, tolerance of disturbances and temperature dependency. The imbalanced tree search proves to be the most efficient search pattern if the initial guess is within less than ±5 % SOC of the original SOC and exhibits the best tolerance for high disturbances. Linear search improves the rate of exact classifications for almost every temperature. It also improves the robustness against high disturbances and can even detect a certain number of false classifications, which makes this search pattern unique. The downside is a much lower efficiency, as all impedance spectra have to be evaluated, while the tree search patterns only evaluate those on the tree path.
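
    The efficiency trade-off between the balanced tree search and the linear search can be illustrated with a small sketch. It assumes, purely for illustration, that each binary decision answers "is the cell's SOC at least this level?"; the abstract does not specify how the pairwise SVM decisions are organized, so the callable `is_at_least` is a hypothetical stand-in for a trained classifier, and the SOC grid is made up.

```python
def balanced_tree_search(levels, is_at_least):
    """Binary search over sorted SOC levels; returns (level, classifier calls)."""
    lo, hi = 0, len(levels) - 1
    calls = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2  # upper middle, so the loop always shrinks
        calls += 1
        if is_at_least(levels[mid]):
            lo = mid
        else:
            hi = mid - 1
    return levels[lo], calls


def linear_search(levels, is_at_least):
    """Evaluate every level above the lowest; returns (level, classifier calls)."""
    best, calls = levels[0], 0
    for level in levels[1:]:
        calls += 1
        if is_at_least(level):
            best = level
    return best, calls


# Hypothetical SOC grid in percent and a noise-free stand-in classifier.
soc_levels = list(range(0, 101, 10))
true_soc = 70


def oracle(level):
    return true_soc >= level


tree_result = balanced_tree_search(soc_levels, oracle)
linear_result = linear_search(soc_levels, oracle)
print("tree:", tree_result, "linear:", linear_result)
```

    With a noise-free stand-in both searches agree, but the tree needs far fewer decisions. Because the linear variant evaluates every level, an inconsistent sequence of answers could be flagged as a suspect classification, which fits the abstract's remark about detecting false classifications; that check is not implemented in this sketch.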

  20. A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    Directory of Open Access Journals (Sweden)

    Li Zhen

    2008-05-01

    Background: Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. Results: The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence of filter-based feature selection was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection. Filter-based feature selection generally improved performance, most strikingly for LDA. Conclusion: We have developed a novel simulation model to evaluate machine learning methods for the

  1. SVM-based multimodal classification of activities of daily living in Health Smart Homes: sensors, algorithms, and first experimental results.

    Science.gov (United States)

    Fleury, Anthony; Vacher, Michel; Noury, Norbert

    2010-03-01

    By 2050, about one third of the French population will be over 65. Our laboratory's current research focuses on the monitoring of elderly people at home, to detect a loss of autonomy as early as possible. Our aim is to quantify criteria such as the international activities of daily living (ADL) or the French Autonomie Gerontologie Groupes Iso-Ressources (AGGIR) scales, by automatically classifying the different ADL performed by the subject during the day. A Health Smart Home is used for this. Our Health Smart Home includes, in a real flat, infrared presence sensors (location), door contacts (to control the use of some facilities), a temperature and hygrometry sensor in the bathroom, and microphones (sound classification and speech recognition). A wearable kinematic sensor also reports postural transitions (using pattern recognition) and walk periods (frequency analysis). The data collected from the various sensors are then used to classify each temporal frame into one of the ADL that was previously acquired (seven activities: hygiene, toilet use, eating, resting, sleeping, communication, and dressing/undressing). This is done using support vector machines. We performed a 1-h experiment with 13 young and healthy subjects to determine the models of the different activities, and then we tested the classification algorithm (cross validation) with real data.

  2. Microcanonical Annealing and Threshold Accepting for Parameter Determination and Feature Selection of Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Seyyid Ahmed Medjahed

    2016-12-01

    Support vector machine (SVM) is a popular classification technique with many diverse applications. Parameter determination and feature selection significantly influence the classification accuracy rate and the SVM model quality. This paper proposes two novel approaches based on Microcanonical Annealing (MA-SVM) and Threshold Accepting (TA-SVM) to determine the optimal parameter value and the relevant feature subset, without reducing SVM classification accuracy. In order to evaluate the performance of MA-SVM and TA-SVM, several public datasets are employed to compute the classification accuracy rate. The proposed approaches were tested in the context of medical diagnosis. We also tested the approaches on DNA microarray datasets used for cancer diagnosis. The results obtained by the MA-SVM and TA-SVM algorithms are shown to be superior and gave good performance on the DNA microarray data sets, which are characterized by a large number of features. Therefore, the MA-SVM and TA-SVM approaches are well suited for parameter determination and feature selection in SVM.

  3. Study of Protein Fold Recognition Using Optimization Method Extreme Learning Machine for Classification

    Institute of Scientific and Technical Information of China (English)

    张志锋; 范乃梅

    2013-01-01

    With traditional machine learning methods, one may spend a lot of time tuning the optimal parameters when tackling the problem of protein fold recognition. A new optimization method, extreme learning machine for classification (ELMC), is used to recognize protein folds; only a few parameters need to be adjusted to achieve good testing accuracy. Compared to the support vector machine (SVM) and the relevance vector machine (RVM), better generalization performance can be obtained by the extreme learning machine for classification. In terms of training time to find the optimal solution, ELMC is on average 35 times faster than SVM and 12 times faster than RVM.

  4. Evaluating automatically parallelized versions of the support vector machine

    NARCIS (Netherlands)

    Codreanu, Valeriu; Droge, Bob; Williams, David; Yasar, Burhan; Yang, Fo; Liu, Baoquan; Dong, Feng; Surinta, Olarik; Schomaker, Lambertus; Roerdink, Jos; Wiering, Marco

    2014-01-01

    The support vector machine (SVM) is a supervised learning algorithm used for recognizing patterns in data. It is a very popular technique in machine learning and has been successfully used in applications such as image classification, protein classification, and handwriting recognition. However, the

  5. A new expert system for diagnosis of lung cancer: GDA-LS_SVM.

    Science.gov (United States)

    Avci, Engin

    2012-06-01

    Many diseases are very hard to diagnose, and lung cancer is one of them. It begins in the lungs and spreads to other organs of the human body. In this paper, an expert diagnostic system based on Generalized Discriminant Analysis (GDA) and a Least Squares Support Vector Machine (LS-SVM) classifier is presented for the diagnosis of lung cancer. This system is referred to as GDA-LS-SVM in the rest of the paper. The GDA-LS-SVM expert diagnosis system has two stages: (1) feature extraction and feature reduction, and (2) classification. In the first stage, the lung cancer dataset is obtained and its dimension is reduced from 57 features to eight features using the GDA method. In the classification stage, these reduced features are given to the LS-SVM classifier. The lung cancer dataset used in this study was taken from the UCI machine learning database. In the experiments, the GDA-LS-SVM expert system achieved a classification accuracy of about 96.875%.

  6. Learning features for tissue classification with the classification restricted Boltzmann machine

    NARCIS (Netherlands)

    G. van Tulder (Gijs); M. de Bruijne (Marleen)

    2014-01-01

    Performance of automated tissue classification in medical imaging depends on the choice of descriptive features. In this paper, we show how restricted Boltzmann machines (RBMs) can be used to learn features that are especially suited for texture-based tissue

  7. A novel transmission line protection using DOST and SVM

    Directory of Open Access Journals (Sweden)

    M. Jaya Bharata Reddy

    2016-06-01

    This paper proposes a smart fault detection, classification and location (SFDCL) methodology for transmission systems with multi-generators using the discrete orthogonal Stockwell transform (DOST). The methodology is based on synchronized current measurements from remote telemetry units (RTUs) installed at both ends of the transmission line. The energy coefficients extracted with DOST from the transient current signals caused by different types of faults are used for real-time fault detection and classification. A support vector machine (SVM) is deployed to estimate the fault distance from the extracted coefficients. A comparative study establishes the superiority of SVM over other popular computational intelligence methods, such as the adaptive neuro-fuzzy inference system (ANFIS) and the artificial neural network (ANN), for more precise and reliable estimation of fault distance. The results corroborate the effectiveness of the suggested SFDCL algorithm for real-time transmission line fault detection, classification and localization.

  8. Imputation And Classification Of Missing Data Using Least Square Support Vector Machines – A New Approach In Dementia Diagnosis

    Directory of Open Access Journals (Sweden)

    T R Sivapriya

    2012-07-01

    This paper presents a comparison of different data imputation approaches used in filling missing data and proposes a combined approach to accurately estimate missing attribute values in a patient database. The present study suggests a more robust technique that is likely to supply a value closer to the one that is missing, for effective classification and diagnosis. Initially the data is clustered and the z-score method is used to select possible values of an instance with missing attribute values. Then a multiple imputation method using LSSVM (Least Squares Support Vector Machine) is applied to select the most appropriate values for the missing attributes. Five imputed datasets have been used to demonstrate the performance of the proposed method. Experimental results show that our method outperforms conventional methods of multiple imputation and mean substitution. Moreover, the proposed method, CZLSSVM (Clustered Z-score Least Squares Support Vector Machine), has been evaluated in two classification problems for incomplete data. The efficacy of the imputation methods has been evaluated using an LSSVM classifier. Experimental results indicate that classification accuracy increases when CZLSSVM is used to estimate missing attribute values. CZLSSVM outperforms other data imputation approaches such as decision trees, rough sets, artificial neural networks, K-NN (K-Nearest Neighbour) and SVM. Further, CZLSSVM yields 95 per cent accuracy and a better prediction capability than the other methods included and tested in the study.

  9. Comparison Between Wind Power Prediction Models Based on Wavelet Decomposition with Least-Squares Support Vector Machine (LS-SVM) and Artificial Neural Network (ANN)

    Directory of Open Access Journals (Sweden)

    Maria Grazia De Giorgi

    2014-08-01

    A high penetration of wind energy into the electricity market requires a parallel development of efficient wind power forecasting models. Different hybrid forecasting methods were applied to wind power prediction, using historical data and numerical weather predictions (NWP). A comparative study was carried out for the prediction of the power production of a wind farm located in complex terrain. The performances of Least-Squares Support Vector Machine (LS-SVM) with Wavelet Decomposition (WD) were evaluated at different time horizons and compared to hybrid Artificial Neural Network (ANN)-based methods. It is acknowledged that hybrid methods based on LS-SVM with WD mostly outperform other methods. A decomposition of the commonly known root mean square error was beneficial for a better understanding of the origin of the differences between prediction and measurement and for comparing the accuracy of the different models. A sensitivity analysis was also carried out in order to underline the impact that each input had in the network training process for ANN. In the case of ANN with the WD technique, the sensitivity analysis was repeated on each component obtained by the decomposition.

  10. A support vector machine for spectral classification of emission-line galaxies from the Sloan Digital Sky Survey

    Science.gov (United States)

    Shi, Fei; Liu, Yu-Yan; Sun, Guang-Lan; Li, Pei-Yu; Lei, Yu-Ming; Wang, Jian

    2015-10-01

    The emission lines of galaxies originate from massive young stars or supermassive black holes. As a result, spectral classification of emission-line galaxies into star-forming galaxies, active galactic nucleus (AGN) hosts, or composites of both relates closely to the formation and evolution of galaxies. To find an efficient and automatic spectral classification method, especially for large surveys and huge databases, a support vector machine (SVM) supervised learning algorithm is applied to a sample of emission-line galaxies from the Sloan Digital Sky Survey (SDSS) data release 9 (DR9) provided by the Max Planck Institute and the Johns Hopkins University (MPA/JHU). A two-step approach is adopted. (i) The SVM must be trained with a subset of objects that are known to be AGN hosts, composites or star-forming galaxies, treating the strong emission-line flux measurements as input feature vectors in an n-dimensional space, where n is the number of strong emission-line flux ratios. (ii) After training on a sample of emission-line galaxies, the remaining galaxies are automatically classified. In the classification process, we use a 10-fold cross-validation technique. We show that the classification diagrams based on [N II]/Hα versus another emission-line ratio, such as [O III]/Hβ, [Ne III]/[O II], ([O III]λ4959+[O III]λ5007)/[O III]λ4363, [O II]/Hβ, [Ar III]/[O III], [S II]/Hα, and [O I]/Hα, plus colour, allow us to separate unambiguously AGN hosts, composites and star-forming galaxies. Among them, the diagram of [N II]/Hα versus [O III]/Hβ achieved an accuracy of 99 per cent in separating the three classes of objects. The other diagrams above give an accuracy of ~91 per cent.
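
    The 10-fold cross-validation step named in this abstract can be illustrated with a small index splitter. This is a sketch under simple assumptions (contiguous, unshuffled folds; the function name `k_fold_indices` is made up for illustration) and does not reproduce the authors' pipeline.

```python
def k_fold_indices(n_samples, k=10):
    """Return k (train, test) index pairs covering range(n_samples).

    Fold sizes differ by at most one; every sample lands in exactly
    one test fold. In practice the indices are usually shuffled first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(25, k=10)
for train, test in folds:
    # A classifier would be fitted on `train` and scored on `test` here.
    print(len(train), len(test))
```

    The reported accuracy is then typically the mean of the k per-fold scores, so each object is used for testing once and for training k - 1 times.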

  11. Machine-learning methods in the classification of water bodies

    Directory of Open Access Journals (Sweden)

    Sołtysiak Marek

    2016-06-01

    Amphibian species have been considered useful ecological indicators. They are used as indicators of environmental contamination, ecosystem health and habitat quality. Amphibian species are sensitive to changes in the aquatic environment and may therefore form the basis for the classification of water bodies. Water bodies in which there are a large number of amphibian species are especially valuable even if they are located in urban areas. The automation of the classification process allows for a faster evaluation of the presence of amphibian species in the water bodies. Three machine-learning methods (artificial neural networks, decision trees and the k-nearest neighbours algorithm) have been used to classify water bodies in Chorzów, one of 19 cities in the Upper Silesia Agglomeration. In this case, classification is a supervised data mining method consisting of several stages such as building the model, the testing phase and the prediction. Seven natural and anthropogenic features of water bodies (e.g. the type of water body, aquatic plants, the purpose of the water body (destination), position of the water body in relation to any possible buildings, condition of the water body, the degree of littering, the shore type and fishing activities) have been taken into account in the classification. The data set used in this study involved information about 71 different water bodies and 9 amphibian species living in them. The results showed that the best average classification accuracy was obtained with the multilayer perceptron neural network.

  12. Expected energy-based restricted Boltzmann machine for classification.

    Science.gov (United States)

    Elfwing, S; Uchibe, E; Doya, K

    2015-04-01

    In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant that is related to the size of the network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.

  13. Classification of HCV NS5B Polymerase Inhibitors Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Changyuan Yu

    2012-03-01

    Using a support vector machine (SVM), three classification models were built to predict whether a compound is an active or weakly active inhibitor, based on a dataset of 386 hepatitis C virus (HCV) NS5B polymerase NNIs (non-nucleoside analogue inhibitors) fitting into the pocket of the NNI III binding site. For each molecule, global descriptors and 2D and 3D property autocorrelation descriptors were calculated with the program ADRIANA.Code. Three models were developed with combinations of different types of descriptors. Model 2, based on 16 global and 2D autocorrelation descriptors, gave the highest prediction accuracy of 88.24% and an MCC (Matthews correlation coefficient) of 0.789 on the test set. Model 1, based on 13 global descriptors, showed the highest prediction accuracy of 86.25% and an MCC of 0.732 on the external test set (including 80 compounds). Some molecular properties, such as molecular shape descriptors (InertiaZ, InertiaX and Span), number of rotatable bonds (NRotBond), water solubility (LogS), and hydrogen bonding related descriptors, played important roles in the interactions between the ligand and NS5B polymerase.
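
    The Matthews correlation coefficient quoted in this abstract is computed from the four cells of a binary confusion matrix. The sketch below uses the standard definition; the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal is empty, a common convention for
    the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for an 80-compound test set.
mcc = matthews_corrcoef(tp=40, tn=30, fp=5, fn=5)
accuracy = (40 + 30) / 80
print(round(mcc, 3), accuracy)
```

    Unlike plain accuracy, MCC stays near zero for a classifier that only predicts the majority class, which is why it is often reported alongside accuracy for unbalanced active/inactive datasets.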

  14. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine

    Science.gov (United States)

    Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir

    2015-01-01

    For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes cannot adequately address this problem because they lack a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on the incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and of the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Notably, the design of the proposed algorithm effectively reduces the computational complexity. In addition, a detailed analysis is carried out for the case in which new labeled samples appear during the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of the unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment. PMID:26275294

  15. A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine.

    Directory of Open Access Journals (Sweden)

    Fei Gao

    For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes cannot adequately address this problem because they lack a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on the incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and of the data distribution in a changing environment, a "soft-start" approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Notably, the design of the proposed algorithm effectively reduces the computational complexity. In addition, a detailed analysis is carried out for the case in which new labeled samples appear during the learning process. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of the unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment.

  16. Dimension Reduction via Unsupervised Learning Yields Significant Computational Improvements for Support Vector Machine Based Protein Family Classification.

    Energy Technology Data Exchange (ETDEWEB)

    Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Oehmen, Christopher S.

    2009-02-26

    Reducing the dimension of vectors used in training support vector machines (SVMs) results in a proportional speedup in training time. For large-scale problems this can make the difference between tractable and intractable training tasks. However, it is critical that classifiers trained on reduced datasets perform as reliably as their counterparts trained on high-dimensional data. We assessed principal component analysis (PCA) and sequential projection pursuit (SPP) as dimension reduction strategies in the biology application of classifying proteins into well-defined functional ‘families’ (SVM-based protein family classification) by their impact on run-time, sensitivity and selectivity. Homology vectors of 4352 elements were reduced to approximately 2% of the original data size without significantly affecting accuracy using PCA and SPP, while leading to approximately a 28-fold speedup in run-time.

  17. Identification of eggs from different production systems based on hyperspectra and CS-SVM.

    Science.gov (United States)

    Sun, J; Cong, S; Mao, H; Zhou, X; Wu, X; Zhang, X

    2017-01-19

    1. To identify the origin of table eggs more accurately, a method based on hyperspectral imaging technology was studied. 2. The hyperspectral data of 200 samples of intensive and extensive eggs were collected. Standard normalised variables (SNV) combined with Savitzky-Golay (SG) were used to eliminate noise, then stepwise regression (SWR) was used for feature selection. Grid search algorithm (GS), genetic search algorithm (GA), particle swarm optimisation algorithm (PSO) and cuckoo search algorithm (CS) were applied to the support vector machine (SVM) to establish an SVM identification model with the optimal parameters. The full spectrum data and the data after feature selection were the input of the model while egg category was the output. 3. The SWR-CS-SVM model performed better than the other models, including SWR-GS-SVM, SWR-GA-SVM, SWR-PSO-SVM and others based on full spectral data. The training and test classification accuracies of the SWR-CS-SVM model were 99.3% and 96%, respectively. 4. SWR-CS-SVM proved effective for identifying egg varieties and could also be useful for the non-destructive identification of other types of egg.
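
    The SNV preprocessing step named in this abstract is commonly implemented as a per-spectrum centring and scaling. The sketch below assumes that common definition (sample standard deviation with n - 1 in the denominator); the reflectance values are invented for illustration and the paper's exact settings are not known from the abstract.

```python
import statistics


def snv(spectrum):
    """Scale one spectrum to zero mean and unit sample standard deviation.

    Each spectrum is corrected independently, which removes additive
    offsets and multiplicative scatter effects between samples.
    """
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    if std == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(value - mean) / std for value in spectrum]


raw = [0.10, 0.20, 0.40, 0.30]  # hypothetical reflectance values
corrected = snv(raw)
print(corrected)
```

    Smoothing such as Savitzky-Golay filtering and the subsequent feature selection would then operate on the corrected spectra.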

  18. Characterization and classification of tumor lesions using computerized fractal-based texture analysis and support vector machines in digital mammograms.

    Science.gov (United States)

    Guo, Qi; Shao, Jiaqing; Ruiz, Virginie F

    2009-01-01

    This paper presents a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. The purpose of this study is to explore the use of fractal and lacunarity analysis for the characterization and classification of both tumor lesions and normal breast parenchyma in mammography. We conducted comparative evaluations of five popular fractal dimension estimation methods for the characterization of the texture of mass lesions and architectural distortion. We applied the concept of lacunarity to the description of the spatial distribution of the pixel intensities in mammographic images. These methods were tested with a set of 57 breast masses and 60 normal breast parenchyma (dataset1), and with another set of 19 architectural distortions and 41 normal breast parenchyma (dataset2). Support vector machines (SVM) were used as a pattern classification method for tumor classification. Experimental results showed that the fractal dimension of regions of interest (ROIs) depicting mass lesions and architectural distortion was statistically significantly lower than that of normal breast parenchyma for all five methods. Receiver operating characteristic (ROC) analysis showed that the fractional Brownian motion (FBM) method generated the highest area under the ROC curve (A_z = 0.839 for dataset1 and 0.828 for dataset2) among the five methods for both datasets. Lacunarity analysis showed that the ROIs depicting mass lesions and architectural distortion had higher lacunarities than those of ROIs depicting normal breast parenchyma. The combination of FBM fractal dimension and lacunarity yielded higher A_z values (0.903 and 0.875, respectively) than those based on a single feature alone for both datasets. The application of the SVM improved the performance of the fractal-based features in differentiating tumor lesions from normal breast parenchyma by generating higher A_z values.
FBM texture model is the
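
    The area under the ROC curve reported in this abstract can be computed directly from classifier scores, because it equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The sketch below uses that rank-based identity; the scores are hypothetical and the function name `roc_auc` is chosen for illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical decision scores for lesion (positive) and normal ROIs.
auc = roc_auc([0.9, 0.4], [0.5, 0.1])
print(auc)  # 3 of the 4 pairs are ordered correctly
```

    When a feature is lower for the positive class, as the fractal dimension is for lesions here, the score is simply negated before the comparison.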

  19. A study of the effectiveness of machine learning methods for classification of clinical interview fragments into a large number of categories.

    Science.gov (United States)

    Hasan, Mehedi; Kotov, Alexander; Idalski Carcone, April; Dong, Ming; Naar, Sylvie; Brogan Hartlieb, Kathryn

    2016-08-01

    This study examines the effectiveness of state-of-the-art supervised machine learning methods in conjunction with different feature types for the task of automatic annotation of fragments of clinical text based on codebooks with a large number of categories. We used a collection of motivational interview transcripts consisting of 11,353 utterances, which were manually annotated by two human coders as the gold standard, and experimented with state-of-the-art classifiers, including Naïve Bayes, J48 Decision Tree, Support Vector Machine (SVM), Random Forest (RF), AdaBoost, DiscLDA, Conditional Random Fields (CRF) and Convolutional Neural Network (CNN), in conjunction with lexical, contextual (label of the previous utterance) and semantic (distribution of words in the utterance across the Linguistic Inquiry and Word Count dictionaries) features. We found that, when the number of classes is large, the performance of CNN and CRF is inferior to that of SVM. When only lexical features were used, SVM annotated interview transcripts with the highest classification accuracy among all classifiers: 70.8%, 61% and 53.7% for the codebooks consisting of 17, 20 and 41 codes, respectively. Using contextual features, semantic features, and their combination in addition to lexical ones improved the accuracy of SVM for annotation of utterances with the 17-class codebook to 71.5%, 74.2%, and 75.1%, respectively. Our results demonstrate the potential of using machine learning methods in conjunction with lexical, semantic and contextual features for automatic annotation of clinical interview transcripts with near-human accuracy.

  20. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    OpenAIRE

    C. V. Subbulakshmi; Deepa, S. N.

    2015-01-01

    Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted several researchers around the world. Most classifiers are designed to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on the machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learni...

  1. Active learning algorithm for SVM based on QBC

    Institute of Scientific and Technical Information of China (English)

    徐海龙; 别晓峰; 冯卉; 吴天爱

    2015-01-01

    To address the problems that large numbers of labeled samples are hard to acquire and that datasets are class-unbalanced during support vector machine (SVM) training, an active learning algorithm for SVM based on query by committee (QBC), named QBC-ASVM, is proposed. It combines an improved QBC active learning method with the weighted SVM. In this method, QBC active learning is used to select for labeling the samples that are most valuable to the current SVM classifier, and the weighted SVM is used to reduce the impact of the unbalanced dataset on SVM active learning. The experimental results show that, compared with the passive SVM, the proposed approach considerably reduces the number of labeled samples and the labeling cost while maintaining the classification accuracy of the passive SVM; it also improves generalization performance and speeds up SVM training.
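
    The query-by-committee selection step can be sketched with vote entropy, a common disagreement measure. The abstract does not say which measure the improved QBC uses, so vote entropy is an assumption here, and the committee votes below are invented; in practice each vote would come from one SVM trained on a different resample of the labeled data.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the label distribution produced by the committee."""
    total = len(votes)
    return -sum((count / total) * math.log(count / total)
                for count in Counter(votes).values())


def select_query(committee_votes):
    """Index of the unlabeled sample the committee disagrees on most.

    `committee_votes[i]` holds one predicted label per committee
    member for unlabeled sample i.
    """
    return max(range(len(committee_votes)),
               key=lambda i: vote_entropy(committee_votes[i]))


# Hypothetical votes of a four-member committee on three samples.
votes = [
    [1, 1, 1, 1],    # full agreement, nothing to learn
    [1, 1, 1, -1],   # mild disagreement
    [1, 1, -1, -1],  # maximal disagreement for two classes
]
query = select_query(votes)
print(query)
```

    The selected sample is sent to the annotator, added to the labeled set, and the committee is retrained; class weights in the SVM would then compensate for the imbalance among the labels collected so far.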

  2. Comparison of SVM and ANFIS for Snore Related Sounds Classification by Using the Largest Lyapunov Exponent and Entropy

    Science.gov (United States)

    Ankışhan, Haydar; Yılmaz, Derya

    2013-01-01

    Snoring, which may be a sign of many diseases, is an important indicator, especially for sleep disorders. In recent years, many studies have been performed on snore related sounds (SRSs) because they produce useful results for the detection of sleep apnea/hypopnea syndrome (SAHS). The first important step of these studies is the detection of snores in SRSs by using different time and frequency domain features. The SRSs have a complex nature that originates from several physiological and physical conditions. The nonlinear characteristics of SRSs can be examined with chaos theory methods, which have recently been widely used to evaluate biomedical signals and systems. The aim of this study is to classify the SRSs as snore/breathing/silence by using the largest Lyapunov exponent (LLE) and entropy with multiclass support vector machines (SVMs) and the adaptive network fuzzy inference system (ANFIS). Two different experiments were performed for different training and test data sets. Experimental results show that the multiclass SVMs produce better classification results than ANFIS with the nonlinear quantities used. Additionally, these nonlinear features carry meaningful information for classifying SRSs and can be used for the diagnosis of sleep disorders such as SAHS. PMID:24194786

  3. Determination of fetal state from cardiotocogram using LS-SVM with particle swarm optimization and binary decision tree.

    Science.gov (United States)

    Yılmaz, Ersen; Kılıkçıer, Cağlar

    2013-01-01

    We use a least squares support vector machine (LS-SVM) utilizing a binary decision tree for classification of cardiotocograms to determine the fetal state. The parameters of the LS-SVM are optimized by particle swarm optimization. The robustness of the method is examined by running 10-fold cross-validation. The performance of the method is evaluated in terms of overall classification accuracy. Additionally, receiver operating characteristic analysis and cobweb representation are presented in order to analyze and visualize the performance of the method. Experimental results demonstrate that the proposed method achieves a remarkable classification accuracy rate of 91.62%.

  4. Performance Comparison of SVM and ANN for Handwritten Devnagari Character Recognition

    Directory of Open Access Journals (Sweden)

    Sandhya Arora

    2010-05-01

    Classification methods based on learning from examples have been widely applied to character recognition since the 1990s and have brought significant improvements in recognition accuracy. This class of methods includes statistical methods, artificial neural networks, support vector machines (SVM), multiple classifier combination, etc. In this paper, we discuss the characteristics of some classification methods that have been successfully applied to handwritten Devnagari character recognition, and the results of SVM and ANN classification applied to handwritten Devnagari characters. After preprocessing the character image, we extracted shadow features, chain code histogram features, view based features and longest run features. These features are then fed to a neural classifier and to a support vector machine for classification. For the neural classifier, we explored three ways of combining the decisions of four MLPs designed for four different features.

  5. Performance Comparison of SVM and ANN for Handwritten Devnagari Character Recognition

    CERN Document Server

    Arora, Sandhya; Nasipuri, Mita; Malik, L; Kundu, M; Basu, D K

    2010-01-01

    Classification methods based on learning from examples have been widely applied to character recognition since the 1990s and have brought significant improvements in recognition accuracy. This class of methods includes statistical methods, artificial neural networks, support vector machines (SVM), multiple classifier combination, etc. In this paper, we discuss the characteristics of some classification methods that have been successfully applied to handwritten Devnagari character recognition, and the results of SVM and ANN classification applied to handwritten Devnagari characters. After preprocessing the character image, we extracted shadow features, chain code histogram features, view based features and longest run features. These features are then fed to a neural classifier and to a support vector machine for classification. For the neural classifier, we explored three ways of combining the decisions of four MLPs designed for four different features.

  6. Least squares support vector machine classifiers with guaranteed classification performance

    Institute of Scientific and Technical Information of China (English)

    徐金宝; 廖雷; 业巧林

    2009-01-01

    The support vector machine (SVM) is one of the focuses of research and application in classification. A new least-squares-based algorithm with guaranteed classification performance (VSLSVM) is presented, which introduces the within-class scatter into the design of least squares support vector machines (LS-SVM). The algorithm reformulates the primal LS-SVM problem with the optimality criterion min w'Mw, where w is the weight vector of the primal LS-SVM problem and M is the within-class scatter matrix, and can thereby obtain better classification accuracy. The method only requires solving a linear system instead of a quadratic programming problem. Experiments compare the proposed method with SVM and with Suykens' approach and verify its effectiveness.
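
    For context on the remark about solving a linear system, the standard LS-SVM classifier of Suykens and Vandewalle can be written out as below. This is the textbook formulation, given as background only; the abstract does not state the exact system solved by VSLSVM, and the symbols $\gamma$, $e_i$, $\alpha$, $\varphi$ and $K$ are the usual LS-SVM notation rather than notation taken from this record.

```latex
% Primal problem: equality constraints and squared errors replace the
% inequality constraints and hinge loss of the standard SVM.
\min_{w,\,b,\,e}\; \frac{1}{2} w^{\top} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2}
\quad \text{s.t.} \quad y_i \left( w^{\top} \varphi(x_i) + b \right) = 1 - e_i,
\qquad i = 1, \dots, N .

% Eliminating w and e from the optimality conditions leaves one linear
% system in the bias b and the multipliers \alpha:
\begin{bmatrix} 0 & y^{\top} \\ y & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{1}_N \end{bmatrix},
\qquad
\Omega_{ij} = y_i \, y_j \, K(x_i, x_j) .
```

    The criterion described in the abstract replaces $w^{\top} w$ with $w^{\top} M w$, which changes the regularizer but keeps the equality-constrained structure that makes a linear solve sufficient.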

  7. Aerial LiDAR Data Classification Using Weighted Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    吴军; 刘荣; 郭宁; 刘丽娟

    2013-01-01

    This paper presents our research on classifying scattered 3D aerial LiDAR height data into ground, vegetation (trees) and man-made objects (buildings) using an improved support vector machine (SVM) algorithm. First, the basic theory of SVM is outlined and, because features differ in their contribution to identifying a certain class or several classes simultaneously, a weighted support vector machine (W-SVM) technique is developed to maximize the "recognition" capacity of SVM features in classifying scattered 3D LiDAR height data. Second, we prove that implementing W-SVM is equivalent to normalizing the features and multiplying each by a weight that indicates the feature's contribution to a certain class or to the multi-class problem as a whole. The weight calculation for each feature is discussed as well. Third, based on the W-SVM technique, a 1AAA1 solution to multi-class classification is proposed by integrating the "one against one" and "one against all" solutions. Finally, an experiment classifying LiDAR data with the presented technique shows an encouraging improvement in classification accuracy compared with the traditional SVM technique. Conclusions are given as well.
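
    The stated equivalence (feature normalization followed by multiplication with a per-feature weight) can be sketched as a preprocessing step. Min-max scaling and the example weights are assumptions; the record does not say which normalization or weight formula the authors use, and `weighted_normalize` is a hypothetical name.

```python
def weighted_normalize(rows, weights):
    """Min-max scale each feature column to [0, 1], then multiply by its weight.

    rows    : list of equal-length feature vectors
    weights : one hypothetical importance weight per feature
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information and maps to 0.
            w * ((v - lo) / (hi - lo) if hi > lo else 0.0)
            for v, lo, hi, w in zip(row, lows, highs, weights)
        ])
    return scaled
```

    Scaling a feature by a weight below 1 shrinks its contribution to any distance-based or inner-product kernel, which is how the weight changes the influence of that feature on the SVM decision.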

  8. Predicting the Metabolic Sites by Flavin-Containing Monooxygenase on Drug Molecules Using SVM Classification on Computed Quantum Mechanics and Circular Fingerprints Molecular Descriptors

    Science.gov (United States)

    Fu, Chien-wei; Lin, Thy-Hou

    2017-01-01

    As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO) also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM) on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D) are computed and classified using the support vector machine (SVM) for predicting potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA− represents the nucleophilicity of the central atom A, and the attributes from circular fingerprints account for the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85, and they are equally divided into training and test sets, each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction since the available C-oxidation data were scarce. In the training process, the LibSVM package in WEKA and 10-fold cross-validation are employed. The prediction performance on the test set, evaluated by accuracy, Matthews correlation coefficient and area under the ROC curve, is 0.829, 0.659, and 0.877, respectively. This work shows that the SVM model can accurately predict the potential SOMs of drug molecules that are metabolizable by the FMO enzymes. PMID:28072829
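
    For readers unfamiliar with two of the reported metrics, the sketch below computes accuracy and the Matthews correlation coefficient (MCC) from the four cells of a binary confusion matrix. The function names are illustrative, and no counts from the study are assumed.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC in [-1, 1]; 0 corresponds to chance-level prediction."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

    MCC uses all four cells, so it remains informative when the two classes are unbalanced, a situation in which accuracy alone can look deceptively high.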

  9. Investigating the use of support vector machine classification on structural brain images of preterm-born teenagers as a biological marker.

    Directory of Open Access Journals (Sweden)

    Carlton Chu

    Full Text Available Preterm birth has been shown to induce an altered developmental trajectory of brain structure and function. With the aid of support vector machine (SVM) classification methods, we aimed to investigate whether MRI data, collected in adolescence, could be used to predict whether an individual had been born preterm or at term. To this end we collected T1-weighted anatomical MRI data from 143 individuals (69 controls, mean age 14.6y). The inclusion criteria for those born preterm were birth weight ≤ 1500g and gestational age < 37w. A linear SVM was trained on the grey matter segment of MR images in two different ways. First, all the individuals were used for training and classification was performed by the leave-one-out method, yielding 93% correct classification (sensitivity = 0.905, specificity = 0.942). Separately, a random half of the available data was used for training twice, and each time the other, unseen half of the data was classified, resulting in 86% and 91% accurate classifications. Both gestational age (R = -0.24, p < 0.04) and birth weight (R = -0.51, p < 0.001) correlated with the distance to the decision boundary within the group of individuals born preterm. Statistically significant correlations were also found between IQ (R = -0.30, p < 0.001) and the distance to the decision boundary. Those born small for gestational age did not form a separate subgroup in these analyses. The high rate of correct classification by the SVM motivates further investigation. The long-term goal is to automatically and non-invasively predict the outcome of preterm-born individuals on an individual basis using as early a scan as possible.
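
    The leave-one-out protocol and the sensitivity and specificity figures it produces can be sketched as follows. To stay self-contained, a nearest-centroid classifier stands in for the linear SVM of the study; all function names are hypothetical, and the routine assumes both classes are present in the data.

```python
def nearest_centroid_fit(xs, ys):
    """Stand-in learner (not an SVM): store one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_centroid_predict(model, x):
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(model, key=lambda label: sqdist(model[label]))


def leave_one_out(xs, ys, fit, predict, positive=1):
    """Hold out each sample once, train on the rest, and tally the outcomes."""
    tp = tn = fp = fn = 0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted_positive = predict(model, xs[i]) == positive
        if ys[i] == positive:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return {
        "accuracy": (tp + tn) / len(xs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

    Any learner exposing a `fit`/`predict` pair with the same calling convention can be passed in place of the stand-in.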

  10. Stellar classification from single-band imaging using machine learning

    CERN Document Server

    Kuntzer, T; Courbin, F

    2016-01-01

    Information on the spectral types of stars is of great interest in view of the exploitation of space-based imaging surveys. In this article, we investigate the classification of stars into spectral types using only the shape of their diffraction pattern in a single broad-band image. We propose a supervised machine learning approach to this endeavour, based on principal component analysis (PCA) for dimensionality reduction, followed by artificial neural networks (ANNs) estimating the spectral type. Our analysis is performed with image simulations mimicking the Hubble Space Telescope (HST) Advanced Camera for Surveys (ACS) in the F606W and F814W bands, as well as the Euclid VIS imager. We first demonstrate this classification in a simple context, assuming perfect knowledge of the point spread function (PSF) model and the possibility of accurately generating mock training data for the machine learning. We then analyse its performance in a fully data-driven situation, in which the training would be performed with...

  11. Classification of Phishing Email Using Random Forest Machine Learning Technique

    Directory of Open Access Journals (Sweden)

    Andronicus A. Akinyelu

    2014-01-01

    Full Text Available Phishing is one of the major challenges faced by the world of e-commerce today. Owing to phishing attacks, billions of dollars have been lost by many companies and individuals. In 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to increase, and more efficient phishing detection techniques are therefore required to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in the classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) were extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.

  12. Clifford support vector machines for classification, regression, and recurrence.

    Science.gov (United States)

    Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy

    2010-11-01

    This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.

  13. NESVM: a Fast Gradient Method for Support Vector Machines

    CERN Document Server

    Zhou, Tianyi; Wu, Xindong

    2010-01-01

    Support vector machines (SVMs) are invaluable tools for many practical applications in artificial intelligence, e.g., classification and event recognition. However, popular SVM solvers are not sufficiently efficient for applications with a large number of samples as well as a large number of features. In this paper, we therefore present NESVM, a fast gradient SVM solver that can optimize various SVM models, e.g., classical SVM, linear programming SVM and least square SVM. Compared against SVM-Perf \cite{SVM_Perf}\cite{PerfML} (whose convergence rate in solving the dual SVM is upper bounded by $\mathcal O(1/\sqrt{k})$, where $k$ is the number of iterations) and Pegasos \cite{Pegasos} (an online SVM that converges at rate $\mathcal O(1/k)$ for the primal SVM), NESVM achieves the optimal convergence rate of $\mathcal O(1/k^{2})$ and a linear time complexity. In particular, NESVM smooths the non-differentiable hinge loss and $\ell_1$-norm in the primal SVM. Then the optimal gradient method without any line search is ado...
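
    To make "smoothing the non-differentiable hinge loss" concrete, the sketch below applies Nesterov-style smoothing to the hinge loss max(0, 1 - z), where z is the label times the classifier score. Whether this matches NESVM's exact formulation is an assumption; the smoothing parameter `mu` and the function names are illustrative.

```python
def hinge(z):
    """Standard hinge loss; z = y * f(x). Not differentiable at z = 1."""
    return max(0.0, 1.0 - z)


def smoothed_hinge(z, mu):
    """Smoothed hinge: the maximum over u in [0, 1] of u*(1 - z) - (mu/2)*u**2.

    The maximizer is u* = clip((1 - z) / mu, 0, 1), which gives
      0                   for z >= 1
      (1 - z)**2 / (2*mu) for 1 - mu < z < 1
      (1 - z) - mu/2      for z <= 1 - mu
    """
    u = min(1.0, max(0.0, (1.0 - z) / mu))
    return u * (1.0 - z) - 0.5 * mu * u * u


def smoothed_hinge_grad(z, mu):
    """Derivative with respect to z; equals -u* and is continuous in z."""
    return -min(1.0, max(0.0, (1.0 - z) / mu))
```

    The smoothed loss stays within mu/2 of the original hinge loss everywhere, and its gradient is Lipschitz continuous with constant 1/mu, which is the property accelerated gradient methods rely on to reach an $\mathcal O(1/k^{2})$ rate on the smoothed objective.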

  14. Fault diagnosis of a mine hoist using PCA and SVM techniques

    Institute of Scientific and Technical Information of China (English)

    CHANG Yan-wei; WANG Yao-cai; LIU Tao; WANG Zhi-jie

    2008-01-01

    A new method based on principal component analysis (PCA) and support vector machines (SVMs) is proposed for fault diagnosis of mine hoists. PCA is used to extract the principal features associated with the gearbox. Then, with the irrelevant gearbox variables removed, the remaining gearbox, hydraulic system and wire rope parameters are used as input to a multi-class SVM. The SVM is first trained using the one class-based multi-class optimization algorithm and is then applied to fault identification. A comparison of various methods showed that the PCA-SVM method successfully removed redundancy and so mitigated the curse of dimensionality. The results also show that the SVM with the RBF kernel function had the best classification properties.

  15. Human Walking Pattern Recognition Based on KPCA and SVM with Ground Reflex Pressure Signal

    Directory of Open Access Journals (Sweden)

    Zhaoqin Peng

    2013-01-01

    Full Text Available Algorithms based on the ground reflex pressure (GRF) signal obtained from a pair of sensing shoes for human walking pattern recognition were investigated. Dimensionality reduction algorithms based on principal component analysis (PCA) and kernel principal component analysis (KPCA) for walking pattern data compression were studied in order to obtain higher recognition speed. Classifiers based on the support vector machine (SVM), SVM-PCA, and SVM-KPCA were designed, and the classification performances of these three kinds of algorithms were compared using data collected from a person wearing the sensing shoes. Experimental results showed that the algorithm fusing SVM and KPCA had better recognition performance than the other two methods. Experimental outcomes also confirmed that the sensing shoes developed in this paper can be employed for automatically recognizing human walking patterns in unrestricted environments, which demonstrates their potential application in the control of exoskeleton robots.

  16. Approximate entropy and support vector machines for electroencephalogram signal classification

    Institute of Scientific and Technical Information of China (English)

    Zhen Zhang; Yi Zhou; Ziyi Chen; Xianghua Tian; Shouhong Du; Ruimei Huang

    2013-01-01

    The automatic detection and identification of electroencephalogram waves play an important role in the prediction, diagnosis and treatment of epileptic seizures. In this study, a nonlinear dynamics index (approximate entropy) and a support vector machine with strong generalization ability were applied to classify electroencephalogram signals at epileptic interictal and ictal periods. Our aim was to verify whether approximate entropy can be effectively applied to the automatic real-time detection of epilepsy in the electroencephalogram, and to explore the generalization ability of a classifier trained using a nonlinear dynamics index. Four patients presenting with partial epileptic seizures were included in this study. They were all diagnosed with neocortex localized epilepsy, and epileptic foci were clearly observed by electroencephalogram. The electroencephalogram data from the four patients were segmented, and the characteristic value of each segment, that is, the approximate entropy, was extracted. The support vector machine classifier was constructed with the approximate entropy extracted from one epileptic case, and then electroencephalogram waves of the other three cases were classified, reaching a 93.33% accuracy rate. Our findings suggest that the use of approximate entropy allows the automatic real-time detection of electroencephalogram data in epileptic cases. The combination of approximate entropy and support vector machines shows good generalization ability for the classification of electroencephalogram signals for epilepsy.
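
    Approximate entropy, the feature extracted from each segment, can be computed with a short routine. The sketch follows the usual definition ApEn(m, r) = Phi_m(r) - Phi_{m+1}(r) with self-matches counted; the embedding dimension `m` and tolerance `r` used in the study are not stated in the record, so any values passed in are assumptions.

```python
import math


def approximate_entropy(series, m, r):
    """Approximate entropy of a 1-D sequence.

    m : embedding dimension (template length)
    r : absolute tolerance on the Chebyshev distance between templates
    """
    def phi(dim):
        n = len(series) - dim + 1
        templates = [series[i:i + dim] for i in range(n)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r of `a`, including `a` itself,
            # so the logarithm below is always defined.
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / n)
        return total / n

    return phi(m) - phi(m + 1)
```

    Regular signals give values near zero and irregular signals give larger values, which is what makes the index usable as a classifier input. The routine is quadratic in the segment length, so short segments keep it practical.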

  17. Sentiment Classification for Chinese Reviews Based on SVM

    Institute of Scientific and Technical Information of China (English)

    连凯

    2012-01-01

    Sentiment classification is a technology of great practical value. It can label the sentiment orientation of the vast and complex information on the Web and give users a concise summary, thereby helping people make decisions. However, little work has been done on sentiment classification for Chinese. This paper presents a sentiment classification method for Chinese reviews based on SVM machine learning, introduces a sentence subjectivity analysis method based on the 2-POS model, and uses SVM to classify reviews as positive or negative. Experiments show that the method can effectively determine the sentiment orientation of reviews.

  18. Research on the Multi-class Classification Problem of Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    肖晓; 张敏

    2014-01-01

    The support vector machine (SVM) is a typical two-class classification method, and how to extend it to multi-class classification is an increasingly active research topic. Experiments on standard datasets were carried out on existing SVM multi-class classification methods, namely the one-against-one method, the one-against-rest method and the directed acyclic graph, comparing their speed and accuracy and showing their respective merits and defects. The results indicate that the directed acyclic graph is simple to apply and well suited to multi-class classification of large-scale data, with ideal training speed and accuracy, which has a certain reference value.
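
    The difference between one-against-one voting and directed-acyclic-graph (DAG) evaluation can be sketched independently of the underlying binary learner. Here `decide(a, b, x)` stands for any trained pairwise classifier that returns `a` or `b`; the toy `nearest_center_decide` and the `CENTERS` table are hypothetical stand-ins, not SVMs.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, decide, x):
    """Evaluate all k*(k-1)/2 pairwise classifiers and take a majority vote."""
    votes = Counter(decide(a, b, x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def dag_predict(classes, decide, x):
    """DAG evaluation: each pairwise test eliminates one class (k-1 tests)."""
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = decide(a, b, x)
        remaining.remove(b if winner == a else a)
    return remaining[0]


# Toy pairwise "classifier": pick whichever class center is closer to x.
CENTERS = {"A": 0.0, "B": 5.0, "C": 10.0, "D": 15.0}


def nearest_center_decide(a, b, x):
    return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b
```

    Both strategies train the same k(k-1)/2 pairwise models. The DAG only reduces the work at prediction time, which matters most when the number of classes or the number of samples to classify is large.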

  19. Stellar classification from single-band imaging using machine learning

    Science.gov (United States)

    Kuntzer, T.; Tewes, M.; Courbin, F.

    2016-06-01

    Information on the spectral types of stars is of great interest in view of the exploitation of space-based imaging surveys. In this article, we investigate the classification of stars into spectral types using only the shape of their diffraction pattern in a single broad-band image. We propose a supervised machine learning approach to this endeavour, based on principal component analysis (PCA) for dimensionality reduction, followed by artificial neural networks (ANNs) estimating the spectral type. Our analysis is performed with image simulations mimicking the Hubble Space Telescope (HST) Advanced Camera for Surveys (ACS) in the F606W and F814W bands, as well as the Euclid VIS imager. We first demonstrate this classification in a simple context, assuming perfect knowledge of the point spread function (PSF) model and the possibility of accurately generating mock training data for the machine learning. We then analyse its performance in a fully data-driven situation, in which the training would be performed with a limited subset of bright stars from a survey, and an unknown PSF with spatial variations across the detector. We use simulations of main-sequence stars with flat distributions in spectral type and in signal-to-noise ratio, and classify these stars into 13 spectral subclasses, from O5 to M5. Under these conditions, the algorithm achieves a high success rate both for Euclid and HST images, with typical errors of half a spectral class. Although more detailed simulations would be needed to assess the performance of the algorithm on a specific survey, this shows that stellar classification from single-band images is well possible.

  20. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    Directory of Open Access Journals (Sweden)

    C. V. Subbulakshmi

    2015-01-01

    Full Text Available Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted researchers around the world. Most classifiers are designed to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on the machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learning capability of the particle swarm optimization (PSO) algorithm with the extreme learning machine (ELM) classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN), proved to be an excellent classifier with a large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is evaluated on five benchmark datasets of the UCI Machine Learning Repository for medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance compared to the results of other classifiers.

  1. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier.

    Science.gov (United States)

    Subbulakshmi, C V; Deepa, S N

    2015-01-01

    Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted researchers around the world. Most classifiers are designed to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on the machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learning capability of the particle swarm optimization (PSO) algorithm with the extreme learning machine (ELM) classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN), proved to be an excellent classifier with a large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is evaluated on five benchmark datasets of the UCI Machine Learning Repository for medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance compared to the results of other classifiers.

  2. Protein sequence classification with improved extreme learning machine algorithms.

    Science.gov (United States)

    Cao, Jiuwen; Xiong, Lianglin

    2014-01-01

    Precisely classifying a protein sequence from a large biological protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods, which compare the unseen sequence with all identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single hidden layer feedforward neural networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimal pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensemble members. For each ensemble, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms.
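
    The majority-voting step that produces the final category index can be sketched in a few lines. Each ensemble member is modeled as a plain callable returning a label; this interface and the names `majority_vote` and `ensemble_predict` are assumptions, and no ELM training is shown.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label that occurs most often among the members' outputs.

    Ties are resolved by Counter.most_common, which keeps the label
    encountered first among those with equal counts.
    """
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(members, x):
    """Query every ensemble member on input x and combine the answers by voting."""
    return majority_vote([member(x) for member in members])
```

    Because the members here differ only in their random hidden-layer parameters, voting mainly averages out the variance introduced by that randomness.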

  3. Internet Traffic Classification for Educational Institutions Using Machine Learning

    Directory of Open Access Journals (Sweden)

    Jaspreet Kaur

    2012-07-01

    Full Text Available In recent times, machine learning algorithms have been used for internet traffic classification. The vast number of websites on the internet can be classified into different categories in different ways. In educational institutions, these websites can be classified into two categories: educational websites and non-educational websites. Educational websites are used to acquire knowledge and to explore educational topics, while non-educational websites are used for entertainment and to keep in touch with people. When these non-educational websites are blocked, students use proxy websites to unblock them. Therefore, for optimum use of network resources in educational institutes, the use of non-educational and proxy websites should be banned. In this paper, we use five ML classifiers (Naïve Bayes, RBF, C4.5, MLP and Bayes Net) to classify educational and non-educational websites. Results show that Bayes Net gives the best performance on both the full-feature and reduced-feature data sets for the intended classification of internet traffic, in terms of classification accuracy, recall and precision, compared with the other classifiers.

  4. Mechanical Fault Diagnosis Using Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    LI Ling-jun; ZHANG Zhou-suo; HE Zheng-jia

    2003-01-01

    The Support Vector Machine (SVM) is a machine learning algorithm based on Statistical Learning Theory (SLT) that can achieve good classification results even with few learning samples. SVM represents a new approach to pattern classification and has been shown to be particularly successful in many fields such as image identification and face recognition. It also provides a new method for developing intelligent fault diagnosis. This paper presents an SVM-based approach for fault diagnosis of rolling bearings. Experiments with vibration signals of bearings are conducted. The vibration signals acquired from the bearings are used directly in the calculation, without a feature-extraction preprocessing step. Compared with methods based on Artificial Neural Networks (ANN), the SVM-based method has desirable advantages. It is applicable to on-line diagnosis of mechanical systems.

  5. A one-layer recurrent neural network for support vector machine learning.

    Science.gov (United States)

    Xia, Youshen; Wang, Jun

    2004-04-01

    This paper presents a one-layer recurrent neural network for support vector machine (SVM) learning in pattern classification and regression. The SVM learning problem is first converted into an equivalent formulation, and then a one-layer recurrent neural network for SVM learning is proposed. The proposed neural network is guaranteed to obtain the optimal solution of support vector classification and regression. Compared with the existing two-layer neural network for the SVM classification, the proposed neural network has a low complexity for implementation. Moreover, the proposed neural network can converge exponentially to the optimal solution of SVM learning. The rate of the exponential convergence can be made arbitrarily high by simply turning up a scaling parameter. Simulation examples based on benchmark problems are discussed to show the good performance of the proposed neural network for SVM learning.

  6. Classification of Cytochrome P450 1A2 Inhibitors and Non-Inhibitors by Machine Learning Techniques

    DEFF Research Database (Denmark)

    Vasanthanathan, Poongavanam; Taboureau, Olivier; Oostenbrink, Chris

    2009-01-01

    of CYP1A2 inhibitors and non-inhibitors. Training and test sets consisted of about 400 and 7000 compounds, respectively. Various machine learning techniques, like binary QSAR, support vector machine (SVM), random forest, k-nearest neighbors (kNN), and decision tree methods were used to develop...

  7. Comparisons of likelihood and machine learning methods of individual classification

    Science.gov (United States)

    Guinand, B.; Topchy, A.; Page, K.S.; Burnham-Curtis, M. K.; Punch, W.F.; Scribner, K.T.

    2002-01-01

    Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin (“assignment tests”). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high FST), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0–2.8% lower error rates). The performance of each machine learning classifier improved relative to likelihood estimators for empirical data sets, suggesting an ability to “learn” and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks. In recent years, characterization of highly polymorphic molecular markers such as mini- and microsatellites and development of novel methods of analysis have enabled researchers to extend investigations of ecological and evolutionary processes below the population level to the level of

  8. A machine learning framework for auto classification of imaging system exams in hospital setting for utilization optimization.

    Science.gov (United States)

    Patil, Meru A; Patil, Ravindra B; Krishnamoorthy, P; John, Jacob

    2016-08-01

    In a clinical environment, an Interventional X-Ray (IXR) system is used on various anatomies and for various types of procedures. It is important to correctly classify each exam of the IXR system into its respective procedure and/or assign it to the correct anatomy. This classification enhances the productivity of the system through better scheduling of the Cath lab, provides a means for hospital management to forecast device usage and revenue, and supports targeted treatment planning for a disease/anatomy. Although classifying each exam into its respective procedure/anatomy may appear to be a simple task, in real-life hospital settings it is well known that the same system settings are used to perform different types of procedures, and such usage leads to under-utilization of the system. In this work, a method is developed to classify exams into their respective anatomical type by applying machine-learning techniques (SVM, KNN and decision trees) to the log information of the systems. The classification result is promising, with an accuracy greater than 90%.

  9. A Hybrid RBF-SVM Ensemble Approach for Data Mining Applications

    Directory of Open Access Journals (Sweden)

    M.Govindarajan

    2014-02-01

    Full Text Available One of the major developments in machine learning in the past decade is the ensemble method, which finds a highly accurate classifier by combining many moderately accurate component classifiers. This paper addresses the use of an ensemble of classification methods for data mining applications such as intrusion detection, direct marketing, and signature verification. In this research work, a new hybrid classification method is proposed for heterogeneous ensemble classifiers using arcing, and their performance is analyzed in terms of accuracy. A classifier ensemble is designed using a Radial Basis Function (RBF) and a Support Vector Machine (SVM) as base classifiers. Here, modified training sets are formed by resampling from the original training set; classifiers are constructed using these training sets and then combined by voting. The proposed RBF-SVM hybrid system is superior to the individual approaches for intrusion detection, direct marketing, and signature verification in terms of classification accuracy.

  10. Disorder Speech Clustering For Clinical Data Using Fuzzy C-Means Clustering And Comparison With SVM Classification

    Directory of Open Access Journals (Sweden)

    C.R.Bharathi

    2012-11-01

    Full Text Available Speech is the most vital skill of communication. Stammering is speech that is hesitant, stumbling, tense or jerky to the extent that it causes anxiety to the speaker. There are many effective existing treatments for the problem of stammering. Speech-language therapy is the treatment for most children with speech and/or language disorders. In this work, speech samples of children with a mild level of mental retardation (MR) were taken for consideration. The aim of the proposed work is to identify the acute spot at which speech training should be given to speech-disordered children. To begin, clustering of speech is done using the Fuzzy C-Means clustering algorithm. Feature extraction is implemented using Mel Frequency Cepstrum Coefficients (MFCC), and dimensionality reduction of the extracted features is implemented using Principal Component Analysis (PCA). Finally, the features were clustered using the Fuzzy C-Means algorithm and compared with the SVM classifier output [13].
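
    For readers unfamiliar with Fuzzy C-Means, the sketch below shows the standard membership update for a single feature vector given fixed cluster centers. The function name, the fuzzifier default `m=2.0`, and the toy centers are illustrative assumptions; the abstract does not state the settings used.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster (standard FCM update).

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d_j is the
    Euclidean distance from the point to center j.
    """
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
            for d_j in dists]

# A point at distance 1 from the first center and 3 from the second.
u = fcm_memberships((1.0,), [(0.0,), (4.0,)])
print(u)
```

    Unlike a hard SVM decision, each sample receives a graded membership in every cluster, and the memberships sum to one.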

  11. Extraction and Analysis of Mega Cities’ Impervious Surface on Pixel-based and Object-oriented Support Vector Machine Classification Technology: A case of Bombay

    Science.gov (United States)

    Yu, S. S.; Sun, Z. C.; Sun, L.; Wu, M. F.

    2017-02-01

    The objective of this paper is to study impervious surface extraction using remote sensing imagery and to monitor the spatiotemporal change patterns of mega cities. The megacity Bombay was selected as the study area. First, pixel-based and object-oriented support vector machine (SVM) classification methods were used to derive the land use/land cover (LULC) products of Bombay in 2010. The overall accuracy (OA) and overall Kappa (OK) of the pixel-based method were 94.97% and 0.96 with a running time of 78 minutes, while the OA and OK of the object-oriented method were 93.72% and 0.94 with a running time of only 17 s. Additionally, the OA and OK of the object-oriented method after post-classification improved to 95.8% and 0.94. Then, the dynamic impervious surfaces of Bombay in the period 1973-2015 were extracted and the urbanization pattern of Bombay was analysed. The results show that both SVM classification methods can accomplish impervious surface extraction, but the object-oriented method is the better choice. Bombay urbanized rapidly during the past 42 years, implying dramatic urban sprawl of mega cities in the developing countries along the One Belt and One Road (OBOR).
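
    As a reminder of how the two reported quality measures are usually obtained, the sketch below computes overall accuracy and Cohen's kappa from a confusion matrix. The 2x2 matrix is made-up illustration data, not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical counts: impervious vs. pervious pixels.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(oa, kappa)
```

    Because the chance agreement term is non-negative, kappa computed this way never exceeds the overall accuracy expressed as a fraction.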

  12. Comparison of Diabetic Retinopathy Image Recognition Rates for Combinations of Principal Components from 4 Features Based on the SVM (Support Vector Machine) Method

    Directory of Open Access Journals (Sweden)

    Sari Ayu Wulandari

    2016-06-01

    Full Text Available Differences in pigmentation affect the pattern recognition method for diabetic retinopathy images and its set points. Software is needed that can serve as an aid for recognizing diabetic retinopathy images. Research was conducted on pattern recognition of diabetic retinopathy images using the yellow channel of the images and a Gabor filter. The features taken from each image were the mean, variance, skewness and entropy, followed by PCA (Principal Component Analysis) feature extraction. In PCA feature extraction, the PCA result matrix is a square matrix whose number of columns equals the number of features. The study used 4 features, so there are 4 PCs (principal components): PC1, PC2, PC3 and PC4. This article discusses which pair of PCs gives the highest accuracy. Accuracy was computed using a linear SVM model. The most accurate and fastest model was the PC1 and PC2 pair, which had the highest training-image accuracy, 100%, and the shortest time, shown explicitly by the smallest number of support vectors, namely 2. The pair with the worst accuracy was PC3 and PC4. Recognition dropped on the test images, to only 93.75%, which was caused by a widening of the coverage region. This widening was probably caused by the choice of the mean value in PCA, before the reduction matrix. Future research could instead search using the standard deviation or variance, which would identify the reduction matrix that represents the spread of values in the matrix.

  13. A Machine Learning Framework for Gait Classification Using Inertial Sensors: Application to Elderly, Post-Stroke and Huntington’s Disease Patients

    Directory of Open Access Journals (Sweden)

    Andrea Mannini

    2016-01-01

    Full Text Available Machine learning methods have been widely used for gait assessment through the estimation of spatio-temporal parameters. As a further step, the objective of this work is to propose and validate a general probabilistic modeling approach for the classification of different pathological gaits. Specifically, the presented methodology was tested on gait data recorded from two pathological populations (Huntington's disease and post-stroke subjects) and healthy elderly controls, using data from inertial measurement units placed at the shank and waist. By extracting features from group-specific Hidden Markov Models (HMMs) and signal information in the time and frequency domains, a Support Vector Machine (SVM) classifier was designed and validated. After leave-one-subject-out cross validation and majority voting, 90.5% of subjects were assigned to the right group. Our long-term goal is gait assessment in everyday life for early detection of gait alterations.
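
    The validation scheme named above can be sketched as follows. The sketch assumes that each subject contributes several recordings and that the per-recording predictions are combined by vote into one label per subject; the abstract does not spell out these details, and the helper names are hypothetical.

```python
from collections import Counter

def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    All recordings of the held-out subject go to the test fold, so no
    subject appears in both the training and the test data.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

def subject_label(recording_predictions):
    """Majority vote over the predictions for one subject's recordings."""
    return Counter(recording_predictions).most_common(1)[0][0]

ids = ["a", "a", "b", "c", "c", "c"]
folds = list(leave_one_subject_out(ids))
print(folds[1])
print(subject_label(["post-stroke", "healthy", "post-stroke"]))
```

    Splitting by subject rather than by recording keeps correlated samples from the same person out of the training fold.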

  14. Research into a Feature Selection Method for Hyperspectral Imagery Using PSO and SVM

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Classification and recognition of hyperspectral remote sensing images is not the same as that of conventional multi-spectral remote sensing images. We propose a novel feature selection and classification method for hyperspectral images that combines the global optimization ability of the particle swarm optimization (PSO) algorithm with the superior classification performance of a support vector machine (SVM). The global optimal search performance of PSO is improved by using a chaotic optimization search technique. A granularity-based grid search strategy is used to optimize the SVM model parameters. Parameter optimization and classification of the SVM are addressed using the training data corresponding to the feature subset. A false classification rate is adopted as the fitness function. Tests of feature selection and classification are carried out on a hyperspectral data set. Classification performance is also compared among different feature extraction methods commonly used today. Results indicate that this hybrid method has a higher classification accuracy and can effectively extract optimal bands. A feasible approach is provided for feature selection and classification of hyperspectral image data.

  15. A ternary classification using machine learning methods of distinct estrogen receptor activities within a large collection of environmental chemicals.

    Science.gov (United States)

    Zhang, Quan; Yan, Lu; Wu, Yan; Ji, Li; Chen, Yuanchen; Zhao, Meirong; Dong, Xiaowu

    2017-02-15

    Endocrine-disrupting chemicals (EDCs), which can threaten ecological safety and be harmful to human beings, have been cause for wide concern. There is a high demand for efficient methodologies for evaluating potential EDCs in the environment. Herein an evaluation platform was developed using novel and statistically robust ternary models via different machine learning models (i.e., linear discriminant analysis, classification and regression tree, and support vector machines). The platform is aimed at effectively classifying chemicals with agonistic, antagonistic, or no estrogen receptor (ER) activities. A total of 440 chemicals from the literature were selected to derive and optimize the three-class model. One hundred and nine new chemicals appeared on the 2014 EPA list for EDC screening, which were used to assess the predictive performances by comparing the E-screen results with the predicted results of the classification models. The best model was obtained using support vector machines (SVM) which recognized agonists and antagonists with accuracies of 76.6% and 75.0%, respectively, on the test set (with an overall predictive accuracy of 75.2%), and achieved a 10-fold cross-validation (CV) of 73.4%. The external predicted accuracy validated by the E-screen assay was 87.5%, which demonstrated the application value for a virtual alert for EDCs with ER agonistic or antagonistic activities. It was demonstrated that the ternary computational model could be used as a faster and less expensive method to identify EDCs that act through nuclear receptors, and to classify these chemicals into different mechanism groups.

  16. Classification of Motor Imagery EEG Signals with Support Vector Machines and Particle Swarm Optimization

    Science.gov (United States)

    Ma, Yuliang; Ding, Xiaohui; She, Qingshan; Luo, Zhizeng; Potter, Thomas; Zhang, Yingchun

    2016-01-01

    Support vector machines are powerful tools used to solve the small sample and nonlinear classification problems, but their ultimate classification performance depends heavily upon the selection of appropriate kernel and penalty parameters. In this study, we propose using a particle swarm optimization algorithm to optimize the selection of both the kernel and penalty parameters in order to improve the classification performance of support vector machines. The performance of the optimized classifier was evaluated with motor imagery EEG signals in terms of both classification and prediction. Results show that the optimized classifier can significantly improve the classification accuracy of motor imagery EEG signals. PMID:27313656
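
    A basic particle swarm search over two SVM hyperparameters might look like the sketch below. It is a generic textbook PSO, not the authors' implementation: `toy_cv_error` is a hypothetical smooth stand-in for a cross-validated error surface over log10(C) and log10(gamma), and the inertia and acceleration coefficients are common default choices.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, cognitive=1.5, social=1.5):
    """Minimize objective over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

def toy_cv_error(params):
    """Hypothetical error surface with its minimum at C=10, gamma=0.01."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best, err = pso_minimize(toy_cv_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print(best, err)
```

    In practice the objective would train and validate an SVM for each candidate parameter pair, which makes every evaluation far more expensive than in this toy.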

  17. Automated Classification of L/R Hand Movement EEG Signals using Advanced Feature Extraction and Machine Learning

    Directory of Open Access Journals (Sweden)

    Mohammad H. Alomari

    2013-07-01

    Full Text Available In this paper, we propose an automated computer platform for classifying Electroencephalography (EEG) signals associated with left and right hand movements, using a hybrid system that combines advanced feature extraction techniques and machine learning algorithms. It is known that EEG represents brain activity through electrical voltage fluctuations along the scalp, and that a Brain-Computer Interface (BCI) is a device that enables the use of the brain's neural activity to communicate with others or to control machines, artificial limbs, or robots without direct physical movements. In our research work, we aspired to find the best feature extraction method for differentiating between left and right executed fist movements through various classification algorithms. The EEG dataset used in this research was created and contributed to PhysioNet by the developers of the BCI2000 instrumentation system. Data was preprocessed using the EEGLAB MATLAB toolbox, and artifact removal was done using AAR. Data was epoched on the basis of Event-Related (De)Synchronization (ERD/ERS) and movement-related cortical potential (MRCP) features. Mu/beta rhythms were isolated for the ERD/ERS analysis and delta rhythms were isolated for the MRCP analysis. The Independent Component Analysis (ICA) spatial filter was applied to related channels for noise reduction and isolation of both artifactually and neurally generated EEG sources. The final feature vector included the ERD, ERS, and MRCP features in addition to the mean, power and energy of the activations of the resulting Independent Components (ICs) of the epoched feature datasets. The datasets were input to two machine-learning algorithms: Neural Networks (NNs) and Support Vector Machines (SVMs). Intensive experiments were carried out, and optimum classification performances of 89.8 and 97.1 were obtained using NN and SVM, respectively. This research shows that this method of feature extraction

  18. Fault diagnosis based on support vector machines with parameter optimisation by artificial immunisation algorithm

    Science.gov (United States)

    Yuan, Shengfa; Chu, Fulei

    2007-04-01

    The support vector machine (SVM) is a new general machine-learning tool based on the structural risk minimisation principle that generalises well when fault samples are few. It is especially suited to classification, forecasting and estimation in small-sample cases such as fault diagnosis. However, some SVM parameters are selected by human experience, which has hampered its efficiency in practical applications. In this paper, the artificial immunisation algorithm (AIA) is used to optimise the SVM parameters. The AIA is a new optimisation method based on the biological immune principle of humans and other living beings. It can effectively avoid premature convergence and guarantees the variety of solutions. With the parameters optimised by AIA, the overall capability of the SVM classifier is improved. Fault diagnosis of a turbo pump rotor shows that the SVM optimised by AIA gives higher recognition accuracy than the normal SVM.

  19. New approach to training support vector machine

    Institute of Scientific and Technical Information of China (English)

    Tang Faming; Chen Mianyun; Wang Zhongdong

    2006-01-01

    The support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Several existing approaches circumvent these shortcomings and work well. Another learning algorithm for training SVMs, particle swarm optimization, is introduced here. The method is tested on UCI datasets.

  20. Multi-phase classification by a least-squares support vector machine approach in tomography images of geological samples

    Science.gov (United States)

    Khan, Faisal; Enzmann, Frieder; Kersten, Michael

    2016-03-01

    Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to classify successfully three different more or less complex multi-phase rock core samples.

  1. Classification of dynamic contrast enhanced MR images of cervical cancers using texture analysis and support vector machines.

    Science.gov (United States)

    Torheim, Turid; Malinen, Eirik; Kvaal, Knut; Lyng, Heidi; Indahl, Ulf G; Andersen, Erlend K F; Futsaether, Cecilia M

    2014-08-01

    Dynamic contrast enhanced MRI (DCE-MRI) provides insight into the vascular properties of tissue. Pharmacokinetic models may be fitted to DCE-MRI uptake patterns, enabling biologically relevant interpretations. The aim of our study was to determine whether treatment outcome for 81 patients with locally advanced cervical cancer could be predicted from parameters of the Brix pharmacokinetic model derived from pre-chemoradiotherapy DCE-MRI. First-order statistical features of the Brix parameters were used. In addition, texture analysis of Brix parameter maps was done by constructing gray level co-occurrence matrices (GLCM) from the maps. Clinical factors and first- and second-order features were used as explanatory variables for support vector machine (SVM) classification, with treatment outcome as response. Classification models were validated using leave-one-out cross-model validation. A random value permutation test was used to evaluate model significance. Features derived from first-order statistics could not discriminate between cured and relapsed patients (specificity 0%-20%, p-values close to unity). However, second-order GLCM features could significantly predict treatment outcome with accuracies (~70%) similar to the clinical factors tumor volume and stage (69%). The results indicate that the spatial relations within the tumor, quantified by texture features, were more suitable for outcome prediction than first-order features.
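
    To make the second-order texture features concrete, the sketch below builds a gray level co-occurrence matrix for one pixel offset and derives the contrast feature from it. The tiny image, the number of gray levels, and the choice of offset are illustrative assumptions, not the study's settings.

```python
def glcm(image, levels, offset=(0, 1)):
    """Count co-occurrences of gray levels at the given (row, col) offset.

    counts[i][j] is the number of pixel pairs where the first pixel has
    level i and the pixel displaced by offset has level j.
    """
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """GLCM contrast: mean squared gray-level difference over all pairs."""
    total = sum(sum(row) for row in counts)
    n = len(counts)
    return sum(counts[i][j] * (i - j) ** 2
               for i in range(n) for j in range(n)) / total

# Hypothetical parameter map quantized to 3 gray levels.
image = [[0, 0, 1],
         [1, 2, 2]]
matrix = glcm(image, levels=3)
print(matrix, contrast(matrix))
```

    First-order statistics such as the mean ignore where values sit in the map, whereas the co-occurrence counts depend on which values are neighbors.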

  2. Support vector machine-based multi-model predictive control

    Institute of Scientific and Technical Information of China (English)

    Zhejing BA; Youxian SUN

    2008-01-01

    In this paper, a support vector machine-based multi-model predictive control is proposed, in which SVM classification combines well with SVM regression. At first, each working environment is modeled by SVM regression and the support vector machine network-based model predictive control (SVMN-MPC) algorithm corresponding to each environment is developed, and then a multi-class SVM model is established to recognize multiple operating conditions. As for control, the current environment is identified by the multi-class SVM model and then the corresponding SVMN-MPC controller is activated at each sampling instant. The proposed modeling, switching and controller design is demonstrated in simulation results.

  3. FORECASTING NIKKEI 225 INDEX WITH SUPPORT VECTOR MACHINE

    Institute of Scientific and Technical Information of China (English)

    HUANG Wei; Yoshiteru Nakamori; WANG Shouyang; YU Lean

    2003-01-01

    The Support Vector Machine (SVM) is a very specific type of learning algorithm characterized by the capacity control of the decision function, the use of kernel functions and the sparsity of the solution. In this paper, we investigate the predictability of financial movement direction with SVM by forecasting the weekly movement direction of the NIKKEI 225 index. To evaluate the forecasting ability of SVM, we compare its performance with that of Linear Discriminant Analysis, Quadratic Discriminant Analysis and Elman Backpropagation Neural Networks. The experimental results show that SVM outperforms the other classification methods. Furthermore, we propose a combining model that integrates SVM with the other classification methods. The combining model performs best among all the forecasting methods.

  4. F-SVM: Combination of Feature Transformation and SVM Learning via Convex Relaxation

    OpenAIRE

    Wu, Xiaohe; Zuo, Wangmeng; ZHU, YUANYUAN; Lin, Liang

    2015-01-01

    The generalization error bound of support vector machine (SVM) depends on the ratio of radius and margin, while standard SVM only considers the maximization of the margin but ignores the minimization of the radius. Several approaches have been proposed to integrate radius and margin for joint learning of feature transformation and SVM classifier. However, most of them either require the form of the transformation matrix to be diagonal, or are non-convex and computationally expensive. In this ...

  5. Sales Growth Rate Forecasting Using Improved PSO and SVM

    Directory of Open Access Journals (Sweden)

    Xibin Wang

    2014-01-01

    Full Text Available Accurate forecasting of the sales growth rate plays a decisive role in determining the amount of advertising investment. In this study, we present a method for sales growth rate forecasting based on preclassification followed by regression, optimized by improved particle swarm optimization (IPSO). We use a support vector machine (SVM) as the classification model. The nonlinear relationship in sales growth rate forecasting is efficiently represented by SVM, while IPSO optimizes the training parameters of the SVM. IPSO addresses issues of traditional PSO, such as relapsing into local optima, slow convergence speed, and low convergence precision in the later evolution. We performed two experiments. In the first, three classic benchmark functions are used to verify the validity of the IPSO algorithm against PSO. Having shown that IPSO outperforms PSO in convergence speed, precision, and escaping local optima, in the second experiment we apply IPSO to the proposed model. Sales growth rate forecasting cases are used to test the forecasting performance of the proposed model. According to the requirements and industry knowledge, the sample data was first classified to obtain the types of the test samples. Next, the values of the test samples were forecast using the SVM regression algorithm. The experimental results demonstrate that the proposed model has good forecasting performance.

  6. Classification of Electrocardiogram Signals With Extreme Learning Machine and Relevance Vector Machine

    Directory of Open Access Journals (Sweden)

    S. Karpagachelvi

    2011-01-01

    Full Text Available The ECG is one of the most effective diagnostic tools for detecting cardiac diseases. It is a method to measure and record different electrical potentials of the heart. The electrical potential generated by electrical activity in cardiac tissue is measured on the surface of the human body. Current flow, in the form of ions, signals contraction of cardiac muscle fibers, leading to the heart's pumping action. ECG signals can be classified as normal or abnormal. In this paper, a thorough experimental study was conducted to show the superiority of the generalization capability of the Relevance Vector Machine (RVM) compared with the Extreme Learning Machine (ELM) approach in the automatic classification of ECG beats. The generalization performance of the ELM classifier has not come close to the maximum accuracy of ECG signal classification. To achieve maximum accuracy, the RVM classifier is designed by searching for the best values of the parameters that tune its discriminant function and, upstream, by looking for the best subset of features that feed the classifier. The experiments were conducted on ECG data from the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database to classify five kinds of abnormal waveforms and normal beats. In particular, the sensitivity of the RVM classifier is tested and compared with that of ELM. Both approaches are compared on raw input data and on preprocessed data. The obtained results clearly confirm the superiority of the RVM approach when compared to traditional classifiers.

  7. Recent Advances in Conotoxin Classification by Using Machine Learning Methods

    Directory of Open Access Journals (Sweden)

    Fu-Ying Dao

    2017-06-01

    Full Text Available Conotoxins are disulfide-rich small peptides that target ion channels and neuronal receptors. Conotoxins have been demonstrated to be potent pharmaceuticals in the treatment of a series of diseases, such as Alzheimer's disease, Parkinson's disease, and epilepsy. In addition, conotoxins are ideal molecular templates for the development of new drug lead compounds and play important roles in neurobiological research as well. Thus, the accurate identification of conotoxin types will provide key clues for biological research and clinical medicine. Generally, conotoxin types are confirmed when their sequence, structure, and function are experimentally validated. However, it is time-consuming and costly to acquire structure and function information using biochemical experiments. Therefore, it is important to develop computational tools for efficiently and effectively recognizing conotoxin types based on sequence information. In this work, we reviewed the current progress in computational identification of conotoxins in the following aspects: (i) construction of benchmark datasets; (ii) strategies for extracting sequence features; (iii) feature selection techniques; (iv) machine learning methods for classifying conotoxins; (v) the results obtained by these methods and the published tools; and (vi) future perspectives on conotoxin classification. The paper provides the basis for in-depth study of conotoxins and drug therapy research.

  8. A DSRPCL-SVM Approach to Informative Gene Analysis

    Institute of Scientific and Technical Information of China (English)

    Wei Xiong; Zhibin Cai; Jinwen Ma

    2008-01-01

    Microarray data based tumor diagnosis is a very interesting topic in bioinformatics. One of the key problems is the discovery and analysis of informative genes of a tumor. Although there are many elaborate approaches to this problem, it is still difficult to select a reasonable set of informative genes for tumor diagnosis only with microarray data. In this paper, we classify the genes expressed through microarray data into a number of clusters via the distance sensitive rival penalized competitive learning (DSRPCL) algorithm and then detect the informative gene cluster or set with the help of support vector machine (SVM). Moreover, the critical or powerful informative genes can be found through further classifications and detections on the obtained informative gene clusters. It is well demonstrated by experiments on the colon, leukemia, and breast cancer datasets that our proposed DSRPCL-SVM approach leads to a reasonable selection of informative genes for tumor diagnosis.

  9. Improved multi-classification algorithm of one-against-one SVM

    Institute of Scientific and Technical Information of China (English)

    单玉刚; 王宏; 董爽

    2012-01-01

    The one-against-one multi-class classification algorithm for SVM shows good performance, but it leaves an unclassifiable region, which limits its application. Hence, a multi-class classification method that integrates one-against-one with an affinity-based decision is presented. First, the one-against-one multi-class algorithm is used to classify samples. Then the affinity decision is used to resolve samples in the unclassifiable region and determine their categories, with a decision function built by combining the distance between the sample and the class centers with the sample distribution based on kNN (k nearest neighbor). Tests on UCI (University of California Irvine) data sets show that the algorithm can effectively solve the unclassifiable-region problem and performs better than other algorithms.
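
    The idea of resolving the unclassifiable region can be sketched as follows. This is a simplified illustration that falls back only on the distance to the class centers when the pairwise votes tie; the kNN-based distribution term of the paper's decision function is not reproduced, and the pairwise deciders here are hypothetical stand-ins for trained binary SVMs.

```python
from collections import Counter

def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def ovo_predict(x, pairwise, centers):
    """One-against-one voting with a nearest-center tie-break.

    pairwise maps (class_a, class_b) to a function returning the winning
    class for x; centers maps each class to its center point.
    """
    votes = Counter(decide(x) for decide in pairwise.values())
    top = max(votes.values())
    tied = [label for label, n in votes.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    # Unclassifiable region: several classes share the top vote count.
    return min(tied, key=lambda label: distance(x, centers[label]))

centers = {"a": (0.0, 0.0), "b": (5.0, 0.0), "c": (0.0, 5.0)}
# Hypothetical deciders that produce a cyclic three-way tie.
cyclic = {("a", "b"): lambda x: "a",
          ("b", "c"): lambda x: "b",
          ("a", "c"): lambda x: "c"}
print(ovo_predict((4.0, 1.0), cyclic, centers))
```

    With three classes, a cyclic outcome gives every class exactly one vote, which is the situation the tie-break is meant to handle.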

  10. Automatic Classification Method of Star Spectra Data Based on Manifold Fuzzy Twin Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    刘忠宝; 高艳云; 王建珍

    2015-01-01

    The support vector machine (SVM), with its good learning ability and generalization, is widely used in the classification of star spectra data. However, as the scale of the data becomes larger, the shortcomings of SVM appear: the amount of calculation is large and classification is slow. To solve these problems, the twin support vector machine (TWSVM) was proposed by Jayadeva et al. Its advantage is that the time cost is reduced to 1/4 of that of SVM. However, the methods mentioned above focus only on the global characteristics of the data and neglect the local characteristics of each class. In view of this, an automatic classification method for star spectra data based on the manifold fuzzy twin support vector machine (MF-TSVM) is proposed in this paper. In MF-TSVM, manifold-based discriminant analysis (MDA) is used to obtain the global and local characteristics of the input data, and a fuzzy membership function is introduced to treat the classes of data differently and to reduce the influence of noise and singular points on the classification results. Comparative experiments with traditional classification methods such as C-SVM and KNN on the SDSS star spectra data set demonstrate the effectiveness of the method.

  11. Classification of fault location and the degree of performance degradation of a rolling bearing based on an improved hyper-sphere-structured multi-class support vector machine

    Science.gov (United States)

    Wang, Yujing; Kang, Shouqiang; Jiang, Yicheng; Yang, Guangxue; Song, Lixin; Mikulovich, V. I.

    2012-05-01

    Effective classification of a rolling bearing fault location and especially its degree of performance degradation provides an important basis for appropriate fault judgment and processing. Two methods are introduced to extract features of the rolling bearing vibration signal—one combining empirical mode decomposition (EMD) with the autoregressive model, whose model parameters and variances of the remnant can be obtained using the Yule-Walker or Ulrych-Clayton method, and the other combining EMD with singular value decomposition. Feature vector matrices obtained are then regarded as the input of the improved hyper-sphere-structured multi-class support vector machine (HSSMC-SVM) for classification. Thereby, multi-status intelligent diagnosis of normal rolling bearings and faulty rolling bearings at different locations and the degrees of performance degradation of the faulty rolling bearings can be achieved simultaneously. Experimental results show that EMD combined with singular value decomposition and the improved HSSMC-SVM intelligent method requires less time and has a higher recognition rate.

  12. Built-up Area Change Analysis in Hanoi Using Support Vector Machine Classification of Landsat Multi-Temporal Image Stacks and Population Data

    Directory of Open Access Journals (Sweden)

    Duong H. Nong

    2015-12-01

    Full Text Available In 1986, the Government of Vietnam implemented free market reforms known as Doi Moi (renovation) that provided private ownership of farms and companies, and encouraged deregulation and foreign investment. Since then, the economy of Vietnam has achieved rapid growth in agricultural and industrial production, construction and housing, and exports and foreign investments, each of which has resulted in momentous landscape transformations. One of the most evident changes is urbanization and an accompanying loss of agricultural lands and open spaces. These rapid changes pose enormous challenges for local populations as well as planning authorities. Accurate and timely data on changes in built-up urban environments are essential for supporting sound urban development. In this study, we applied Support Vector Machine (SVM) classification to multi-temporal stacks of Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) images from 1993 to 2010 to quantify changes in built-up areas. The SVM classification algorithm produced a highly accurate map of land cover change with an overall accuracy of 95%. The study showed that most urban expansion occurred in the periods 2001-2006 and 2006-2010. The analysis was strengthened by the incorporation of population and other socio-economic data. This study provides state authorities with a means to examine correlations between urban growth, spatial expansion, and other socio-economic factors in order not only to assess patterns of urban growth but also to become aware of potential environmental, social, and economic problems.

  13. Predicting Prodromal Alzheimer's Disease in Subjects with Mild Cognitive Impairment Using Machine Learning Classification of Multimodal Multicenter Diffusion-Tensor and Magnetic Resonance Imaging Data.

    Science.gov (United States)

    Dyrba, Martin; Barkhof, Frederik; Fellgiebel, Andreas; Filippi, Massimo; Hausner, Lucrezia; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J

    2015-01-01

    Alzheimer's disease (AD) patients show early changes in white matter (WM) structural integrity. We studied the use of diffusion tensor imaging (DTI) in assessing WM alterations in the predementia stage of mild cognitive impairment (MCI). We applied a Support Vector Machine (SVM) classifier to DTI and volumetric magnetic resonance imaging data from 35 amyloid-β42 negative MCI subjects (MCI-Aβ42-), 35 positive MCI subjects (MCI-Aβ42+), and 25 healthy controls (HC) retrieved from the European DTI Study on Dementia. The SVM was applied to DTI-derived fractional anisotropy, mean diffusivity (MD), and mode of anisotropy (MO) maps. For comparison, we studied classification based on gray matter (GM) and WM volume. We obtained accuracies of up to 68% for MO and 63% for GM volume when it came to distinguishing between MCI-Aβ42- and MCI-Aβ42+. When it came to separating MCI-Aβ42+ from HC we achieved an accuracy of up to 77% for MD and a significantly lower accuracy of 68% for GM volume. The accuracy of multimodal classification was not higher than the accuracy of the best single modality. Our results suggest that DTI data provide better prediction accuracy than GM volume in predementia AD. Copyright © 2015 by the American Society of Neuroimaging.

  14. Subspace identification and classification of healthy human gait.

    Directory of Open Access Journals (Sweden)

    Vinzenz von Tscharner

    Full Text Available PURPOSE: The classification between different gait patterns is a frequent task in gait assessment. The base vectors, which were usually found using principal component analysis (PCA), are replaced by those found through an iterative application of the support vector machine (SVM). The aim was to use classifiability instead of variability to build a subspace (SVM space) that contains the information about classifiable aspects of a movement. The first discriminant of the SVM space will be compared to a discriminant found by an independent component analysis (ICA) in the SVM space. METHODS: Eleven runners ran using shoes with different midsoles. Kinematic data, representing the movements during stance phase when wearing the two shoes, was used as input to a PCA and SVM. The data space was decomposed by an iterative application of the SVM into orthogonal discriminants that were able to classify the two movements. The orthogonal discriminants spanned a subspace, the SVM space. It represents the part of the movement that allowed classifying the two conditions. The data in the SVM space was reconstructed for a visual assessment of the movement difference. An ICA was applied to the data in the SVM space to obtain a single discriminant. Cohen's d effect size was used to rank the PCA vectors that could be used to classify the data, the first SVM discriminant or the ICA discriminant. RESULTS: The SVM base contains all the information that discriminates the movement of the two shod conditions. It was shown that the SVM base contains some redundancy and a single ICA discriminant was found by applying an ICA in the SVM space. CONCLUSIONS: A combination of PCA, SVM and ICA is best suited to extract all parts of the gait pattern that discriminate between the two movements and to find a discriminant for the classification of dichotomous kinematic data.

  15. Optical diagnosis of colon and cervical cancer by support vector machine

    Science.gov (United States)

    Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Dey, Rajib; Das, Nandan K.; Pradhan, Sanjay; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.; Mohanty, Samarendra

    2016-05-01

    A probabilistic robust diagnostic algorithm is essential for successful cancer diagnosis by optical spectroscopy. We report here support vector machine (SVM) classification to better discriminate colon and cervical cancer tissues from normal tissues based on elastic scattering spectroscopy. The efficacy of SVM-based classification with different kernels has been tested on multifractal parameters, such as the Hurst exponent and singularity spectrum width, in order to classify the cancer tissues.

  16. Support vector machine used to diagnose the fault of rotor broken bars of induction motors

    DEFF Research Database (Denmark)

    Zhitong, Cao; Jiazhong, Fang; Hongpingn, Chen

    2003-01-01

    Data-based machine learning is an important aspect of modern intelligent technology, while statistical learning theory (SLT) is a new tool that studies machine learning methods in the case of a small number of samples. As a common learning method, support vector machine (SVM) is derived...... for the SVM. After an SVM is trained with learning sample vectors, each kind of rotor broken bar fault of induction motors can be classified. Finally, a retest is demonstrated, which shows that the SVM has good classification ability. In this paper we tried applying the SVM...... from the SLT. Here we carried out some analogical experiments on the rotor broken bar faults of induction motors, analyzed the signals of the sample currents with the Fourier transform, and constructed the spectrum characteristics from low frequency to high frequency used as learning sample vectors...

  17. Introduction of the SVM and Its Application in Dairy Products Classification

    Institute of Scientific and Technical Information of China (English)

    韩勇鹏

    2009-01-01

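    This paper introduces the basic principles of the Support Vector Machine (SVM) method, presents an example application of SVM to the dairy product classification problem, and compares it with other machine learning methods. The results show that the SVM method is effective and feasible.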

  18. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    Energy Technology Data Exchange (ETDEWEB)

    Vatsavai, Raju [ORNL; Chandola, Varun [ORNL; Cheriyadat, Anil M [ORNL; Bright, Eddie A [ORNL; Bhaduri, Budhendra L [ORNL; Graesser, Jordan B [ORNL

    2011-01-01

    The proliferation of several machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques from five broad machine learning categories were compared. Surprisingly, the performance of simple statistical classification schemes like maximum likelihood and logistic regression is very close to that of complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  19. [Study on application of SVM in prediction of coronary heart disease].

    Science.gov (United States)

    Zhu, Yue; Wu, Jianghua; Fang, Ying

    2013-12-01

    Based on data on blood pressure, plasma lipids, Glu and UA obtained by physical examination, a Support Vector Machine (SVM) was applied to distinguish coronary heart disease (CHD) patients from non-CHD individuals in a south China population, to guide further prevention and treatment of the disease. Firstly, the SVM classifier was built using a radial basis function (RBF) kernel, a linear kernel and a polynomial kernel, respectively. Secondly, the SVM penalty factor C and kernel parameter sigma were optimized by particle swarm optimization (PSO) and then employed to diagnose and predict CHD. In comparison with an artificial neural network with back propagation (BP), linear discriminant analysis, logistic regression and a non-optimized SVM, the overall results demonstrated that the classification performance of the optimized RBF-SVM model could be superior to that of the other classifiers, with higher accuracy, sensitivity and specificity of 94.51%, 92.31% and 96.67%, respectively. It is therefore concluded that SVM can be used as a valid method for assisting the diagnosis of CHD.
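
    The three figures reported above (accuracy, sensitivity, specificity) all follow from the confusion counts of a binary classifier. The sketch below shows the arithmetic on made-up labels; the function name `diagnostic_metrics` and the toy data are illustrative assumptions, not material from the study.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; `positive` marks the diseased class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


# Toy labels (1 = CHD, 0 = non-CHD); invented for the example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, sens, spec = diagnostic_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Reporting sensitivity and specificity alongside accuracy matters in diagnostic settings because the two error types (missed patients and false alarms) usually carry different costs.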

  20. Robust support vector machine-trained fuzzy system.

    Science.gov (United States)

    Forghani, Yahya; Yazdi, Hadi Sadoghi

    2014-02-01

    Because the SVM (support vector machine) classifies data with the widest symmetric margin to decrease the probability of the test error, modern fuzzy systems use SVM to tune the parameters of fuzzy if-then rules. But, solving the SVM model is time-consuming. To overcome this disadvantage, we propose a rapid method to solve the robust SVM model and use it to tune the parameters of fuzzy if-then rules. The robust SVM is an extension of SVM for interval-valued data classification. We compare our proposed method with SVM, robust SVM, ISVM-FC (incremental support vector machine-trained fuzzy classifier), BSVM-FC (batch support vector machine-trained fuzzy classifier), SOTFN-SV (a self-organizing TS-type fuzzy network with support vector learning) and SCLSE (a TS-type fuzzy system with subtractive clustering for antecedent parameter tuning and LSE for consequent parameter tuning) by using some real datasets. According to experimental results, the use of proposed approach leads to very low training and testing time with good misclassification rate.

  1. An S-Transform and Support Vector Machine (SVM)-Based Online Method for Diagnosing Broken Strands in Transmission Lines

    Directory of Open Access Journals (Sweden)

    Caxin Sun

    2011-08-01

    Full Text Available During their long-term outdoor field service, overhead transmission lines are exposed to lightning strikes, corrosion by chemical contaminants, ice-shedding, wind vibration of conductors, line galloping, external destructive forces and so on, which generally cause a series of latent faults such as aluminum strand fracture. This may lead to broken transmission lines, which will have a very strong impact on the safe operation of power grids if the latent faults cannot be recognized and fixed as soon as possible. The detection of broken strands in transmission lines using inspection robots equipped with suitable detectors is a method with good prospects. In this paper, a method for detecting broken strands in transmission lines using an eddy current transducer (ECT) carried by a robot is developed, and an approach for identifying broken strands in transmission lines based on an S-transform is proposed. The proposed approach utilizes the S-transform to extract the module and phase information at each frequency point from detection signals. Through module and phase comparison, the characteristic frequency points are ascertained, and the fault information of the detection signal is constructed. The degree of confidence of broken strand identification is defined by the Shannon fuzzy entropy (SFE-BSICD). The proposed approach combines module information while utilizing phase information, SFE-BSICD, and the energy, so the reliability is greatly improved. These characteristic qualities of broken strands in transmission lines are used as the input of a multi-classification SVM, allowing the number of broken strands to be determined. Experimental field verification shows that the proposed approach displays high accuracy and that the SFE-BSICD is defined reasonably.

  2. Analytic radar micro-Doppler signatures classification

    Science.gov (United States)

    Oh, Beom-Seok; Gu, Zhaoning; Wang, Guan; Toh, Kar-Ann; Lin, Zhiping

    2017-06-01

    Due to its capability of capturing the kinematic properties of a target object, radar micro-Doppler signatures (m-DS) play an important role in radar target classification. This is particularly evident from the remarkable number of research papers published every year on m-DS for various applications. However, most of these works rely on the support vector machine (SVM) for target classification. It is well known that training an SVM is computationally expensive due to its nature of search to locate the supporting vectors. In this paper, the classifier learning problem is addressed by a total error rate (TER) minimization where an analytic solution is available. This largely reduces the search time in the learning phase. The analytically obtained TER solution is globally optimal with respect to the classification total error count rate. Moreover, our empirical results show that TER outperforms SVM in terms of classification accuracy and computational efficiency on a five-category radar classification problem.

  3. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam emails invade users' mailboxes without their consent and fill them up. They consume network capacity as well as the time spent checking and deleting spam mails. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Also, when the countermeasures are oversensitive, even legitimate emails will be eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  4. A Novel Support Vector Machine with Globality-Locality Preserving

    Directory of Open Access Journals (Sweden)

    Cheng-Long Ma

    2014-01-01

    Full Text Available Support vector machine (SVM) is regarded as a powerful method for pattern classification. However, the solution of the primal optimal model of SVM is susceptible to class distribution and may result in a nonrobust solution. In order to overcome this shortcoming, an improved model, support vector machine with globality-locality preserving (GLPSVM), is proposed. It introduces globality-locality preserving into the standard SVM, which can preserve the manifold structure of the data space. We conducted extensive experiments on the UCI machine learning data sets. The results validate the effectiveness of the proposed model, especially on the Wine and Iris databases; the recognition rate is above 97% and outperforms all the algorithms that were developed from SVM.

  5. Classifier transfer with data selection strategies for online support vector machine classification with class imbalance

    Science.gov (United States)

    Krell, Mario Michael; Wilshusen, Nils; Seeland, Anett; Kim, Su Kyoung

    2017-04-01

    Objective. Classifier transfers usually come with dataset shifts. To overcome dataset shifts in practical applications, we consider the limitations in computational resources in this paper for the adaptation of batch learning algorithms, like the support vector machine (SVM). Approach. We focus on data selection strategies which limit the size of the stored training data by different inclusion, exclusion, and further dataset manipulation criteria like handling class imbalance with two new approaches. We provide a comparison of the strategies with linear SVMs on several synthetic datasets with different data shifts as well as on different transfer settings with electroencephalographic (EEG) data. Main results. For the synthetic data, adding only misclassified samples performed astoundingly well. Here, balancing criteria were very important when the other criteria were not well chosen. For the transfer setups, the results show that the best strategy depends on the intensity of the drift during the transfer. Adding all and removing the oldest samples results in the best performance, whereas for smaller drifts, it can be sufficient to only add samples near the decision boundary of the SVM, which reduces processing resources. Significance. For brain-computer interfaces based on EEG data, models trained on data from a calibration session, a previous recording session, or even from a recording session with another subject are used. We show that, by using the right combination of data selection criteria, it is possible to adapt the SVM classifier to overcome the performance drop from the transfer.
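
    Two of the selection strategies named above, adding every sample while discarding the oldest and adding only misclassified samples, can be sketched as a bounded buffer. This is an illustrative sketch only: the class name `TrainingBuffer`, its parameters and the buffer size are hypothetical and not taken from the paper, and retraining the SVM on the buffer is left out.

```python
from collections import deque


class TrainingBuffer:
    """Bounded store of (features, label) pairs for retraining a classifier.

    Hypothetical illustration of two data selection criteria:
    - inclusion: accept every sample, or only misclassified ones;
    - exclusion: once full, the oldest sample is dropped automatically.
    """

    def __init__(self, max_size, only_misclassified=False):
        # deque(maxlen=...) evicts the oldest entry when a new one is appended.
        self.samples = deque(maxlen=max_size)
        self.only_misclassified = only_misclassified

    def offer(self, features, label, predicted_label):
        """Try to store a sample; return True if it was kept."""
        if self.only_misclassified and predicted_label == label:
            return False
        self.samples.append((features, label))
        return True


# "Add all, remove oldest": after five offers only the last three remain.
buffer = TrainingBuffer(max_size=3)
for i in range(5):
    buffer.offer([float(i)], label=1, predicted_label=1)
print([features for features, _ in buffer.samples])
```

    Bounding the buffer keeps memory and retraining cost constant, which is the resource constraint the abstract emphasizes for online adaptation.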

  6. GPU-based Parallel SVM Algorithm

    Institute of Scientific and Technical Information of China (English)

    DO Thanh-Nghi; NGUYEN Van-Hoa; POULET Francois

    2009-01-01

    A new parallel and incremental support vector machine (SVM) algorithm for the classification of very large datasets on graphics processing units (GPUs) is presented. SVM and kernel-related methods have been shown to build accurate models, but the learning task usually requires solving a quadratic program, so that for large datasets it requires large memory capacity and a long time. A recent least squares SVM (LS-SVM) proposed by Suykens and Vandewalle is extended to build an incremental and parallel algorithm. The new algorithm uses graphics processors to gain high performance at low cost. Numerical test results on the UCI and Delve dataset repositories show that this parallel incremental algorithm using GPUs is about 130 times faster than its CPU implementation and often significantly faster (over 2,500 times) than state-of-the-art algorithms like LibSVM, SVM-perf and CB-SVM.

  7. Signal peptide discrimination and cleavage site identification using SVM and NN.

    Science.gov (United States)

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model.
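
    The Matthews Correlation Coefficient (MCC) cited above summarizes a binary confusion matrix in a single value between -1 and 1. The following is a minimal sketch of the standard formula; the function name `matthews_corrcoef` and the example counts are illustrative assumptions, not results from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion counts.

    Returns 0.0 when any marginal total is zero (the coefficient is
    undefined there; 0.0 is a common convention, assumed here).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up counts: 45 SP correctly found, 45 non-SP correctly rejected,
# 5 false positives and 5 false negatives.
print(matthews_corrcoef(tp=45, tn=45, fp=5, fn=5))
```

    Unlike plain accuracy, MCC stays informative when the SP and non-SP classes differ greatly in size, since it uses all four cells of the confusion matrix.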

  8. Improvments of Payload-based Intrusion Detection Models by Using Noise Against Fuzzy SVM

    Directory of Open Access Journals (Sweden)

    Guiling Zhang

    2011-02-01

    Full Text Available Intrusion detection plays a very important role in network security systems. It has been shown that analyzing the payload of network protocols and modeling a payload-based anomaly detector (PAYL) can successfully detect outliers of network servers. This paper extends this work by applying a new noise-reduced fuzzy support vector machine (fSVM) to improve the detection rate at a lower false positive rate. The new noise-reduced fuzzy SVM is applied to classifying the 1-gram, 2-gram and 2v-gram distributions of network payloads, which yields three different intrusion detection models, respectively. These new intrusion detection models employ a reconstruction-error-based fuzzy membership function to reduce the noise in the data and to solve the sharp boundary problem. Experimental results based on the DARPA data set demonstrate that the proposed schemes can achieve a higher detection rate at a very low false positive rate than the original and general SVM methods.
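
    The n-gram payload distributions mentioned above can be illustrated with a short sketch that turns raw payload bytes into relative frequencies of byte n-grams. The function name `ngram_distribution` and the sample payload are assumptions made for illustration; the 2v-gram variant and the fuzzy membership function are not defined in the abstract and are not reproduced here.

```python
from collections import Counter


def ngram_distribution(payload: bytes, n: int = 1) -> dict:
    """Relative frequency of each byte n-gram in a payload.

    Hypothetical helper: slides a window of n bytes over the payload and
    normalizes the counts so that the frequencies sum to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    total = len(payload) - n + 1
    if total <= 0:
        return {}
    counts = Counter(payload[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


# Invented payload; a real detector would use captured network traffic.
payload = b"GET /index.html HTTP/1.1"
one_gram = ngram_distribution(payload, n=1)
two_gram = ngram_distribution(payload, n=2)
print(len(one_gram), len(two_gram))
```

    Such frequency vectors give every payload a fixed feature representation (one dimension per possible n-gram), which is what allows a classifier such as an SVM to be applied to variable-length traffic.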

  9. Comparison and Retrieval of Liver Diseases Based on the Performance of SVM and SOM

    Directory of Open Access Journals (Sweden)

    R. Suganya

    2012-12-01

    Full Text Available In this study, we distinguish liver tumors by SVM and SOM classification. LPND (Laplacian Pyramid based Nonlinear Diffusion) is the proposed speckle reduction technique for preprocessing the image. In feature extraction, we segment the image based on mean, variance, entropy and fractal dimension. A four-layer hierarchical scheme is used for classifying benign and malignant tumors. In the first layer, normal tissue is distinguished from abnormal tissues. The second layer distinguishes cysts from abnormal tissues. Cavernous hemangioma is identified in the third layer. Finally, hepatoma is identified from undefined tissues. Self Organizing Map (SOM) and Support Vector Machine (SVM) algorithms are used to classify the features extracted from liver diseases. Using performance metrics such as sensitivity and specificity, our results demonstrate that the SVM provides better retrieval than the SOM.
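
    Three of the four features listed above (mean, variance and entropy) are first-order statistics of pixel intensities and are simple to compute. The sketch below assumes a flat list of integer gray levels for one image region; the function name `patch_features` and the sample values are illustrative, and the fractal dimension is omitted because the abstract does not say how it is estimated.

```python
import math
from collections import Counter


def patch_features(pixels):
    """Return (mean, variance, entropy) of a non-empty list of gray levels.

    Entropy is the Shannon entropy (in bits) of the intensity histogram.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    histogram = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in histogram.values())
    return mean, variance, entropy


# Invented 2x2 region flattened row by row: two dark and two bright pixels.
print(patch_features([0, 0, 255, 255]))
```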

  10. Artificial immune system based on adaptive clonal selection for feature selection and parameters optimisation of support vector machines

    Science.gov (United States)

    Sadat Hashemipour, Maryam; Soleimani, Seyed Ali

    2016-01-01

    The artificial immune system (AIS) algorithm based on the clonal selection method can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. Support vector machine (SVM) is a popular pattern classification method with many diverse applications. Kernel parameter setting in the SVM training procedure, along with feature selection, significantly impacts the classification accuracy rate. In this study, an AIS based on the Adaptive Clonal Selection (AISACS) algorithm has been used to optimise the SVM parameters and feature subset selection without degrading the SVM classification accuracy. Several public datasets from the University of California Irvine (UCI) machine learning repository are employed to calculate the classification accuracy rate in order to evaluate the AISACS approach, which was then compared with the grid search algorithm and a Genetic Algorithm (GA) approach. The experimental results show that the feature reduction rate and running time of the AISACS approach are better than those of the GA approach.

  11. Deep Extreme Learning Machine and Its Application in EEG Classification

    OpenAIRE

    Shifei Ding; Nan Zhang; Xinzheng Xu; Lili Guo; Jian Zhang

    2015-01-01

    Recently, deep learning has aroused wide interest in machine learning fields. Deep learning is a multilayer perceptron artificial neural network algorithm. Deep learning has the advantage of approximating the complicated function and alleviating the optimization difficulty associated with deep models. Multilayer extreme learning machine (MLELM) is a learning algorithm of an artificial neural network which takes advantages of deep learning and extreme learning machine. Not only does MLELM appr...

  13. Application of LCD-SVD Technique and CRO-SVM Method to Fault Diagnosis for Roller Bearing

    Directory of Open Access Journals (Sweden)

    Songrong Luo

    2015-01-01

    Full Text Available Targeting the nonlinear and nonstationary characteristics of the vibration signal from faulty roller bearings and the scarcity of fault samples, a novel method is presented and applied to roller bearing fault diagnosis in this paper. Firstly, the nonlinear and nonstationary vibration signal produced by local faults of the roller bearing is decomposed into intrinsic scale components (ISCs) by using the local characteristic-scale decomposition (LCD) method, and initial feature vector matrices are obtained. Secondly, fault feature values are extracted by singular value decomposition (SVD) techniques to obtain singular values, while avoiding the selection of reconstruction parameters. Thirdly, a support vector machine (SVM) classifier based on the Chemical Reaction Optimization (CRO) algorithm, called the CRO-SVM method, is designed for classification of fault location. Lastly, the proposed method is validated on two experimental datasets. Experimental results show that the proposed method, based on the LCD-SVD technique and the CRO-SVM method, has higher classification accuracy and shorter run time than the comparative methods.

  14. Comparative study of shape, intensity and texture features and support vector machine for white blood cell classification

    Directory of Open Access Journals (Sweden)

    Mehdi Habibzadeh

    2013-04-01

    Full Text Available The complete blood count (CBC) is a widely used test for counting and categorizing various peripheral particles in the blood. The main goal of the paper is to count and classify white blood cells (leukocytes) in microscopic images into five major categories using shape, intensity and texture features. The first critical step of the counting and classification procedure involves segmentation of individual cells in cytological images of thin blood smears. The quality of segmentation has a significant impact on cell type identification, but poor quality, noise, and/or low resolution images make segmentation less reliable. We analyze the performance of our system for three different sets of features and determine that the best performance is achieved by wavelet features using the Dual-Tree Complex Wavelet Transform (DT-CWT), which is based on multi-resolution characteristics of the image. These features are combined with the Support Vector Machine (SVM), which classifies white blood cells into their five primary types. This approach was validated with experiments conducted on low-resolution digital images of normal blood smears.

  15. P300 EEG Recognition Based on SVM Approach

    Institute of Scientific and Technical Information of China (English)

    LIU Hui; ZHOU Wei-dong; HUANG An-hu

    2009-01-01

    In this paper, we used the SVM method to detect the P300 signal. Before training the SVM classifier, several preprocessing operations were applied to the data, including filtering, downsampling, single trial extraction, winsorizing, electrode selection, etc. With the SVM algorithm, the classification accuracy could exceed 80%. In some cases, the accuracy could reach 100%. SVM is therefore suitable for P300 EEG recognition in P300-based brain-computer interface (BCI) systems. Our further work will include improvements to yield higher classification accuracy using fewer trials.
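
    Two of the preprocessing steps listed above, downsampling and winsorizing, can be sketched in a few lines. The 10th/90th percentile limits, the block-averaging form of downsampling, and the function names are assumptions chosen for illustration; the abstract does not state the settings actually used.

```python
def winsorize(samples, lower_pct=10, upper_pct=90):
    """Clip values below/above the given percentiles to those percentiles.

    Percentiles are taken by nearest rank on the sorted samples, so the
    limits are always values that occur in the data.
    """
    ordered = sorted(samples)
    last = len(ordered) - 1
    low = ordered[int(round(last * lower_pct / 100))]
    high = ordered[int(round(last * upper_pct / 100))]
    return [min(max(value, low), high) for value in samples]


def downsample(samples, factor):
    """Reduce the sampling rate by averaging consecutive blocks of `factor`
    samples; an incomplete trailing block is dropped."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


# Invented single-channel signal with one outlier at each end.
signal = [-50, 2, 3, 4, 5, 6, 7, 8, 9, 10, 90]
clipped = winsorize(signal)
print(clipped)
print(downsample(clipped, 2))
```

    Winsorizing limits the influence of large artifacts (for example eye blinks) on the classifier, and downsampling shortens the feature vector that each trial contributes.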

  16. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. The best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bin rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (~87%) while reducing the number of features by up to 98%. (orig.)
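
    The "up to 98%" figure can be checked directly from the dimensions quoted in the abstract (101 original features reduced to 3, 2 and 9 dimensions). The helper name `reduction_percent` below is an illustrative assumption.

```python
def reduction_percent(original_dim, reduced_dim):
    """Percentage of features removed when going from original_dim to reduced_dim."""
    return 100.0 * (original_dim - reduced_dim) / original_dim


# Dimensions quoted in the abstract: PCA -> 3, PCC -> 2, ICA -> 9 (from 101).
for method, dims in [("PCA", 3), ("PCC", 2), ("ICA", 9)]:
    print(method, round(reduction_percent(101, dims), 1))
```

    The largest reduction corresponds to the 2-dimensional PCC case, which is consistent with the stated upper bound.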

  17. Face recognition with multi-kernel SVM

    Institute of Scientific and Technical Information of China (English)

    陆萍

    2016-01-01

    Support Vector Machine (SVM) is one of the most important linear classifiers in machine learning, and it can classify non-linear samples efficiently via the kernel method. However, the accuracy of SVM may be heavily affected by the characteristics of the chosen kernel. To make better use of different kernels, several kernels are fused to design a multi-kernel SVM, and the resulting classifier is evaluated on the ORL and AR face recognition datasets. The Local Ternary Pattern (LTP) is employed as the feature descriptor. The experimental results show that the multi-kernel SVM can achieve higher classification accuracy than a traditional SVM with a single kernel.
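
    One common way to fuse kernels is a weighted sum: a combination of valid kernels with non-negative weights is again a valid kernel, so the result can be passed to any kernel SVM solver as a precomputed Gram matrix. The sketch below illustrates that idea only; the choice of a linear plus an RBF kernel, the equal weights, `gamma`, and all function names are assumptions, since the abstract does not specify how the kernels were fused.

```python
import math


def linear_kernel(x, z):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def combined_kernel(x, z, weights=(0.5, 0.5), gamma=0.5):
    """Weighted sum of the linear and RBF kernels (weights must be >= 0)."""
    w_linear, w_rbf = weights
    return w_linear * linear_kernel(x, z) + w_rbf * rbf_kernel(x, z, gamma)


def gram_matrix(points, kernel):
    """Matrix of kernel values between all pairs of points."""
    return [[kernel(p, q) for q in points] for p in points]


# Invented 2-D feature vectors standing in for image descriptors.
points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in gram_matrix(points, combined_kernel):
    print([round(value, 3) for value in row])
```

    In practice the kernel weights are themselves tuned, for example by cross-validation, which is where a multi-kernel approach can gain over any single fixed kernel.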

  18. Machine Learning Algorithms for Automatic Classification of Marmoset Vocalizations

    Science.gov (United States)

    Ribeiro, Sidarta; Pereira, Danillo R.; Papa, João P.; de Albuquerque, Victor Hugo C.

    2016-01-01

    Automatic classification of vocalization type could potentially become a useful tool for the acoustic monitoring of captive colonies of highly vocal primates. However, for classification to be useful in practice, a reliable algorithm that can be successfully trained on small datasets is necessary. In this work, we consider seven different classification algorithms with the goal of finding a robust classifier that can be successfully trained on small datasets. We found good classification performance (accuracy > 0.83 and F1-score > 0.84) using the Optimum Path Forest classifier. Dataset and algorithms are made publicly available. PMID:27654941

  19. Classification Technique for Hyperspectral Image Based on Subspace of Bands Feature Extraction and LS-SVM

    Institute of Scientific and Technical Information of China (English)

    高恒振; 万建伟; 朱珍珍; 王力宝; 粘永健

    2011-01-01

    The present paper proposes a novel hyperspectral image classification algorithm based on LS-SVM (least squares support vector machine). The LS-SVM uses features extracted from subspaces of bands (SOBs). The maximum noise fraction (MNF) method is adopted as the feature extraction method. The spectral correlations of the hyperspectral image are used to divide the feature space into several SOBs. MNF is then used to extract characteristic features from each SOB, and the extracted features are combined into the feature vector for classification. This avoids the strong correlation between bands and reduces spectral redundancy. The LS-SVM classifier is adopted, which replaces the inequality constraints of the SVM with equality constraints, so the computational cost is reduced and the learning efficiency is improved. The proposed method optimizes spectral information by feature extraction, reduces spectral noise and improves classifier performance. Experimental results show the superiority of the proposed algorithm.
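
    The replacement of inequality constraints by equality constraints can be written out as follows. This is the standard LS-SVM classifier formulation, given as a sketch and not taken from the paper itself; the symbols (regularization constant \gamma, feature map \varphi, kernel K, error variables e_i) are the usual textbook ones and are assumptions here. Because the constraints are equalities, training reduces to solving one linear system instead of a quadratic program.

```latex
% Primal problem: squared errors e_i with equality constraints
\min_{w,\,b,\,e}\ \tfrac{1}{2}\,w^{\top}w + \tfrac{\gamma}{2}\sum_{i=1}^{N} e_i^{2}
\quad\text{s.t.}\quad y_i\bigl(w^{\top}\varphi(x_i)+b\bigr) = 1 - e_i,\qquad i=1,\dots,N

% Optimality conditions: w = \sum_i \alpha_i y_i \varphi(x_i),\ \sum_i \alpha_i y_i = 0,\ \alpha_i = \gamma e_i
% Eliminating w and e gives a linear system in (b, \alpha):
\begin{bmatrix} 0 & y^{\top} \\ y & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{1}_N \end{bmatrix},
\qquad \Omega_{ij} = y_i\,y_j\,K(x_i,x_j)

% Resulting classifier
f(x) = \operatorname{sign}\Bigl(\sum_{i=1}^{N}\alpha_i\,y_i\,K(x,x_i) + b\Bigr)
```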

  20. LMD based features for the automatic seizure detection of EEG signals using SVM.

    Science.gov (United States)

    Zhang, Tao; Chen, Wanzhong

    2016-09-20

    Detecting seizure activity automatically from electroencephalogram (EEG) signals is of great importance for the treatment of epileptic seizures. To this end, a newly developed time-frequency analytical algorithm, namely local mean decomposition (LMD), is employed in the present study. LMD is able to decompose an arbitrary signal into a series of product functions (PFs). First, the raw EEG signal is decomposed into several PFs, and then the temporal statistical and non-linear features of the first five PFs are calculated. The features of each PF are fed into five classifiers, including back propagation neural network (BPNN), K-nearest neighbor (KNN), linear discriminant analysis (LDA), un-optimized support vector machine (SVM) and SVM optimized by genetic algorithm (GA-SVM), for five classification cases, respectively. Confluent features of all PFs are further passed into the high-performance GA-SVM for the same classification tasks. Experimental results on the international public Bonn epilepsy EEG dataset show that the average classification accuracies of the presented approach are equal to or higher than 98.10% in all five cases, which indicates the effectiveness of the proposed approach for automated seizure detection.

  1. PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein-Protein Interactions from Protein Sequences.

    Science.gov (United States)

    Wang, Yanbin; You, Zhuhong; Li, Xiao; Chen, Xing; Jiang, Tonghai; Zhang, Jingting

    2017-05-11

    Protein-protein interactions (PPIs) are essential for the processes of most living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although many PPI data have been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable defects, including time consumption, high cost, and high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs and can achieve a good prediction rate. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, the ZM descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, the PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of Yeast and H. Pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machine (SVM) classifier is used and compared with the PCVM model. Experimental results on the Yeast dataset show that the performance of the PCVM classifier is better than that of the SVM classifier. The experimental results indicate that our proposed method is robust, powerful and feasible, and can be used as a helpful tool for proteomics research.

  2. SVM multiuser detection based on heuristic kernel

    Institute of Scientific and Technical Information of China (English)

    Yang Tao; Hu Bo

    2007-01-01

    A support vector machine (SVM) based multiuser detection (MUD) scheme for code-division multiple-access (CDMA) systems is proposed. In this scheme, the equivalent support vector (SV) is obtained through a kernel sparsity approximation algorithm, which avoids the conventional costly quadratic programming (QP) procedure in SVM. In addition, the coefficient of the SV is obtained by solving a generalized eigenproblem. Simulation results show that the proposed scheme has almost the same bit error rate (BER) as the standard SVM and is better than the minimum mean square error (MMSE) scheme. Meanwhile, it has low computational complexity.

  3. A Support Vector Machine Classification Model for Benzo[c]phenathridine Analogues with Topoisomerase-I Inhibitory Activity

    Directory of Open Access Journals (Sweden)

    Thanh-Dao Tran

    2012-04-01

    Benzo[c]phenanthridine (BCP) derivatives were identified as topoisomerase I (TOP-I) targeting agents with pronounced antitumor activity. In this study, a support vector machine model was built on a series of 73 analogues to classify BCP derivatives according to TOP-I inhibitory activity. The best SVM model, with a total accuracy of 93% for the training set, was achieved using a set of 7 descriptors identified from a large set via a random forest algorithm. An overall accuracy of up to 87% and a Matthews correlation coefficient (MCC) of 0.71 were obtained when this SVM classifier was validated internally on a test set of 15 compounds. For two external test sets, 89% and 80% of BCP compounds, respectively, were correctly predicted. The results indicated that our SVM model could be used as a filter for designing new BCP compounds with higher TOP-I inhibitory activity.
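
    For readers unfamiliar with the Matthews correlation coefficient reported above, the sketch below computes it, together with overall accuracy, from the four cells of a binary confusion matrix. The counts in the example are hypothetical and are not the paper's confusion matrix; they only illustrate the formula on a test set of 15 items.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when a row or column of the confusion matrix is empty,
    which is the usual convention for the undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical confusion-matrix counts for a 15-sample test set.
tp, tn, fp, fn = 9, 4, 1, 1
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

    Unlike accuracy, MCC uses all four cells, so it stays informative when the active and inactive classes are of unequal size.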

  4. Mangrove classification through the use of object oriented classification and support vector machine of lidar datasets: a case study in Naawan and Manticao, Misamis Oriental, Philippines

    Science.gov (United States)

    Jalbuena, Rey L.; Peralta, Rudolph V.; Tamondong, Ayin M.

    2016-10-01

    Mangroves are trees or shrubs that grow along the boundary between the land and the sea in tropical and sub-tropical latitudes. Mangroves are essential in supporting various marine life; thus, it is important to preserve and manage these areas. There are many approaches to creating mangrove maps, one of which is the use of Light Detection and Ranging (LiDAR), a remote sensing technique which uses light pulses to measure distances and to generate three-dimensional point clouds of the Earth's surface. In this study, topographic LiDAR data are used to analyze the geophysical features of the terrain and create a mangrove map. The dataset was first pre-processed using LAStools, software that processes LiDAR data sets and creates different layers such as DSM, DTM, nDSM, Slope, LiDAR Intensity, LiDAR number of first returns, and CHM. All the aforementioned layers together were used to derive the Mangrove class. Then, an Object-based Image Analysis (OBIA) was performed using eCognition. OBIA analyzes groups of pixels with similar properties, called objects, in contrast to the traditional pixel-based approach, which examines only a single pixel. Multi-threshold and multiresolution segmentation were used to delineate the different classes and split the image into objects. There are four levels of classification. The first is the separation of Land from Water. Then the Land class was further divided into Ground and Non-ground objects. Next, Nonvegetation, Mangroves, and Other Vegetation were classified from the Non-ground objects. Lastly, separation of the mangrove class was done using field-verified training points, which were then run through a Support Vector Machine (SVM) classification. Different classes were separated using the different layer feature properties, such as mean, mode, standard deviation, geometrical properties, neighbor-related properties, and textural properties. Accuracy

  5. Classification of jet fuel properties by near-infrared spectroscopy using fuzzy rule-building expert systems and support vector machines.

    Science.gov (United States)

    Xu, Zhanfeng; Bunker, Christopher E; Harrington, Peter de B

    2010-11-01

    Monitoring the changes of jet fuel physical properties is important because fuel used in high-performance aircraft must meet rigorous specifications. Near-infrared (NIR) spectroscopy is a fast method to characterize fuels. Because of the complexity of NIR spectral data, chemometric techniques are used to extract relevant information from spectral data to accurately classify physical properties of complex fuel samples. In this work, discrimination of fuel types and classification of flash point, freezing point, boiling point (10%, v/v), boiling point (50%, v/v), and boiling point (90%, v/v) of jet fuels (JP-5, JP-8, Jet A, and Jet A1) were investigated. Each physical property was divided into three classes (low, medium, and high ranges) using two evaluations with different class boundary definitions. The class boundaries function as the threshold to alarm when the fuel properties change. Optimal partial least squares discriminant analysis (oPLS-DA), fuzzy rule-building expert system (FuRES), and support vector machines (SVM) were used to build the calibration models between the NIR spectra and classes of physical property of jet fuels. oPLS-DA, FuRES, and SVM were compared with respect to prediction accuracy. The validation of the calibration model was conducted by applying bootstrap Latin partition (BLP), which gives a measure of precision. Prediction accuracies of 97 ± 2% for the flash point, 94 ± 2% for the freezing point, 99 ± 1% for the boiling point (10%, v/v), 98 ± 2% for the boiling point (50%, v/v), and 96 ± 1% for the boiling point (90%, v/v) were obtained by FuRES with one of the boundary definitions. Both FuRES and SVM obtained statistically better prediction accuracy than oPLS-DA. The results indicate that, combined with chemometric classifiers, NIR spectroscopy could be a fast method to monitor the changes of jet fuel physical properties.

  6. Relationship Between Support Vector Set and Kernel Functions in SVM

    Institute of Scientific and Technical Information of China (English)

    张铃; 张钹

    2002-01-01

    Based on a constructive learning approach, covering algorithms, we investigate the relationship between support vector sets and kernel functions in support vector machines (SVM). An interesting result is obtained. That is, in the linearly non-separable case, any sample of a given sample set K can become a support vector under a certain kernel function. The result shows that when the sample set K is linearly non-separable, although the chosen kernel function satisfies Mercer's condition, its corresponding support vector set is not necessarily the subset of K that plays a crucial role in classifying K. For a given sample set, what is the subset that plays the crucial role in classification? In order to explore the problem, a new concept, boundary or boundary points, is defined and its properties are discussed. Given a sample set K, we show that the decision functions for classifying the boundary points of K are the same as those for classifying K itself. And the boundary points of K depend only on K and the structure of the space in which K is located, and are independent of the chosen approach for finding the boundary. Therefore, the boundary point set may become the subset of K that plays a crucial role in classification. These results are of importance for understanding the principle of the support vector machine (SVM) and for developing new learning algorithms.

  7. Experimental comparison of support vector machines with random forests for hyperspectral image land cover classification

    Indian Academy of Sciences (India)

    B T Abe; O O Olugbara; T Marwala

    2014-06-01

    The performances of regular support vector machines and random forests are experimentally compared for hyperspectral imaging land cover classification. Special characteristics of hyperspectral imaging dataset present diverse processing problems to be resolved under robust mathematical formalisms such as image classification. As a result, pixel purity index algorithm is used to obtain endmember spectral responses from Indiana pine hyperspectral image dataset. The generalized reduced gradient optimization algorithm is thereafter executed on the research data to estimate fractional abundances in the hyperspectral image and thereby obtain the numeric values for land cover classification. The Waikato environment for knowledge analysis (WEKA) data mining framework is selected as a tool to carry out the classification process by using support vector machines and random forests classifiers. Results show that performance of support vector machines is comparable to that of random forests. This study makes a positive contribution to the problem of land cover classification by exploring generalized reduced gradient method, support vector machines, and random forests to improve producer accuracy and overall classification accuracy. The performance comparison of these classifiers is valuable for a decision maker to consider tradeoffs in method accuracy versus method complexity.

  8. Diagnosis of Acute Coronary Syndrome with a Support Vector Machine.

    Science.gov (United States)

    Berikol, Göksu Bozdereli; Yildiz, Oktay; Özcan, I Türkay

    2016-04-01

    Acute coronary syndrome (ACS) is a serious condition arising from an imbalance of supply and demand to meet the myocardium's metabolic needs. Patients typically present with retrosternal chest pain radiating to the neck and left arm. Electrocardiography (ECG) and laboratory tests are used in diagnosis. However, in emergency departments, physicians face difficulties in deciding whether to hospitalize, follow up or discharge the patient. The aim of the study is to diagnose ACS and help the physician with the decision to discharge or to hospitalize via machine learning techniques such as support vector machine (SVM), using patient data including age, sex, risk factors, and cardiac enzymes (CK-MB, Troponin I) of patients presenting to the emergency department with chest pain. Clinical, laboratory, and imaging data of 228 patients presenting to the emergency department with chest pain were reviewed, and the performance of the support vector machine was evaluated. Four different methods (Support vector machine (SVM), Artificial neural network (ANN), Naïve Bayes and Logistic Regression) were tested, and the results of SVM, which had the highest accuracy, are reported. Among 228 patients aged 19 to 91 years who were included in the study, 99 (43.4 %) were qualified as ACS, while 129 (56.5 %) had no ACS. The classification model using SVM attained a 99.13 % classification success. The present study showed a 99.13 % classification success for ACS diagnosis attained by Support Vector Machine. This study showed that machine learning techniques may help emergency department staff make decisions by rapidly producing relevant data.

  9. Fault Detection and Diagnosis in Process Data Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Fang Wu

    2014-01-01

    For complex industrial processes, it has become increasingly challenging to diagnose complicated faults effectively. In this paper, a combined measure of the original Support Vector Machine (SVM) and Principal Component Analysis (PCA) is provided to carry out the fault classification, and its result is compared with that of the SVM-RFE (Recursive Feature Elimination) method. RFE is used for feature extraction, and PCA is utilized to project the original data onto a lower dimensional space. The PCA T² and SPE statistics and the original SVM are proposed to detect the faults. Some common faults of the Tennessee Eastman Process (TEP) are analyzed in terms of the practical system and reflections of the dataset. PCA-SVM and SVM-RFE can effectively detect and diagnose these common faults. In the RFE algorithm, all variables are ranked in decreasing order of their contributions. The classification accuracy rate is improved by choosing a reasonable number of features.

  10. AN IMPROVED SVM-BASED MULTI-CLASS CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    赵亮

    2014-01-01

    In light of the deficiency of existing SVM multi-class classification algorithms in classification accuracy, we propose an improved SVM decision tree multi-class classification algorithm. To minimise the impact of error accumulation as far as possible, the algorithm uses the idea of vector projection as the standard to measure class separability, and thus constructs an unbalanced decision tree. Furthermore, at the nodes of the decision tree it selects different penalty factors for positive and negative samples to counteract the impact of unbalanced data sets. Finally, it introduces KNN to recognise the data sets jointly with SVM. Simulation experiments on handwritten digit recognition data sets, analysing and comparing different methods, show that this method can effectively improve the classification accuracy.

  11. A Mass Spectrometric Analysis Method Based on PPCA and SVM for Early Detection of Ovarian Cancer.

    Science.gov (United States)

    Wu, Jiang; Ji, Yanju; Zhao, Ling; Ji, Mengying; Ye, Zhuang; Li, Suyi

    2016-01-01

    Background. Surface-enhanced laser desorption-ionization-time of flight mass spectrometry (SELDI-TOF-MS) technology plays an important role in the early diagnosis of ovarian cancer. However, the raw MS data is highly dimensional and redundant. Therefore, it is necessary to study rapid and accurate detection methods for the massive MS data. Methods. The clinical data set used in the experiments for early cancer detection consisted of 216 SELDI-TOF-MS samples. An MS analysis method based on probabilistic principal components analysis (PPCA) and support vector machine (SVM) was proposed and applied to the early classification of ovarian cancer in the data set. Additionally, using the same data set, we also established a traditional PCA-SVM model. Finally, we compared the two models in detection accuracy, specificity, and sensitivity. Results. Using independent training and testing experiments 10 times to evaluate the ovarian cancer detection models, the average prediction accuracy, sensitivity, and specificity of the PCA-SVM model were 83.34%, 82.70%, and 83.88%, respectively. In contrast, those of the PPCA-SVM model were 90.80%, 92.98%, and 88.97%, respectively. Conclusions. The PPCA-SVM model had better detection performance, and the model combined with the SELDI-TOF-MS technology shows promise for early clinical detection and diagnosis of ovarian cancer.

  12. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (on average) for classification of different LBP classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBP classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods.

  13. A Mass Spectrometric Analysis Method Based on PPCA and SVM for Early Detection of Ovarian Cancer

    Directory of Open Access Journals (Sweden)

    Jiang Wu

    2016-01-01

    Background. Surface-enhanced laser desorption-ionization-time of flight mass spectrometry (SELDI-TOF-MS) technology plays an important role in the early diagnosis of ovarian cancer. However, the raw MS data is highly dimensional and redundant. Therefore, it is necessary to study rapid and accurate detection methods for the massive MS data. Methods. The clinical data set used in the experiments for early cancer detection consisted of 216 SELDI-TOF-MS samples. An MS analysis method based on probabilistic principal components analysis (PPCA) and support vector machine (SVM) was proposed and applied to the early classification of ovarian cancer in the data set. Additionally, using the same data set, we also established a traditional PCA-SVM model. Finally, we compared the two models in detection accuracy, specificity, and sensitivity. Results. Using independent training and testing experiments 10 times to evaluate the ovarian cancer detection models, the average prediction accuracy, sensitivity, and specificity of the PCA-SVM model were 83.34%, 82.70%, and 83.88%, respectively. In contrast, those of the PPCA-SVM model were 90.80%, 92.98%, and 88.97%, respectively. Conclusions. The PPCA-SVM model had better detection performance, and the model combined with the SELDI-TOF-MS technology shows promise for early clinical detection and diagnosis of ovarian cancer.

  14. A Method for Aileron Actuator Fault Diagnosis Based on PCA and PGC-SVM

    Directory of Open Access Journals (Sweden)

    Wei-Li Qin

    2016-01-01

    Aileron actuators are pivotal components of the aircraft flight control system. Thus, the fault diagnosis of aileron actuators is vital to enhancing reliability and fault-tolerant capability. This paper presents an aileron actuator fault diagnosis approach combining principal component analysis (PCA), grid search (GS), 10-fold cross validation (CV), and one-versus-one support vector machine (SVM). This method is referred to as PGC-SVM and utilizes the direct drive valve input, force motor current, and displacement feedback signal to realize fault detection and location. First, several common faults of aileron actuators, which include force motor coil break, sensor coil break, cylinder leakage, and amplifier gain reduction, are extracted from the fault quadrantal diagram; the corresponding fault mechanisms are analyzed. Second, the data feature extraction is performed with dimension reduction using PCA. Finally, the GS and CV algorithms are employed to train a one-versus-one SVM for fault classification, thus obtaining the optimal model parameters and assuring the generalization of the trained SVM, respectively. To verify the effectiveness of the proposed approach, four types of faults are introduced into the simulation model established by AMESim and Simulink. The results demonstrate its desirable diagnostic performance, which outperforms that of the traditional SVM.

  15. An IPSO-SVM algorithm for security state prediction of mine production logistics system

    Science.gov (United States)

    Zhang, Yanliang; Lei, Junhui; Ma, Qiuli; Chen, Xin; Bi, Runfang

    2017-06-01

    A theoretical basis for the regulation of corporate security warning and resources was provided in order to reveal the laws behind the security state in mine production logistics. Considering that the mine production logistics system is complex and its variables are difficult to acquire, a security status prediction model for the mine production logistics system based on improved particle swarm optimization and support vector machine (IPSO-SVM) is proposed in this paper. Firstly, through linear adjustment of the inertia weight and learning weights, the convergence speed and search accuracy are enhanced to deal with the changeable complexity and the difficulty of data acquisition. The improved particle swarm optimization (IPSO) is then introduced to resolve the problem of parameter settings in traditional support vector machines (SVM). At the same time, a security status index system is built to determine the classification standards of safety status. The feasibility and effectiveness of this method are finally verified using the experimental results.
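
    The linear adjustment of the inertia weight and learning weights can be sketched as below. This is a minimal particle swarm optimizer under stated assumptions, not the paper's IPSO: the schedule endpoints (inertia 0.9 to 0.4, cognitive factor 2.5 to 0.5, social factor 0.5 to 2.5), the swarm size and the objective are all hypothetical. The toy objective `toy_error` merely stands in for an SVM validation error over two hyperparameters such as log C and log gamma.

```python
import random

def linear_schedule(start, end, step, total_steps):
    """Linearly interpolate from start (step 0) to end (last step)."""
    if total_steps <= 1:
        return end
    return start + (end - start) * step / (total_steps - 1)

def pso_minimize(objective, bounds, n_particles=12, n_iters=40, seed=0):
    """Minimal PSO with linearly varying inertia and learning factors."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for t in range(n_iters):
        w = linear_schedule(0.9, 0.4, t, n_iters)   # inertia: explore, then exploit
        c1 = linear_schedule(2.5, 0.5, t, n_iters)  # cognitive (personal best) factor
        c2 = linear_schedule(0.5, 2.5, t, n_iters)  # social (global best) factor
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def toy_error(params):
    """Hypothetical stand-in for a cross-validated SVM error surface."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

bounds = [(-3.0, 3.0), (-5.0, 1.0)]
best, best_val, history = pso_minimize(toy_error, bounds)
```

    In an actual SVM setting the objective would train and validate a classifier for each candidate parameter pair, which is where most of the run time would go.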

  16. A Statistical Parameter Analysis and SVM Based Fault Diagnosis Strategy for Dynamically Tuned Gyroscopes

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Gyroscope fault diagnosis plays a critical role in achieving higher reliability and precision in inertial navigation systems. A new fault diagnosis strategy based on statistical parameter analysis (SPA) and a support vector machine (SVM) classification model was proposed for dynamically tuned gyroscopes (DTG). The SPA, a kind of time domain analysis approach, was introduced to compute a set of statistical parameters of the vibration signal as the state features of the DTG, with which the SVM model, a novel learning machine based on statistical learning theory (SLT), was constructed and trained to identify the working state of the DTG. The experimental results verify that the proposed diagnostic strategy can simply and effectively extract the state features of the DTG, and that it outperforms the radial-basis function (RBF) neural network based diagnostic method and can more reliably and accurately diagnose the working state of the DTG.
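
    The abstract does not list which statistical parameters are used. The sketch below computes a typical time-domain set (mean, standard deviation, RMS, peak, skewness, kurtosis and crest factor) from one vibration record; this particular selection is an assumption for illustration, and the function name `statistical_parameters` is hypothetical.

```python
import math

def statistical_parameters(signal):
    """Time-domain statistics of one vibration record (population moments)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance = sum(c * c for c in centered) / n
    std = math.sqrt(variance)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    if std > 0:
        skewness = sum(c ** 3 for c in centered) / (n * std ** 3)
        kurtosis = sum(c ** 4 for c in centered) / (n * std ** 4)
    else:
        # A constant signal has no spread; report zeros instead of dividing by 0.
        skewness = kurtosis = 0.0
    crest_factor = peak / rms if rms > 0 else 0.0
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "crest_factor": crest_factor,
    }

# Tiny synthetic record; a real record would hold many sampled points.
features = statistical_parameters([1.0, -1.0, 1.0, -1.0])
```

    The resulting dictionary, flattened to a vector, is the kind of state feature that would be passed to the SVM for training and identification.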

  17. A Hybrid SOM-SVM Approach for the Zebrafish Gene Expression Analysis

    Institute of Scientific and Technical Information of China (English)

    Wei Wu; Xin Liu; Min Xu; Jin-Rong Peng; Rudy Setiono

    2005-01-01

    Microarray technology can be employed to quantitatively measure the expression of thousands of genes in a single experiment. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large amount of expression data generated by this technology makes the study of certain complex biological problems possible, and machine learning methods are expected to play a crucial role in the analysis process. In this paper, we present our results from integrating the self-organizing map (SOM) and the support vector machine (SVM) for the analysis of the various functions of zebrafish genes based on their expression. The most distinctive characteristic of our zebrafish gene expression data is that the number of samples of different classes is imbalanced. We discuss how SOM can be used as a data-filtering tool to improve the classification performance of the SVM on this data set.

  18. A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.

    Science.gov (United States)

    Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong

    2016-01-01

    Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).
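
    How a directed acyclic graph turns binary classifiers into a multi-class decision can be sketched as follows. This follows the well-known decision-DAG scheme for pairwise classifiers and is assumed, not confirmed, to match the structure used in DAG-LDM. The pairwise decider here is a hypothetical nearest-prototype rule standing in for a trained binary LDM, and the class names and prototype values are made up.

```python
def dag_predict(classes, pairwise_decide, x):
    """Evaluate a decision DAG over pairwise classifiers.

    Each node compares the first and last remaining classes and eliminates
    the loser, so k classes need exactly k - 1 binary evaluations.
    """
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise_decide(a, b, x)
        evaluations += 1
        if winner == a:
            remaining.pop()       # class b is eliminated
        else:
            remaining.pop(0)      # class a is eliminated
    return remaining[0], evaluations

# Hypothetical 1-D prototypes standing in for trained binary classifiers.
PROTOTYPES = {"note": 0.0, "rest": 1.0, "clef": 2.0, "accidental": 3.0}

def nearest_prototype_decide(a, b, x):
    """Pick whichever of the two classes has the closer prototype."""
    return a if abs(x - PROTOTYPES[a]) <= abs(x - PROTOTYPES[b]) else b

label, n_evals = dag_predict(list(PROTOTYPES), nearest_prototype_decide, 1.9)
```

    With k classes, k(k - 1)/2 pairwise classifiers are trained but only k - 1 are evaluated per sample, which is the main appeal of the DAG arrangement at prediction time.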

  19. A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.

    Directory of Open Access Journals (Sweden)

    Cuihong Wen

    Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).

  20. CLASSIFICATION OF GEAR FAULTS USING HIGHER-ORDER STATISTICS AND SUPPORT VECTOR MACHINES

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Gears alternately mesh and detach during the driving process, so their working conditions change alternately and they are prone to spalling and wear. Because of additive Gaussian measurement noise, however, the signal-to-noise ratio is low and the fault features are difficult to extract. This study proposes an approach to gear fault classification using cumulants and support vector machines. The cumulants can eliminate the additive Gaussian noise and boost the signal-to-noise ratio. The generalisation of support vector machines as classifiers, which employ the structural risk minimisation principle, is superior to that of conventional neural networks, which employ the traditional empirical risk minimisation principle. With support vector machines as the classifier and the third- and fourth-order cumulants as input, gear faults are successfully recognized. The experimental results show that the method of fault classification combining cumulants with support vector machines is very effective.
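
    The noise-suppressing property of cumulants comes from the fact that all cumulants of order three and higher vanish for a Gaussian process. The sketch below estimates the zero-lag third- and fourth-order cumulants of a record; using only the zero-lag values, and the synthetic signals themselves, are simplifying assumptions for illustration and not the paper's feature definition.

```python
import random

def zero_lag_cumulants(signal):
    """Estimate zero-lag cumulants of a record after removing its mean.

    c3 = E[x^3] and c4 = E[x^4] - 3 * (E[x^2])^2 for zero-mean x.
    Both are (near) zero for Gaussian noise.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m3, m4 - 3.0 * m2 ** 2

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Synthetic "faulty" record: the same noise plus a periodic impulse,
# loosely mimicking the impacts produced by a damaged tooth.
impulsive = [x + (8.0 if i % 100 == 0 else 0.0) for i, x in enumerate(noise)]

c3_noise, c4_noise = zero_lag_cumulants(noise)
c3_fault, c4_fault = zero_lag_cumulants(impulsive)
```

    For the pure-noise record both estimates stay close to zero, while the impulsive record produces clearly non-zero values, which is what makes such statistics usable as classifier inputs under Gaussian noise.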

  1. Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis

    Directory of Open Access Journals (Sweden)

    Cíntia Matsuda Toledo

    Full Text Available Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. OBJECTIVE: The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. METHODS: The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described - simple or complex; presentation order - which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo18 were used, which included 200 healthy Brazilians of both genders. RESULTS AND CONCLUSION: A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.

  2. Comparative Study on Classification of Remote Sensing Image by Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    王小明; 毛梦祺; 张昌景; 许勇

    2013-01-01

    Support vector machine (SVM) is an artificial intelligence algorithm based on statistical learning theory. It is a promising classification algorithm that can overcome limitations of traditional classification algorithms such as small data sets, nonlinearity, overfitting, high dimensionality and local minima. In this paper, a Landsat-5 TM image is classified using the support vector machine method. The classification results and precisions are compared between different parameter combinations. Furthermore, precisions are compared between the SVM and traditional algorithms. The results indicate that the SVM classification algorithm has the advantage of a broad parameter range and requires no prior knowledge of the image and samples. The precision of the SVM algorithm is much higher than that of traditional algorithms, which makes it well suited to areas without in situ measurements.

  3. Machine learning versus knowledge based classification of legal texts

    NARCIS (Netherlands)

    de Maat, E.; Krabben, K.; Winkels, R.

    2010-01-01

    This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurate (>90%) as the pattern based one, but

  4. Video Shot Boundary Detection in MPEG Compressed Sequences Using SVM Learning

    Institute of Scientific and Technical Information of China (English)

    GUO Lihua; YANG Shutang; LI Jianhua; TONG Zhipeng

    2003-01-01

    A number of automated video shot boundary detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. Among these methods, the dissolve shot boundary is not accurately detected because it involves camera operation and object movement. In this paper, a method based on support vector machine (SVM) is proposed to detect the dissolve shot boundary in MPEG compressed sequences. The problem of distinguishing the dissolve shot boundary from other boundaries is treated as a two-class classification in our method. Features are extracted directly from the compressed sequences without decoding them, and the optimal class boundary between the two classes is learned from training data by using SVM. Experiments comparing various classification methods show that the proposed method improves the performance of video shot boundary detection.

  5. A SVM-based method for sentiment analysis in Persian language

    Science.gov (United States)

    Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana

    2013-03-01

    Persian is the official language of Iran, Tajikistan and Afghanistan. Local online users often express their opinions and experiences on the web in written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge number of web reviews makes it difficult to give an unbiased evaluation of a product. In this paper, the standard machine learning techniques SVM and naive Bayes are applied to online Persian movie reviews to automatically classify user reviews as positive or negative, and the performance of the two classifiers is compared in this language. The effects of feature presentations on classification performance are discussed. We find that accuracy is influenced by the interaction between the classification models and the feature options. The SVM classifier achieves accuracy as good as or better than naive Bayes on Persian movie reviews. Unigrams prove to be better features than bigrams and trigrams in capturing Persian sentiment orientation.

  6. Experimental analysis of the performance of machine learning algorithms in the classification of navigation accident records

    Directory of Open Access Journals (Sweden)

    REIS, M V. S. de A.

    2017-06-01

    Full Text Available This paper aims to evaluate the use of machine learning techniques in a database of marine accidents. We analyzed and evaluated the main causes and types of marine accidents in the Northern Fluminense region. For this, machine learning techniques were used. The study showed that the modeling can be done in a satisfactory manner using different configurations of classification algorithms, varying the activation functions and training parameters. The SMO (Sequential Minimal Optimization) algorithm showed the best performance result.

  7. A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

    Directory of Open Access Journals (Sweden)

    Zekić-Sušac Marijana

    2014-09-01

    Full Text Available Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess computing sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistical significance between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.
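
    The 10-fold cross-validation procedure mentioned above partitions the samples into ten disjoint subsamples, each used once for testing while the other nine are used for training. A minimal sketch of the index bookkeeping follows. The function name, the shuffling seed and the sample count are hypothetical, and no stratification by class is attempted.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    print(len(train), len(test))
```

    Evaluating every method on the same folds yields paired per-fold accuracies, which is the kind of data a pairwise t-test compares.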

  8. Land Use Classification using Support Vector Machine and Maximum Likelihood Algorithms by Landsat 5 TM Images

    Directory of Open Access Journals (Sweden)

    Abbas TAATI

    2015-08-01

    Full Text Available Nowadays, remote sensing images have been identified and exploited as the latest information to study land cover and land uses. These digital images are of significant importance, since they can present timely information and are capable of providing land use maps. The aim of this study is to create a land use classification using a support vector machine (SVM) and a maximum likelihood classifier (MLC) in Qazvin, Iran, from TM images of the Landsat 5 satellite. In the pre-processing stage, the necessary corrections were applied to the images. In order to evaluate the accuracy of the 2 algorithms, the overall accuracy and kappa coefficient were used. The evaluation results verified that the SVM algorithm, with an overall accuracy of 86.67 % and a kappa coefficient of 0.82, has a higher accuracy than the MLC algorithm in land use mapping. Therefore, this algorithm has been suggested to be applied as an optimal classifier for extraction of land use maps due to its higher accuracy and better consistency within the study area.
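
    Overall accuracy and the kappa coefficient are both derived from a confusion matrix: overall accuracy is the fraction of samples on the diagonal, and kappa rescales it by the agreement expected by chance from the row and column totals. A minimal sketch follows. The 2x2 matrix is hypothetical and is not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical matrix: rows are reference classes, columns are predictions.
overall, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
print(overall, kappa)
```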

  9. Binary Color Classification For Brain Computer Interface Using Neural Networks And Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Charmi Sunil Mehta

    2014-04-01

    Full Text Available As the power of modern computers grows alongside our understanding of the human brain, we move a step closer in transforming some pretty spectacular science fiction into reality. The advent of Brain Computer Interface (BCI) is indeed leading us to a burgeoning era of complete automation empowering our interaction with computer not only with robustness but with also a gift of intelligence. For the fraction of our society suffering from severe motor disabilities BCI has offered a novel solution of overcoming the problems faced in communicating and environment control. Thus the purpose of our current research is to harness the brain's ability to generate Visually Evoked Potentials (VEPs) by capturing the response of the brain to the transitions of color from grey to green and grey to red. Our prime focus is to explore EEG-based signal processing techniques in order to classify two colors; which can be further deployed in future by coupling the actuators so as to perform few basic tasks. The extracted EEG features are classified using Support Vector Machines (SVM) and Artificial Neural Networks (ANN). We recorded 100% accuracy on testing the model after training and validation process. Moreover, we obtained 90% accuracy on re-testing the model with all samples acquired for the task using Quadratic SVM classifier.

  10. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    Full Text Available The need for a novel automated mosquito perception and classification method is becoming increasingly essential in recent years, with steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito inhabitants and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can deploy in closed-perimeter mosquito inhabitants. The module is capable of identifying mosquitoes from other bugs such as bees and flies by extracting the morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of support vector machine classifier in the context of mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.

  11. Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective

    Science.gov (United States)

    Zhao, Changbo; Li, Guo-Zheng; Wang, Chengjun; Niu, Jinling

    2015-01-01

    As a complementary and alternative medicine in medical field, traditional Chinese medicine (TCM) has drawn great attention in the domestic field and overseas. In practice, TCM provides a quite distinct methodology to patient diagnosis and treatment compared to western medicine (WM). Syndrome (ZHENG or pattern) is differentiated by a set of symptoms and signs examined from an individual by four main diagnostic methods: inspection, auscultation and olfaction, interrogation, and palpation which reflects the pathological and physiological changes of disease occurrence and development. Patient classification is to divide patients into several classes based on different criteria. In this paper, from the machine learning perspective, a survey on patient classification issue will be summarized on three major aspects of TCM: sign classification, syndrome differentiation, and disease classification. With the consideration of different diagnostic data analyzed by different computational methods, we present the overview for four subfields of TCM diagnosis, respectively. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate the further research for TCM patient classification. PMID:26246834

  12. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-08-31

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with uni-gram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided.
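
    Per-label precision, recall and F1 score, the metrics reported above, can be computed from true and predicted labels as sketched below. The label names and the four example narratives are hypothetical and serve only to exercise the function.

```python
def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical accident labels for four narratives.
y_true = ["fall", "fall", "struck", "struck"]
y_pred = ["fall", "struck", "struck", "struck"]
scores = per_label_scores(y_true, y_pred)
print(scores)
```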

  13. Automatic Parameters Selection for SVM Based on PSO

    Institute of Scientific and Technical Information of China (English)

    ZHANG Mingfeng; ZHU Yinghua; ZHENG Xu; LIU Yu

    2007-01-01

    Automatic parameter selection for the Support Vector Machine (SVM) is an important issue in making SVM practically useful, and the commonly used Leave-One-Out (LOO) method is computationally complex and time-consuming. Motivated by these facts, this paper proposes an effective strategy for automatic SVM parameter selection using Particle Swarm Optimization (PSO). Simulation results on a practical data model demonstrate the effectiveness and high efficiency of the proposed approach.
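
    The idea of searching SVM parameters with PSO can be sketched as follows. This is a generic global-best PSO, not the authors' implementation. The objective `stand_in_error` is a hypothetical smooth surface standing in for a cross-validated error over (log2 C, log2 gamma); in practice it would train and validate an SVM at each candidate point. The inertia and acceleration constants are common textbook values and are assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0):
    """Minimise objective over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    inertia, c_personal, c_global = 0.729, 1.494, 1.494  # assumed constants
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def stand_in_error(params):
    """Hypothetical error surface with its minimum at log2 C = 3, log2 gamma = -5."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2


best_params, best_error = pso_minimize(stand_in_error, [(-5.0, 15.0), (-15.0, 3.0)])
print(best_params, best_error)
```

    Searching in log2 space is a common convention for C and gamma because useful values span several orders of magnitude.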

  14. Improved Approach Based on SVM for License Plate Character Recognition

    Institute of Scientific and Technical Information of China (English)

    WANG Xiao-hua; WANG Xiao-guang

    2005-01-01

    An improved approach based on support vector machine (SVM) called the center distance ratio method is presented for license plate character recognition. First the support vectors are pre-extracted. A minimal set called the margin vector set, which contains all support vectors, is extracted. These margin vectors form the new training data, from which the classifier is constructed by the general SVM optimization. The experimental results show that the improved SVM method performs well in recognition rate and training speed.

  15. Evaluating machine learning classification for financial trading: An empirical approach

    OpenAIRE

    Gerlein, EA; McGinnity, M; Belatreche, A; Coleman, S.

    2016-01-01

    Technical and quantitative analysis in financial trading use mathematical and statistical tools to help investors decide on the optimum moment to initiate and close orders. While these traditional approaches have served their purpose to some extent, new techniques arising from the field of computational intelligence such as machine learning and data mining have emerged to analyse financial information. While the main financial engineering research has focused on complex computational models s...

  16. Fall classification by machine learning using mobile phones.

    Directory of Open Access Journals (Sweden)

    Mark V Albert

    Full Text Available Fall prevention is a critical component of health care; falls are a common source of injury in the elderly and are associated with significant levels of mortality and morbidity. Automatically detecting falls can allow rapid response to potential emergencies; in addition, knowing the cause or manner of a fall can be beneficial for prevention studies or a more tailored emergency response. The purpose of this study is to demonstrate techniques to not only reliably detect a fall but also to automatically classify the type. We asked 15 subjects to simulate four different types of falls (left and right lateral, forward trips, and backward slips) while wearing mobile phones and previously validated, dedicated accelerometers. Nine subjects also wore the devices for ten days, to provide data for comparison with the simulated falls. We applied five machine learning classifiers to a large time-series feature set to detect falls. Support vector machines and regularized logistic regression were able to identify a fall with 98% accuracy and classify the type of fall with 99% accuracy. This work demonstrates how current machine learning approaches can simplify data collection for prevention in fall-related research as well as improve rapid response to potential injuries due to falls.

  17. Ultrasonic fluid quantity measurement in dynamic vehicular applications a support vector machine approach

    CERN Document Server

    Terzic, Jenny; Nagarajah, Romesh; Alamgir, Muhammad

    2013-01-01

    Accurate fluid level measurement in dynamic environments can be assessed using a Support Vector Machine (SVM) approach. SVM is a supervised learning model that analyzes and recognizes patterns. It is a signal classification technique which has far greater accuracy than conventional signal averaging methods. Ultrasonic Fluid Quantity Measurement in Dynamic Vehicular Applications: A Support Vector Machine Approach describes the research and development of a fluid level measurement system for dynamic environments. The measurement system is based on a single ultrasonic sensor. A Support Vector Machines (SVM) based signal characterization and processing system has been developed to compensate for the effects of slosh and temperature variation in fluid level measurement systems used in dynamic environments including automotive applications. It has been demonstrated that a simple ν-SVM model with Radial Basis Function (RBF) Kernel with the inclusion of a Moving Median filter could be used to achieve the high levels...

  18. A Fast Reduced Kernel Extreme Learning Machine.

    Science.gov (United States)

    Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

    2016-04-01

    In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on Big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear functions accurately under the condition of support vectors sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problem and regression are then reported to show that RKELM can perform at competitive level of generalized performance as the SVM/LS-SVM at only a fraction of the computational effort incurred.
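
    The core idea, picking the kernel centres at random instead of finding support vectors iteratively, can be sketched in a few lines. This is not the authors' RKELM code. It assumes an RBF kernel and a ridge-regularised least-squares solve for the output weights, (K^T K + I/C) beta = K^T t, and all names and data are hypothetical.

```python
import math
import random


def rbf(a, b, gamma):
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def train_reduced_kernel(X, targets, n_centers, gamma=1.0, C=1e6, seed=0):
    """Pick n_centers samples at random and fit output weights in one solve."""
    rng = random.Random(seed)
    centers = [X[i] for i in rng.sample(range(len(X)), n_centers)]
    K = [[rbf(x, c, gamma) for c in centers] for x in X]  # N x L kernel matrix
    m = len(X)
    A = [[sum(K[s][i] * K[s][j] for s in range(m)) + (1.0 / C if i == j else 0.0)
          for j in range(n_centers)] for i in range(n_centers)]
    rhs = [sum(K[s][i] * targets[s] for s in range(m)) for i in range(n_centers)]
    return centers, solve(A, rhs)


def predict(x, centers, beta, gamma=1.0):
    return sum(b * rbf(x, c, gamma) for b, c in zip(beta, centers))


# Hypothetical 1-D two-class problem with targets -1 and +1.
X = [[0.0], [1.0], [2.0], [3.0]]
targets = [-1.0, -1.0, 1.0, 1.0]
centers, beta = train_reduced_kernel(X, targets, n_centers=2)
print([predict(x, centers, beta) for x in X])
```

    Training cost is dominated by a single L x L linear solve, where L is the number of randomly chosen centres, which is where the claimed savings over iterative SVM training would come from.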

  19. Vibration fault diagnosis for steam turbine by using support vector machine based on fruit fly optimization algorithm

    Institute of Scientific and Technical Information of China (English)

    石志标; 苗莹

    2014-01-01

    In order to solve the problem that the selection of the kernel function parameters and penalty factor parameters in the support vector machine (SVM) algorithm is blind, the fruit fly optimization algorithm (FOA) was applied to optimize the parameters in SVM. A fault diagnosis algorithm of SVM based on FOA was put forward, and pattern recognition was then performed on experimental steam turbine fault data. The algorithm can optimize the SVM parameters automatically and achieve a near-ideal global optimal solution. Compared with SVMs optimized by the commonly used particle swarm optimization (PSO) and genetic algorithm (GA), the results demonstrate that FOA-SVM is stable and has the fastest recognition speed and the highest recognition rate.

  20. Machine learning prediction for classification of outcomes in local minimisation

    Science.gov (United States)

    Das, Ritankar; Wales, David J.

    2017-01-01

    Machine learning schemes are employed to predict which local minimum will result from local energy minimisation of random starting configurations for a triatomic cluster. The input data consists of structural information at one or more of the configurations in optimisation sequences that converge to one of four distinct local minima. The ability to make reliable predictions, in terms of the energy or other properties of interest, could save significant computational resources in sampling procedures that involve systematic geometry optimisation. Results are compared for two energy minimisation schemes, and for neural network and quadratic functions of the inputs.

  1. A method for classification of network traffic based on C5.0 Machine Learning Algorithm

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Riaz, M. Tahir; Pedersen, Jens Myrup

    2012-01-01

    current network traffic. To overcome the drawbacks of existing methods for traffic classification, usage of C5.0 Machine Learning Algorithm (MLA) was proposed. On the basis of statistical traffic information received from volunteers and C5.0 algorithm we constructed a boosted classifier, which was shown...

  2. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    Science.gov (United States)

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  3. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    Science.gov (United States)

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. There were three combined input features PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4) used to test and train the SVM models. Similarly, four datasets RS90, DB433, LI1264 and SP1577 were used to develop the SVM models. These four SVM models developed were tested using three different benchmarking tests namely; (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum possible prediction accuracy of ~70% was observed in self consistency test for the SVM models of both LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features was used to test. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in independent case test, for the SVM models of above two same datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  4. On Machine-Learned Classification of Variable Stars with Sparse and Noisy Time-Series Data

    CERN Document Server

    Richards, Joseph W; Butler, Nathaniel R; Bloom, Joshua S; Brewer, John M; Crellin-Quick, Arien; Higgins, Justin; Kennedy, Rachel; Rischard, Maxime

    2011-01-01

    With the coming data deluge from synoptic surveys, there is a growing need for frameworks that can quickly and automatically produce calibrated classification probabilities for newly-observed variables based on a small number of time-series measurements. In this paper, we introduce a methodology for variable-star classification, drawing from modern machine-learning techniques. We describe how to homogenize the information gleaned from light curves by selection and computation of real-numbered metrics ("feature"), detail methods to robustly estimate periodic light-curve features, introduce tree-ensemble methods for accurate variable star classification, and show how to rigorously evaluate the classification results using cross validation. On a 25-class data set of 1542 well-studied variable stars, we achieve a 22.8% overall classification error using the random forest classifier; this represents a 24% improvement over the best previous classifier on these data. This methodology is effective for identifying sam...

  5. Hyperspectral remote sensing image classification based on decision level fusion

    Institute of Scientific and Technical Information of China (English)

    Peijun Du; Wei Zhang; Junshi Xia

    2011-01-01

    To apply decision level fusion to hyperspectral remote sensing (HRS) image classification, three decision level fusion strategies are experimented on and compared, namely, linear consensus algorithm, improved evidence theory, and the proposed support vector machine (SVM) combiner. To evaluate the effects of the input features on classification performance, four schemes are used to organize input features for member classifiers. In the experiment, by using the operational modular imaging spectrometer (OMIS) II HRS image, the decision level fusion is shown as an effective way for improving the classification accuracy of the HRS image, and the proposed SVM combiner is especially suitable for decision level fusion. The results also indicate that the optimization of input features can improve the classification performance.
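
    Decision level fusion combines the outputs of member classifiers rather than their input features. As a minimal sketch, the linear consensus strategy is read here as a weighted average of per-class probabilities followed by an arg-max. That reading, the weights and the probability vectors are assumptions, since the abstract gives no formula.

```python
def linear_consensus(member_probs, weights):
    """Fuse per-class probabilities from several classifiers by weighted average."""
    total = sum(weights)
    n_classes = len(member_probs[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, member_probs)) / total
        for c in range(n_classes)
    ]
    # The fused decision is the class with the highest combined probability.
    label = max(range(n_classes), key=lambda c: fused[c])
    return fused, label


# Hypothetical outputs of three member classifiers for one pixel, three classes.
member_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.7, 0.2]]
fused, label = linear_consensus(member_probs, [1.0, 1.0, 2.0])
print(fused, label)
```

    A trained combiner, such as the SVM combiner proposed in the record above, would replace the fixed weights with a classifier that takes the stacked member outputs as its input features.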

  6. Steerable Wavelet Machines (SWM): Learning Moving Frames for Texture Classification.

    Science.gov (United States)

    Depeursinge, Adrien; Puspoki, Zsuzsanna; Ward, John Paul; Unser, Michael

    2017-04-01

    We present texture operators encoding class-specific local organizations of image directions (LOIDs) in a rotation-invariant fashion. The LOIDs are key for visual understanding, and are at the origin of the success of the popular approaches, such as local binary patterns (LBPs) and the scale-invariant feature transform (SIFT). Whereas, LBPs and SIFT yield hand-crafted image representations, we propose to learn data-specific representations of the LOIDs in a rotation-invariant fashion. The image operators are based on steerable circular harmonic wavelets (CHWs), offering a rich and yet compact initial representation for characterizing natural textures. The joint location and orientation required to encode the LOIDs is preserved by using moving frames (MFs) texture representations built from locally-steered image gradients that are invariant to rigid motions. In a second step, we use support vector machines to learn a multi-class shaping matrix for the initial CHW representation, yielding data-driven MFs called steerable wavelet machines (SWMs). The SWM forward function is composed of linear operations (i.e., convolution and weighted combinations) interleaved with non-linear steermax operations. We experimentally demonstrate the effectiveness of the proposed operators for classifying natural textures. Our scheme outperforms recent approaches on several test suites of the Outex and the CUReT databases.

  7. Nonlinear Methodologies for Identifying Seismic Event and Nuclear Explosion Using Random Forest, Support Vector Machine, and Naive Bayes Classification

    Directory of Open Access Journals (Sweden)

    Longjun Dong

    2014-01-01

    The discrimination of seismic events and nuclear explosions is a complex, nonlinear problem. Nonlinear methodologies including Random Forests (RF), Support Vector Machines (SVM), and the Naïve Bayes Classifier (NBC) were applied to discriminate seismic events. Twenty earthquakes and twenty-seven explosions, with nine ratios of the energies contained within predetermined “velocity windows” and the calculated distance, are used in the discriminators. Based on leave-one-out cross-validation, the ROC curve, and the calculated accuracy on training and test samples, the discriminating performances of RF, SVM, and NBC were discussed and compared. The RF method clearly shows the best predictive power, with the largest area under the ROC curve (0.975) among RF, SVM, and NBC. The discriminant accuracies of RF, SVM, and NBC on the test samples are 92.86%, 85.71%, and 92.86%, respectively. The presented RF model can not only identify seismic events automatically with high accuracy but can also rank the discriminant indicators according to their calculated weights.
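
    The area under the ROC curve used to compare the classifiers can be computed directly from classifier scores. The sketch below uses the standard rank interpretation of AUC: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The labels and scores are invented for illustration (1 standing for an explosion and 0 for an earthquake is an assumption), and none of the paper's data is reproduced.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Counts the fraction of (positive, negative) pairs where the positive
    sample has the higher score; tied pairs contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(y_true, y_score)  # 8 of the 9 pairs are ordered correctly
```

    The pairwise loop is quadratic in the number of samples, which is fine for a data set of a few dozen events but would be replaced by a sort-based computation for large ones.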

  8. Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.

    Science.gov (United States)

    Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh

    2015-04-01

    With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences.

  9. Operator functional state classification using least-square support vector machine based recursive feature elimination technique.

    Science.gov (United States)

    Yin, Zhong; Zhang, Jianhua

    2014-01-01

    This paper proposes two psychophysiological-data-driven classification frameworks with stable generalization ability for operator functional state (OFS) assessment in safety-critical human-machine systems. Recursive feature elimination (RFE) and the least-squares support vector machine (LSSVM) are combined and used for binary and multiclass feature selection. Besides typical binary LSSVM classifiers for two-class OFS assessment, two multiclass classifiers based on multiclass LSSVM-RFE and the decision directed acyclic graph (DDAG) scheme are developed, one for recognizing the high mental workload and fatigued states and the other for differentiating overloaded and baseline states from the normal states. Feature selection results reveal that different dimensions of OFS can be characterized by specific sets of psychophysiological features. Performance comparison studies show that reasonably high and stable classification accuracy can be achieved with both frameworks if the RFE procedure is properly implemented and utilized.
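
    The DDAG scheme mentioned above turns a set of pairwise (one-vs-one) binary classifiers into a multiclass decision: the candidate classes are kept in a list, the first and last candidates are compared, the loser is dropped, and this repeats until one class remains, so K classes need K - 1 binary evaluations. The sketch below shows only this evaluation logic. The class names, the one-dimensional class centres, and the nearest-centre rule standing in for a trained binary LSSVM are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def ddag_predict(
    classes: Sequence[str],
    pairwise_winner: Callable[[str, str], str],
) -> str:
    """Evaluate a decision DAG over pairwise classifiers.

    At each step the first and last remaining candidates are compared and
    the losing class is eliminated, until a single class is left.
    """
    remaining: List[str] = list(classes)
    if not remaining:
        raise ValueError("at least one class is required")
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise_winner(first, last) == first:
            remaining.pop()    # 'last' lost the comparison
        else:
            remaining.pop(0)   # 'first' lost the comparison
    return remaining[0]


# Hypothetical 1-D class centres; a stand-in for trained binary classifiers.
centres: Dict[str, float] = {"normal": 0.0, "high_workload": 1.0, "fatigued": 2.0}


def make_pairwise(x: float) -> Callable[[str, str], str]:
    """Build a toy pairwise rule: the class with the nearer centre wins."""
    def winner(a: str, b: str) -> str:
        return a if abs(x - centres[a]) <= abs(x - centres[b]) else b
    return winner


prediction = ddag_predict(list(centres), make_pairwise(1.2))
```

    In a real system each call to `pairwise_winner` would evaluate the binary classifier trained on that pair of classes; the elimination loop itself stays the same.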

  10. Supply Chain Dynamic Performance Measurement Based on BSC and SVM

    Directory of Open Access Journals (Sweden)

    Yan Hong

    2013-01-01

    Competition between individual enterprises is turning into collective competition between supply chains. Supply chain management (SCM) has become a major component of competitive strategy for enhancing organizational productivity and profitability. In recent years, organizational performance measurement and metrics have received much attention from researchers and practitioners. A sound supply chain performance assessment system is the basis of effective operation and management. Most traditional supply chain performance evaluation is static, whereas an actual supply chain is a dynamic system, so the evaluation method needs to adapt accordingly. To meet the needs of evaluating the overall performance of a dynamic alliance, this paper extends the traditional four Balanced Scorecard dimensions to five. On this basis, a five-dimension Balanced Scorecard model of the supply chain is established, together with a three-layer quantitative index system built on this model. The value of each performance index is then measured using the Fuzzy Analytic Hierarchy Process, the number of inputs to the Support Vector Machine (SVM) is reduced using a classification method, and finally the performance evaluation result is obtained using the weighted Least Squares Support Vector Machine (LS-SVM), which provides a basis for rational analysis and decision-making in the supply chain.

  11. Applications of PCA and SVM-PSO Based Real-Time Face Recognition System

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Shieh

    2014-01-01

    This paper combines principal component analysis (PCA) with support vector machine-particle swarm optimization (SVM-PSO) to develop real-time face recognition systems. The integrated scheme adopts the SVM-PSO method to improve the validity of PCA-based image recognition systems under dynamic visual perception. Face recognition in most human-robot interaction applications is accomplished by PCA-based methods because of their dimensionality reduction. However, PCA-based systems are only suitable for processing faces with the same facial expressions and/or under the same view directions. Since the facial feature selection process can be considered a global combinatorial optimization problem in machine learning, SVM-PSO is usually used as the optimal classifier of the system. In this paper, PSO is used to implement feature selection, and the SVMs serve as fitness functions of the PSO for classification problems. Experimental results demonstrate that the proposed method simplifies features effectively and obtains higher classification accuracy.

  12. An SVM-based solution for fault detection in wind turbines.

    Science.gov (United States)

    Santos, Pedro; Villa, Luisa F; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús

    2015-03-09

    Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.